

Title: Cross-Linguistic Perceptual Categorization of the Three Corner Vowels: Effects of Listener Language and Talker Age

The present study examined the center and size of naïve adult listeners’ vowel perceptual space (VPS) in relation to listener language (LL) and talker age (TA). Adult listeners with one of three first languages (American English, Greek, or Korean) categorized vowels and rated their goodness. The vowels were produced by 2-year-old, 5-year-old, and adult speakers of those languages, and by speakers of Cantonese and Japanese. The center (i.e., mean first and second formant frequencies (F1 and F2)) and size (i.e., area in the F1/F2 space) of the VPSs categorized as /a/, /i/, or /u/ were calculated for each LL and TA group. All center and size calculations were weighted by the goodness rating of each stimulus. The F1 and F2 values of the vowel category (VC) centers differed significantly by LL and TA. These effects were qualitatively different for the three vowel categories: English listeners had different /a/ and /u/ centers than Greek and Korean listeners. The size of the VPSs did not differ significantly by LL, but did differ by TA and VC: Greek and Korean listeners had larger vowel spaces when perceiving vowels produced by 2-year-olds than by 5-year-olds or adults, and English listeners had larger vowel spaces for /a/ than for /i/ or /u/. The findings indicate that listeners’ vowel perceptual categories varied with the nature of their native vowel system and were sensitive to TA.
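
The goodness-weighted center and size described above can be illustrated with a minimal sketch. The abstract does not state how the area in the F1/F2 space was computed, so the one-standard-deviation ellipse used here is an assumption, not the authors' method; the function names and the sample formant values and ratings are hypothetical.

```python
import math

def weighted_center(f1, f2, ratings):
    """Goodness-weighted mean F1 and F2 (Hz) of the stimuli in one category."""
    total = sum(ratings)
    if total <= 0:
        raise ValueError("ratings must sum to a positive value")
    c1 = sum(w * x for w, x in zip(ratings, f1)) / total
    c2 = sum(w * y for w, y in zip(ratings, f2)) / total
    return c1, c2

def weighted_ellipse_area(f1, f2, ratings):
    """Area (Hz^2) of a 1-SD ellipse from the goodness-weighted covariance.

    Assumption: the source only says 'area in the F1/F2 space'; an ellipse
    is one common choice, a convex hull would be another.
    """
    c1, c2 = weighted_center(f1, f2, ratings)
    total = sum(ratings)
    s11 = sum(w * (x - c1) ** 2 for w, x in zip(ratings, f1)) / total
    s22 = sum(w * (y - c2) ** 2 for w, y in zip(ratings, f2)) / total
    s12 = sum(w * (x - c1) * (y - c2)
              for w, x, y in zip(ratings, f1, f2)) / total
    det = s11 * s22 - s12 ** 2
    # Guard against tiny negative values caused by rounding.
    return math.pi * math.sqrt(max(det, 0.0))

# Hypothetical stimuli that one listener group labeled /i/.
f1_hz = [300.0, 340.0, 420.0, 380.0]
f2_hz = [2400.0, 2600.0, 2900.0, 2500.0]
goodness = [5, 4, 2, 3]  # higher rating = better exemplar

center = weighted_center(f1_hz, f2_hz, goodness)
area = weighted_ellipse_area(f1_hz, f2_hz, goodness)
print(center, area)
```

Weighting by goodness pulls the center toward stimuli rated as good exemplars and shrinks the contribution of poorly rated stimuli to the size estimate.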

 
PAR ID:
10546957
Publisher / Repository:
SAGE Publications
Journal Name:
Language and Speech
Volume:
64
Issue:
3
ISSN:
0023-8309
Format(s):
Medium: X
Size(s):
p. 558-575
Sponsoring Org:
National Science Foundation
More Like this
  1. This study examines apparent-time variation in the use of multiple acoustic cues present on coarticulatorily nasalized vowels in California English. Eighty-nine listeners ranging in age from 18-58 (grouped into 3 apparent-time categories based on year of birth) performed lexical identifications on syllables excised from words with oral and nasal codas from six speakers who produced either minimal (n=3) or extensive (n=3) anticipatory nasal coarticulation (realized by greater vowel nasalization, F1 bandwidth, and diphthongization on vowels in CVN contexts). Results showed no differences across listeners’ identification for Extensively coarticulated vowels, as well as oral vowels by both types of speakers (all at-ceiling). Yet, performance for the Minimal Coarticulators’ nasalized vowels was lowest for the older listener group and increased over apparent time. Perceptual cue-weighting analyses revealed that older listeners rely more on F1 bandwidth, while younger listeners rely more on acoustic nasality, as coarticulatory cues providing information about lexical identity. Thus, there is evidence for variation in apparent time in the use of the different coarticulatory cues present on vowels. Younger listeners’ cue weighting allows them flexibility to identify lexical items given a range of coarticulatory variation across (here, younger) speakers, while older listeners’ cue weighting leads to reduced performance for talkers producing innovative phonetic forms. This study contributes to our understanding of the relationship between multidimensional acoustic features resulting from coarticulation and the perceptual re-weighting of cues that can lead to sound change over time.

     
  2. This study investigates whether short-term perceptual training can enhance Seoul-Korean listeners’ use of English lexical stress in spoken word recognition. Unlike English, Seoul Korean does not have lexical stress (or lexical pitch accents/tones). Seoul-Korean speakers at a high-intermediate English proficiency completed a visual-world eye-tracking experiment adapted from Connell et al. (2018) (pre-/post-test). The experiment tested whether pitch in the target stimulus (accented versus unaccented first syllable) and vowel quality in the lexical competitor (reduced versus full first vowel) modulated fixations to the target word (e.g., PARrot; ARson) over the competitor word (e.g., paRADE or PARish; arCHIVE or ARcade). In the training (eight 30-min sessions over eight days), participants heard English lexical-stress minimal pairs uttered by four talkers (high variability) or one talker (low variability), categorized them as noun (first-syllable stress) or verb (second-syllable stress), and received accuracy feedback. The results showed that neither training increased target-over-competitor fixation proportions. Crucially, the same training had been found to improve Seoul-Korean listeners’ recall of English words differing in lexical stress (Tremblay et al., 2022) and their weighting of acoustic cues to English lexical stress (Tremblay et al., 2023). These results suggest that short-term perceptual training has a limited effect on target-over-competitor word activation.

     
  3. To test the hypothesis that intraspeaker variation in vowel formants is related to the direction of diachronic change, we compare the direction of change in apparent time with the axis of intraspeaker variation in F1 and F2 for vowel phonemes in several corpora of North American and Scottish English. These vowels were measured automatically with a scheme (tested on hand-measured vowels) that considers the frequency, bandwidth, and amplitude of the first three formants in reference to a prototype. In the corpus data, we find that the axis of intraspeaker variation is typically aligned vertically, presumably corresponding to the degree of jaw opening for individual tokens, but for the North American GOOSE vowel, the axis of intraspeaker variation is aligned with the (horizontal) axis of diachronic change for this vowel across North America. This may help to explain why fronting and unrounding of high back vowels are common shifts across languages. 
  4. Purpose:

    This study examined the race identification of Southern American English speakers from two geographically distant regions in North Carolina. The purpose of this work is to explore how talkers' self-identified race, talker dialect region, and acoustic speech variables contribute to listener categorization of talker races.

    Method:

    Two groups of listeners heard a series of /h/–vowel–/d/ (/hVd/) words produced by Black and White talkers from East and West North Carolina, respectively.

    Results:

    Both Southern (North Carolina) and Midland (Indiana) listeners categorized the race of all speakers with greater-than-chance accuracy; however, West North Carolina Black talkers were categorized with the lowest accuracy, just above chance.

    Conclusions:

    The results suggest that similarities in the speech production patterns of West North Carolina Black and White talkers affect the racial categorization of Black, but not White talkers. The results are discussed with respect to the acoustic spectral features of the voices present in the sample population.

     
  5. Previous studies have shown that non-native speakers of Korean not only have difficulty producing the word-initial three-way stop contrast, but also exhibit a wide range of production patterns. Because these studies have only investigated native (L1) speakers of English and Mandarin and given the overall paucity of research on non-native Korean, it is not yet clear how dependent these findings are on the particular native language under investigation. The current paper reinforces our empirical grounding via extension to L1 speakers of Japanese. It is shown that although naïve Japanese listeners consistently perceive Korean fortis stops as voiced, and Korean lenis and aspirated stops as voiceless, novice second language learners do not produce any significant difference among the three stop categories, despite producing clear differences between their native Japanese stop categories. Unlike in previous studies of L1 speakers of English and Mandarin, there was very little inter-speaker variation, and all speakers produced all Korean stops with long lag voice onset time.

     