This content will become publicly available on January 7, 2025

Title: Data‐Driven Learning for Pronunciation: Perception and Production of Lexical Stress and Prominence in Academic English
Abstract

Issues of intelligibility may arise amongst English learners when acquiring new words and phrases in North American academic settings, perhaps in part due to limited linguistic data available to the learner for understanding language use patterns. To this end, this paper examines the effects of Data‐Driven Learning for Pronunciation (DDLfP) on lexical stress and prominence in the US academic context. 65 L2 English learners in North American universities completed a diagnostic and pretest with listening and speaking items before completing four online lessons and a posttest on academic words and formulas (i.e., multi‐word sequences). Experimental group participants (n = 40) practiced using an audio corpus of highly proficient L2 speakers while comparison group participants (n = 25) were given teacher‐created pronunciation materials. Logistic regression results indicated that the group who used the corpus significantly increased their recognition of prominence in academic formulas. In the spoken tasks, both groups improved in their lexical stress pronunciation, but only the DDLfP learners improved their production of prominence in academic formulas. Learners reported that they valued DDLfP efforts for pronunciation learning across contexts and speakers. Findings have implications for teachers of L2 pronunciation and support the use of corpora for language teaching and learning.

 
Award ID(s):
2140469
PAR ID:
10484857
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
TESOL Quarterly
Volume:
58
Issue:
3
ISSN:
0039-8322
Size(s):
p. 1081-1107
Sponsoring Org:
National Science Foundation
More Like this
  1. Successful listening in a second language (L2) involves learning to identify the relevant acoustic–phonetic dimensions that differentiate between words in the L2, and then use these cues to access lexical representations during real-time comprehension. This is a particularly challenging goal to achieve when the relevant acoustic–phonetic dimensions in the L2 differ from those in the L1, as is the case for the L2 acquisition of Mandarin, a tonal language, by speakers of non-tonal languages like English. Previous work shows that tone in L2 is perceived less categorically (Shen and Froud, 2019) and weighted less in word recognition (Pelzl et al., 2019) than in L1. However, little is known about the link between categorical perception of tone and use of tone in real-time L2 word recognition at the level of the individual learner. This study presents evidence from 30 native and 29 L1-English speakers of Mandarin who completed a real-time spoken word recognition task and a tone identification task. Results show that L2 learners differed from native speakers both in the extent to which they perceived tone categorically and in their ability to use tonal cues to distinguish between words in real-time comprehension. Critically, learners who reliably distinguished between words differing by tone alone in the word recognition task also showed more categorical perception of tone on the identification task. Moreover, within this group, performance on the two tasks was strongly correlated. This provides the first direct evidence showing that the ability to perceive tone categorically is related to the weighting of tonal cues during spoken word recognition, thus contributing to a better understanding of the link between phonemic and lexical processing, which has been argued to be a key component in the L2 acquisition of tone (Wong and Perrachione, 2007).
  2. Abstract

    We present CELER (Corpus of Eye Movements in L1 and L2 English Reading), a broad coverage eye-tracking corpus for English. CELER comprises over 320,000 words, and eye-tracking data from 365 participants. Sixty-nine participants are L1 (first language) speakers, and 296 are L2 (second language) speakers from a wide range of English proficiency levels and five different native language backgrounds. As such, CELER has an order of magnitude more L2 participants than any currently available eye movements dataset with L2 readers. Each participant in CELER reads 156 newswire sentences from the Wall Street Journal (WSJ), in a new experimental design where half of the sentences are shared across participants and half are unique to each participant. We provide analyses that compare L1 and L2 participants with respect to standard reading time measures, as well as the effects of frequency, surprisal, and word length on reading times. These analyses validate the corpus and demonstrate some of its strengths. We envision CELER to enable new types of research on language processing and acquisition, and to facilitate interactions between psycholinguistics and natural language processing (NLP).

     
  3. This study examines whether second language (L2) learners' processing of an intonationally cued lexical contrast is facilitated when intonational cues signal a segmental contrast in the native language (L1). It does so by investigating Seoul Korean and French listeners' processing of intonationally cued lexical-stress contrasts in English. Neither Seoul Korean nor French has lexical stress; instead, the two languages have similar intonational systems where prominence is realized at the level of the Accentual Phrase. A critical difference between the two systems is that French has only one tonal pattern underlying the realization of the Accentual Phrase, whereas Korean has two underlying tonal patterns that depend on the laryngeal feature of the phrase-initial segment. The L and H tonal cues thus serve to distinguish segments at the lexical level in Korean but not in French; Seoul Korean listeners are therefore hypothesized to outperform French listeners when processing English lexical stress realized with only tonal cues (H* on the stressed syllable). Seoul Korean and French listeners completed a sequence-recall task with four-item sequences of English words that differed in intonationally cued lexical stress (experimental condition) or in word-initial segment (control condition). The results showed higher accuracy for Seoul Korean listeners than for French listeners only when processing English lexical stress, suggesting that the processing of an intonationally cued lexical contrast in the L2 is facilitated when intonational cues signal a segmental contrast in the L1. These results are interpreted within the scope of the cue-based transfer approach to L2 prosodic processing.
  4. Lexical tones are widely believed to be a formidable learning challenge for adult speakers of nontonal languages. While difficulties, as well as rapid improvements, are well documented for beginning second language (L2) learners, research with more advanced learners is needed to understand how tone perception difficulties impact word recognition once learners have a substantial vocabulary. The present study narrows in on difficulties suggested in previous work, which found a dissociation in advanced L2 learners between highly accurate tone identification and largely inaccurate lexical decision for tone words. We investigate a “best-case scenario” for advanced L2 tone word processing by testing performance in nearly ideal listening conditions, with words spoken clearly and in isolation. Under such conditions, do learners still have difficulty in lexical decision for tone words? If so, is it driven by the quality of lexical representations or by L2 processing routines? Advanced L2 and native Chinese listeners made lexical decisions while an electroencephalogram was recorded. Nonwords had a first syllable with either a vowel or tone that differed from that of a common disyllabic word. As a group, L2 learners performed less accurately when tones were manipulated than when vowels were manipulated. Subsequent analyses showed that this was the case even in the subset of items for which learners showed correct and confident tone identification in an offline written vocabulary test. Event-related potential results indicated N400 effects for both nonword conditions in L1, but only vowel N400 effects in L2, with tone responses intermediate between those of real words and vowel nonwords. These results are evidence of the persistent difficulty most L2 learners have in using tones for online word recognition, and indicate it is driven by a confluence of factors related to both L2 lexical representations and processing routines. We suggest that this tone nonword difficulty has real-world implications for learners: It may result in many toneless word representations in their mental lexicons, and is likely to affect the efficiency with which they can learn new tone words.
  5. Automatic pronunciation assessment (APA) plays an important role in providing feedback for self-directed language learners in computer-assisted pronunciation training (CAPT). Several mispronunciation detection and diagnosis (MDD) systems have achieved promising performance based on end-to-end phoneme recognition. However, assessing the intelligibility of second language (L2) speech remains a challenging problem. One issue is the lack of large-scale labeled speech data from non-native speakers. Additionally, relying only on one aspect (e.g., accuracy) at a phonetic level may not provide a sufficient assessment of pronunciation quality and L2 intelligibility. It is possible to leverage segmental/phonetic-level features such as goodness of pronunciation (GOP); however, feature granularity may cause a discrepancy in prosodic-level (suprasegmental) pronunciation assessment. In this study, a Wav2vec 2.0-based MDD system and a GOP feature-based Transformer are employed to characterize L2 intelligibility. Here, an L2 speech dataset, with human-annotated prosodic (suprasegmental) labels, is used for multi-granular and multi-aspect pronunciation assessment and identification of factors important for intelligibility in L2 English speech. The study provides a transformative comparative assessment of automated pronunciation scores versus the relationship between suprasegmental features and listener perceptions, which taken collectively can help support the development of instantaneous assessment tools and solutions for L2 learners.