Title: Early L2 Spoken Word Recognition Combines Input-Based and Knowledge-Based Processing

This study examines the perceptual trade-off between knowledge of a language’s statistical regularities and reliance on the acoustic signal during L2 spoken word recognition. We test how early learners track and make use of segmental and suprasegmental cues and their relative frequencies during non-native word recognition. English learners of Mandarin were taught an artificial tonal language in which a tone’s informativeness for word identification varied according to neighborhood density. The stimuli mimicked Mandarin’s uneven distribution of syllable+tone combinations by varying syllable frequency and the probability of particular tones co-occurring with a particular syllable. Use of statistical regularities was measured by four-alternative forced-choice judgments and by eye fixations to target and competitor symbols. Half of the participants were trained on one speaker (low speaker variability), while the other half were trained on four speakers. After four days of learning, the results confirmed that tones are processed according to their informativeness. Eye movements to the newly learned symbols demonstrated that L2 learners use tonal probabilities at an early stage of word recognition, regardless of speaker variability. The amount of variability in the signal, however, influenced the time course of recovery from incorrect anticipatory looks: participants exposed to low speaker variability recovered from incorrect probability-based predictions of tone more rapidly than participants exposed to greater variability. These results motivate two conclusions: first, early L2 learners track the distribution of segmental and suprasegmental co-occurrences and make predictions accordingly during spoken word recognition; second, when the acoustic input is more variable because of multi-speaker input, listeners rely more on their knowledge of tone-syllable co-occurrence frequency distributions and less on the incoming acoustic signal.

Publisher / Repository:
SAGE Publications
Journal Name:
Language and Speech
Size: p. 632-656
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Successful listening in a second language (L2) involves learning to identify the relevant acoustic–phonetic dimensions that differentiate between words in the L2, and then use these cues to access lexical representations during real-time comprehension. This is a particularly challenging goal to achieve when the relevant acoustic–phonetic dimensions in the L2 differ from those in the L1, as is the case for the L2 acquisition of Mandarin, a tonal language, by speakers of non-tonal languages like English. Previous work shows tone in L2 is perceived less categorically (Shen and Froud, 2019) and weighted less in word recognition (Pelzl et al., 2019) than in L1. However, little is known about the link between categorical perception of tone and use of tone in real-time L2 word recognition at the level of the individual learner. This study presents evidence from 30 native and 29 L1-English speakers of Mandarin who completed a real-time spoken word recognition task and a tone identification task. Results show that L2 learners differed from native speakers both in the extent to which they perceived tone categorically and in their ability to use tonal cues to distinguish between words in real-time comprehension. Critically, learners who reliably distinguished between words differing by tone alone in the word recognition task also showed more categorical perception of tone on the identification task. Moreover, within this group, performance on the two tasks was strongly correlated. This provides the first direct evidence showing that the ability to perceive tone categorically is related to the weighting of tonal cues during spoken word recognition, thus contributing to a better understanding of the link between phonemic and lexical processing, which has been argued to be a key component in the L2 acquisition of tone (Wong and Perrachione, 2007).
  2.
    Lexical tones are widely believed to be a formidable learning challenge for adult speakers of nontonal languages. While difficulties, as well as rapid improvements, are well documented for beginning second language (L2) learners, research with more advanced learners is needed to understand how tone perception difficulties impact word recognition once learners have a substantial vocabulary. The present study narrows in on difficulties suggested in previous work, which found a dissociation in advanced L2 learners between highly accurate tone identification and largely inaccurate lexical decision for tone words. We investigate a “best-case scenario” for advanced L2 tone word processing by testing performance in nearly ideal listening conditions, with words spoken clearly and in isolation. Under such conditions, do learners still have difficulty in lexical decision for tone words? If so, is it driven by the quality of lexical representations or by L2 processing routines? Advanced L2 and native Chinese listeners made lexical decisions while an electroencephalogram was recorded. Nonwords had a first syllable with either a vowel or tone that differed from that of a common disyllabic word. As a group, L2 learners performed less accurately when tones were manipulated than when vowels were manipulated. Subsequent analyses showed that this was the case even in the subset of items for which learners showed correct and confident tone identification in an offline written vocabulary test. Event-related potential results indicated N400 effects for both nonword conditions in L1, but only vowel N400 effects in L2, with tone responses intermediate between those of real words and vowel nonwords. These results are evidence of the persistent difficulty most L2 learners have in using tones for online word recognition, and indicate it is driven by a confluence of factors related to both L2 lexical representations and processing routines. We suggest that this tone nonword difficulty has real-world implications for learners: It may result in many toneless word representations in their mental lexicons, and is likely to affect the efficiency with which they can learn new tone words.
  3.
    A notoriously contested subarea of phonological typology is word-prosodic typology, which governs suprasegmental structure (such as tone, syllable structure and stress) at the word level. Within word-prosodic typology, it is widely recognized that some languages have so-called stress systems while others have lexical-tone systems. Other languages appear to have intermediate systems, with properties of both stress and lexically contrastive tone. Certain types of such intermediate systems are at the core of ongoing theoretical debates on the nature of word-prosodic systems, viz. language varieties with contrasts between two word tones that are restricted to the main-stressed syllables of a word, a phenomenon that is often descriptively referred to as tonal accent. In this paper, we aim to show that exploring tone-accent systems in detail has the potential to significantly contribute to word-prosodic typology, specifically concerning the foot as a tool for the analysis of syllable-internal prosodic contrasts. The phonology of tonal accent in Franconian (a variety of West Germanic spoken in parts of Belgium, Germany, and the Netherlands) will be the main piece of evidence supporting our claims, with a focus on predictable interactions between segmental structure and accentuation. A central implication of our analysis is that tonal contrasts within syllables can sometimes derive from two types of feet being active in the same prosodic system. We support the Franconian evidence with analogous tone-segment interactions in Estonian and discuss the relevance of our claims in the broader context of word-prosodic typology.
  4.
    People who grow up speaking a language without lexical tones typically find it difficult to master tonal languages after childhood. Accumulating research suggests that much of the challenge for these second language (L2) speakers has to do not with identification of the tones themselves, but with the bindings between tones and lexical units. The question that remains open is to what extent these lexical binding problems are problems of encoding (incomplete knowledge of the tone-to-word relations) vs. retrieval (failure to access those relations in online processing). While recent work using lexical decision tasks suggests that both may play a role, one issue is that failure on a lexical decision task may reflect a lack of learner confidence about what is not a word, rather than non-native representation or processing of known words. Here we provide complementary evidence using a picture-phonology matching paradigm in Mandarin in which participants decide whether or not a spoken target matches a specific image, with concurrent event-related potential (ERP) recording to provide potential insight into differences in L1 and L2 tone processing strategies. As in the lexical decision case, we find that advanced L2 learners show a clear disadvantage in accurately identifying tone mismatched targets relative to vowel mismatched targets. We explore the contribution of incomplete/uncertain lexical knowledge to this performance disadvantage by examining individual data from an explicit tone knowledge post-test. Results suggest that explicit tone word knowledge and confidence explain some but not all of the errors in picture-phonology matching. Analysis of ERPs from correct trials shows some differences in the strength of L1 and L2 responses, but does not provide clear evidence toward differences in processing that could explain the L2 disadvantage for tones. In sum, these results converge with previous evidence from lexical decision tasks in showing that advanced L2 listeners continue to have difficulties with lexical tone recognition, and in suggesting that these difficulties reflect problems both in encoding lexical tone knowledge and in retrieving that knowledge in real time.