skip to main content


Title: A phonetic model of non-native spoken word processing
Non-native speakers show difficulties with spoken word processing. Many studies attribute these difficulties to imprecise phonological encoding of words in the lexical memory. We test an alternative hypothesis: that some of these difficulties can arise from the non-native speakers' phonetic perception. We train a computational model of phonetic learning, which has no access to phonology, on either one or two languages. We first show that the model exhibits predictable behaviors on phone-level and word-level discrimination tasks. We then test the model on a spoken word processing task, showing that phonology may not be necessary to explain some of the word processing effects observed in non-native speakers. We run an additional analysis of the model's lexical representation space, showing that the two training languages are not fully separated in that space, similarly to the languages of a bilingual human speaker.  more » « less
Award ID(s):
1734245
NSF-PAR ID:
10226921
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics
Page Range / eLocation ID:
1480-1490
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    People who grow up speaking a language without lexical tones typically find it difficult to master tonal languages after childhood. Accumulating research suggests that much of the challenge for these second language (L2) speakers has to do not with identification of the tones themselves, but with the bindings between tones and lexical units. The question that remains open is how much of these lexical binding problems are problems of encoding (incomplete knowledge of the tone-to-word relations) vs. retrieval (failure to access those relations in online processing). While recent work using lexical decision tasks suggests that both may play a role, one issue is that failure on a lexical decision task may reflect a lack of learner confidence about what is not a word, rather than non-native representation or processing of known words. Here we provide complementary evidence using a picture- phonology matching paradigm in Mandarin in which participants decide whether or not a spoken target matches a specific image, with concurrent event-related potential (ERP) recording to provide potential insight into differences in L1 and L2 tone processing strategies. As in the lexical decision case, we find that advanced L2 learners show a clear disadvantage in accurately identifying tone mismatched targets relative to vowel mismatched targets. We explore the contribution of incomplete/uncertain lexical knowledge to this performance disadvantage by examining individual data from an explicit tone knowledge post-test. Results suggest that explicit tone word knowledge and confidence explains some but not all of the errors in picture-phonology matching. Analysis of ERPs from correct trials shows some differences in the strength of L1 and L2 responses, but does not provide clear evidence toward differences in processing that could explain the L2 disadvantage for tones. In sum, these results converge with previous evidence from lexical decision tasks in showing that advanced L2 listeners continue to have difficulties with lexical tone recognition, and in suggesting that these difficulties reflect problems both in encoding lexical tone knowledge and in retrieving that knowledge in real time. 
    more » « less
  2. null (Ed.)
    Successful listening in a second language (L2) involves learning to identify the relevant acoustic–phonetic dimensions that differentiate between words in the L2, and then use these cues to access lexical representations during real-time comprehension. This is a particularly challenging goal to achieve when the relevant acoustic–phonetic dimensions in the L2 differ from those in the L1, as is the case for the L2 acquisition of Mandarin, a tonal language, by speakers of non-tonal languages like English. Previous work shows tone in L2 is perceived less categorically (Shen and Froud, 2019) and weighted less in word recognition (Pelzl et al., 2019) than in L1. However, little is known about the link between categorical perception of tone and use of tone in real time L2 word recognition at the level of the individual learner. This study presents evidence from 30 native and 29 L1-English speakers of Mandarin who completed a real-time spoken word recognition and a tone identification task. Results show that L2 learners differed from native speakers in both the extent to which they perceived tone categorically as well as in their ability to use tonal cues to distinguish between words in real-time comprehension. Critically, learners who reliably distinguished between words differing by tone alone in the word recognition task also showed more categorical perception of tone on the identification task. Moreover, within this group, performance on the two tasks was strongly correlated. This provides the first direct evidence showing that the ability to perceive tone categorically is related to the weighting of tonal cues during spoken word recognition, thus contributing to a better understanding of the link between phonemic and lexical processing, which has been argued to be a key component in the L2 acquisition of tone (Wong and Perrachione, 2007). 
    more » « less
  3. null (Ed.)
    Abstract Lexical tones are widely believed to be a formidable learning challenge for adult speakers of nontonal languages. While difficulties—as well as rapid improvements—are well documented for beginning second language (L2) learners, research with more advanced learners is needed to understand how tone perception difficulties impact word recognition once learners have a substantial vocabulary. The present study narrows in on difficulties suggested in previous work, which found a dissociation in advanced L2 learners between highly accurate tone identification and largely inaccurate lexical decision for tone words. We investigate a “best-case scenario” for advanced L2 tone word processing by testing performance in nearly ideal listening conditions—with words spoken clearly and in isolation. Under such conditions, do learners still have difficulty in lexical decision for tone words? If so, is it driven by the quality of lexical representations or by L2 processing routines? Advanced L2 and native Chinese listeners made lexical decisions while an electroencephalogram was recorded. Nonwords had a first syllable with either a vowel or tone that differed from that of a common disyllabic word. As a group, L2 learners performed less accurately when tones were manipulated than when vowels were manipulated. Subsequent analyses showed that this was the case even in the subset of items for which learners showed correct and confident tone identification in an offline written vocabulary test. Event-related potential results indicated N400 effects for both nonword conditions in L1, but only vowel N400 effects in L2, with tone responses intermediate between those of real words and vowel nonwords. These results are evidence of the persistent difficulty most L2 learners have in using tones for online word recognition, and indicate it is driven by a confluence of factors related to both L2 lexical representations and processing routines. We suggest that this tone nonword difficulty has real-world implications for learners: It may result in many toneless word representations in their mental lexicons, and is likely to affect the efficiency with which they can learn new tone words. 
    more » « less
  4. Learning to process speech in a foreign language involves learning new representations for mapping the auditory signal to linguistic structure. Behavioral experiments suggest that even listeners that are highly proficient in a non-native language experience interference from representations of their native language. However, much of the evidence for such interference comes from tasks that may inadvertently increase the salience of native language competitors. Here we tested for neural evidence of proficiency and native language interference in a naturalistic story listening task. We studied electroencephalography responses of 39 native speakers of Dutch (14 male) to an English short story, spoken by a native speaker of either American English or Dutch. We modeled brain responses with multivariate temporal response functions, using acoustic and language models. We found evidence for activation of Dutch language statistics when listening to English, but only when it was spoken with a Dutch accent. This suggests that a naturalistic, monolingual setting decreases the interference from native language representations, whereas an accent in the listener's own native language may increase native language interference, by increasing the salience of the native language and activating native language phonetic and lexical representations. Brain responses suggest that such interference stems from words from the native language competing with the foreign language in a single word recognition system, rather than being activated in a parallel lexicon. We further found that secondary acoustic representations of speech (after 200 ms latency) decreased with increasing proficiency. This may reflect improved acoustic–phonetic models in more proficient listeners.

    Significance StatementBehavioral experiments suggest that native language knowledge interferes with foreign language listening, but such effects may be sensitive to task manipulations, as tasks that increase metalinguistic awareness may also increase native language interference. This highlights the need for studying non-native speech processing using naturalistic tasks. We measured neural responses unobtrusively while participants listened for comprehension and characterized the influence of proficiency at multiple levels of representation. We found that salience of the native language, as manipulated through speaker accent, affected activation of native language representations: significant evidence for activation of native language (Dutch) categories was only obtained when the speaker had a Dutch accent, whereas no significant interference was found to a speaker with a native (American) accent.

     
    more » « less
  5. Abstract

    A notoriously contested subarea of phonological typology is word-prosodic typology, which governs suprasegmental structure (such as tone, syllable structure and stress) at the word level. Within word-prosodic typology, it is widely recognized that some languages have so-called stress systems while others have lexical-tone systems. Other languages appear to have intermediate systems, with properties of both stress and lexically contrastive tone. Certain types of such intermediate systems are at the core of ongoing theoretical debates on the nature of word- prosodic systems, viz. language varieties with contrasts between two word tones that are restricted to the main-stressed syllables of a word, a phenomenon that is often descriptively referred to as tonal accent. In this paper, we aim to show that exploring tone-accent systems in detail has the potential to significantly contribute to word-prosodic typology, specifically concerning the foot as a tool for the analysis of syllable-internal prosodic contrasts. The phonology of tonal accent in Franconian (a variety of West Germanic spoken in parts of Belgium, Germany, and the Netherlands) will be the main piece of evidence supporting our claims, with a focus on predictable interactions between segmental structure and accentuation. A central implication of our analysis is that tonal contrasts within syllables can sometimes derive from two types of feet being active in the same prosodic system. We support the Franconian evidence with analogous tone-segment interactions in Estonian and discuss the relevance of our claims in the broader context of word-prosodic typology.

     
    more » « less