Objective: This study investigated the degrees of lenition, or consonantal weakening, in the production of Spanish stop consonants by native English speakers during a study abroad (SA) program. Lenition is a key phonological process in Spanish, where voiced stops (/b/, /d/, /ɡ/) typically weaken to fricatives or approximants in specific phonetic environments. For L2 learners, mastering this subtle process is essential for achieving native-like pronunciation. Methods: To assess the learners’ progress in acquiring lenition, we employed Phonet, a deep learning model. Unlike traditional quantitative acoustic methods that focus on measuring the physical properties of speech sounds, Phonet utilizes recurrent neural networks to predict the posterior probabilities of phonological features, particularly sonorant and continuant characteristics, which are central to the lenition process. Results: The results indicated that while learners showed progress in producing the fricative-like variants of lenition during the SA program and understood how to produce lenition in appropriate contexts, the retention of these phonological gains was not sustained after their return. Additionally, unlike native speakers, the learners never fully achieved the approximant-like realization of lenition. Conclusions: These findings underscore the need for sustained exposure and practice beyond the SA experience to ensure the long-term retention of L2 phonological patterns. While SA programs offer valuable opportunities for enhancing L2 pronunciation, they should be supplemented with ongoing support to consolidate and extend the gains achieved during the immersive experience.
This content will become publicly available on January 1, 2026
Measuring the Impact of Segmental Deviation on Perceptions of Accentedness using Gradient Phonological Class Features
Using Phonet (Vásquez-Correa et al., 2019), a neural network-based model, we generate vector representations of speech segments consisting of phonological class probabilities and use these representations to quantify segmental deviations in the English of native Hindi speakers from American English (AE) and Indian English (IE) baselines, in order to explain how these deviations impact perceptions of accentedness by native AE speakers. The primary focus is on three AE phonemes and their realizations in Hindi English (HE) and Indian English: the labiovelar approximant /w/, often produced as the labiodental approximant [ʋ]; the alveolar stop /t/, commonly realized as the retroflex stop [ʈ]; and the rhotic approximant /ɹ/, rendered as the rhotic tap [ɾ]. Multinomial logistic regressions of Euclidean distances from HE segments to AE/IE baselines on accent ratings show that larger distances from AE baselines increase the likelihood of perceiving stronger accents while larger distances from IE baselines decrease the likelihood. Changes in the probability distributions of contrastive phonological classes are found to correlate with the strength of the perceived accent. These results offer valuable insights into the interplay between native phonology and the perception of accented speech.
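The distance measure described in the abstract can be illustrated with a minimal sketch. It assumes each segment is a vector of phonological class posteriors and that a baseline is summarized by the mean vector of its tokens; the abstract does not say how baselines were aggregated, so the centroid is an assumption. The class names, function names, and numeric values below are hypothetical toy inputs, not data or code from the study.

```python
import numpy as np

# Hypothetical class inventory; the study's actual feature set is not listed here.
CLASSES = ["sonorant", "continuant", "labial", "dental"]

def baseline_centroid(segment_vectors):
    """Mean posterior vector over all baseline tokens of one phoneme (assumed summary)."""
    return np.mean(np.asarray(segment_vectors, dtype=float), axis=0)

def euclidean_deviation(segment_vector, centroid):
    """Euclidean distance between one segment's posterior vector and a baseline centroid."""
    diff = np.asarray(segment_vector, dtype=float) - np.asarray(centroid, dtype=float)
    return float(np.linalg.norm(diff))

# Toy posterior vectors (illustrative values only), ordered as in CLASSES.
ae_w_tokens = [[0.90, 0.90, 0.80, 0.10],
               [0.95, 0.85, 0.90, 0.05]]
he_w_token = [0.90, 0.90, 0.70, 0.60]

ae_centroid = baseline_centroid(ae_w_tokens)
distance_to_ae = euclidean_deviation(he_w_token, ae_centroid)
print(f"distance from AE baseline: {distance_to_ae:.3f}")
```

Distances of this kind, computed separately against the AE and IE baselines, would then serve as predictors in a regression on accent ratings.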
- Award ID(s):
- 2037266
- PAR ID:
- 10639426
- Publisher / Repository:
- University of Massachusetts Amherst Libraries
- Date Published:
- Journal Name:
- Society for Computation in Linguistics
- Volume:
- 8
- Issue:
- 1
- ISSN:
- 2834-1007
- Subject(s) / Keyword(s):
- SLM SLM-r PAM Neural Networks accentedness phonology
- Format(s):
- Medium: X Other: application/pdf
- Sponsoring Org:
- National Science Foundation
More Like this
-
This study investigated the acquisition of lenition in Spanish voiced stops (/b, d, ɡ/) by native English speakers during a study-abroad program, focusing on individual differences and influencing factors. Lenition, characterized by the weakening of stops into fricative-like ([β], [ð], [ɣ]) or approximant-like ([β̞], [ð̞], [ɣ̞]) forms, poses challenges for L2 learners due to its gradient nature and the absence of analogous approximant forms in English. Results indicated that learners aligned with native speakers in recognizing voicing as the primary cue for lenition, yet their productions diverged, favoring fricative-like over approximant-like realizations. This preference reflects the combined influence of articulatory ease, acoustic salience, and cognitive demands. Individual variability in learners’ trajectories highlights the role of exposure to native input and sociolinguistic engagement. Learners benefitting from richer, informal interactions with native speakers showed greater alignment with native patterns, while others demonstrated more limited progress. However, native input alone was insufficient for learners to internalize subtler distinctions such as place of articulation and stress. These findings emphasize the need for combining immersive experiences with targeted instructional strategies to address articulatory and cognitive challenges. This study contributes to the understanding of L2 phonological acquisition and offers insights for designing more effective language learning programs to support lenition acquisition in Spanish.
-
Learning to process speech in a foreign language involves learning new representations for mapping the auditory signal to linguistic structure. Behavioral experiments suggest that even listeners that are highly proficient in a non-native language experience interference from representations of their native language. However, much of the evidence for such interference comes from tasks that may inadvertently increase the salience of native language competitors. Here we tested for neural evidence of proficiency and native language interference in a naturalistic story listening task. We studied electroencephalography responses of 39 native speakers of Dutch (14 male) to an English short story, spoken by a native speaker of either American English or Dutch. We modeled brain responses with multivariate temporal response functions, using acoustic and language models. We found evidence for activation of Dutch language statistics when listening to English, but only when it was spoken with a Dutch accent. This suggests that a naturalistic, monolingual setting decreases the interference from native language representations, whereas an accent in the listener's own native language may increase native language interference, by increasing the salience of the native language and activating native language phonetic and lexical representations. Brain responses suggest that such interference stems from words from the native language competing with the foreign language in a single word recognition system, rather than being activated in a parallel lexicon. We further found that secondary acoustic representations of speech (after 200 ms latency) decreased with increasing proficiency. This may reflect improved acoustic–phonetic models in more proficient listeners. 
Significance Statement: Behavioral experiments suggest that native language knowledge interferes with foreign language listening, but such effects may be sensitive to task manipulations, as tasks that increase metalinguistic awareness may also increase native language interference. This highlights the need for studying non-native speech processing using naturalistic tasks. We measured neural responses unobtrusively while participants listened for comprehension and characterized the influence of proficiency at multiple levels of representation. We found that salience of the native language, as manipulated through speaker accent, affected activation of native language representations: significant evidence for activation of native language (Dutch) categories was only obtained when the speaker had a Dutch accent, whereas no significant interference was found to a speaker with a native (American) accent.
-
Purpose: The “bubble noise” technique has recently been introduced as a method to identify the regions in time–frequency maps (i.e., spectrograms) of speech that are especially important for listeners in speech recognition. This technique identifies regions of “importance” that are specific to the speech stimulus and the listener, thus permitting these regions to be compared across different listener groups. For example, in cross-linguistic and second-language (L2) speech perception, this method identifies differences in regions of importance in accomplishing decisions of phoneme category membership. This research note describes the application of bubble noise to the study of language learning for 3 different language pairs: Hindi English bilinguals' perception of the /v/–/w/ contrast in American English, native English speakers' perception of the tense/lax contrast for Korean fricatives and affricates, and native English speakers' perception of Mandarin lexical tone. Conclusion: We demonstrate that this technique provides insight on what information in the speech signal is important for native/first-language listeners compared to nonnative/L2 listeners. Furthermore, the method can be used to examine whether L2 speech perception training is effective in bringing the listener's attention to the important cues.
-
Abstract: Communicating with a speaker with a different accent can affect one’s own speech. Despite the strength of evidence for perception-production transfer in speech, the nature of transfer has remained elusive, with variable results regarding the acoustic properties that transfer between speakers and the characteristics of the speakers who exhibit transfer. The current study investigates perception-production transfer through the lens of statistical learning across passive exposure to speech. Participants experienced a short sequence of acoustically variable minimal pair (beer/pier) utterances conveying either an accent or typical American English acoustics, categorized a perceptually ambiguous test stimulus, and then repeated the test stimulus aloud. In the canonical condition, /b/–/p/ fundamental frequency (F0) and voice onset time (VOT) covaried according to typical English patterns. In the reverse condition, the F0 × VOT relationship reversed to create an “accent” with speech input regularities atypical of American English. Replicating prior studies, F0 played less of a role in perceptual speech categorization in reverse compared with canonical statistical contexts. Critically, this down-weighting transferred to production, with systematic down-weighting of F0 in listeners’ own speech productions in reverse compared with canonical contexts that was robust across male and female participants. Thus, the mapping of acoustics to speech categories is rapidly adjusted by short-term statistical learning across passive listening and these adjustments transfer to influence listeners’ own speech productions.
