Title: Perceptual consequences of native and non-native clear speech
Native talkers are able to enhance acoustic characteristics of their speech in a speaking style known as “clear speech,” which is better understood by listeners than “plain speech.” However, despite substantial research in the area of clear speech, it is less clear whether non-native talkers of various proficiency levels are able to adopt a clear speaking style and, if so, whether this style has perceptual benefits for native listeners. In the present study, native English listeners evaluated plain and clear speech produced by three groups: native English talkers, non-native talkers with lower proficiency, and non-native talkers with higher proficiency. Listeners completed a transcription task (i.e., an objective measure of speech intelligibility). We investigated intelligibility as a function of language background and proficiency, and we also examined the acoustic modifications associated with these perceptual benefits. The results suggest that both native and non-native talkers modulate their speech when asked to adopt a clear speaking style, but that the size of the acoustic modifications, as well as the perceptual consequences of this speaking style, differs as a function of language background and language proficiency.
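To make the transcription measure concrete, here is a minimal Python sketch of keyword-based intelligibility scoring of the kind such tasks typically use. The function names, the normalization step, and the exact scoring rule are illustrative assumptions, not the authors' procedure.

```python
# Hypothetical sketch of keyword-based intelligibility scoring for a
# transcription task; all names and the scoring rule are illustrative.
import string

def normalize(word: str) -> str:
    """Lowercase a word and strip punctuation so 'Dog!' matches 'dog'."""
    return word.lower().strip(string.punctuation)

def intelligibility(target: str, response: str) -> float:
    """Proportion of target words that appear in the listener's transcription."""
    target_words = [normalize(w) for w in target.split()]
    response_words = {normalize(w) for w in response.split()}
    hits = sum(w in response_words for w in target_words)
    return hits / len(target_words) if target_words else 0.0

# Example: one sentence scored against a listener's transcription.
print(intelligibility("the cat chased the dog", "cat chased dog"))  # 0.6
```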
Award ID(s): 1941739
NSF-PAR ID: 10325320
Author(s) / Creator(s): ;
Date Published:
Journal Name: The Journal of the Acoustical Society of America
Volume: 151
Issue: 2
ISSN: 0001-4966
Page Range / eLocation ID: 1246 to 1258
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Previous research has shown that native listeners benefit from clearly produced speech, as well as from predictable semantic context, when these enhancements are delivered in native speech. However, it is unclear whether native listeners benefit from acoustic and semantic enhancements differently when listening to other varieties of speech, including non-native speech. The current study examines to what extent native English listeners benefit from acoustic and semantic cues present in native and non-native English speech. Native English listeners transcribed sentence-final words of varying semantic predictability, produced in plain or clear speaking styles by native English talkers and by native Mandarin talkers of higher and lower proficiency in English. The perception results demonstrated that listeners benefited from semantic cues in both higher- and lower-proficiency talkers' speech (i.e., they transcribed speech more accurately), but not from acoustic cues, even though higher-proficiency talkers did make substantial acoustic enhancements from plain to clear speech. These results suggest that native listeners benefit more robustly from semantic cues than from acoustic cues when those cues are embedded in non-native speech.

     
  2. Learning to process speech in a foreign language involves learning new representations for mapping the auditory signal to linguistic structure. Behavioral experiments suggest that even listeners who are highly proficient in a non-native language experience interference from representations of their native language. However, much of the evidence for such interference comes from tasks that may inadvertently increase the salience of native-language competitors. Here we tested for neural evidence of proficiency and native-language interference in a naturalistic story-listening task. We studied electroencephalography responses of 39 native speakers of Dutch (14 male) to an English short story, spoken by a native speaker of either American English or Dutch. We modeled brain responses with multivariate temporal response functions, using acoustic and language models. We found evidence for activation of Dutch language statistics when listening to English, but only when it was spoken with a Dutch accent. This suggests that a naturalistic, monolingual setting decreases interference from native-language representations, whereas an accent in the listener's own native language may increase such interference by raising the salience of the native language and activating native-language phonetic and lexical representations. Brain responses suggest that this interference stems from words of the native language competing with the foreign language in a single word-recognition system, rather than being activated in a parallel lexicon. We further found that secondary acoustic representations of speech (after 200 ms latency) decreased with increasing proficiency, which may reflect improved acoustic–phonetic models in more proficient listeners.

    Significance Statement: Behavioral experiments suggest that native-language knowledge interferes with foreign-language listening, but such effects may be sensitive to task manipulations, as tasks that increase metalinguistic awareness may also increase native-language interference. This highlights the need to study non-native speech processing using naturalistic tasks. We measured neural responses unobtrusively while participants listened for comprehension and characterized the influence of proficiency at multiple levels of representation. We found that the salience of the native language, as manipulated through speaker accent, affected activation of native-language representations: significant evidence for activation of native-language (Dutch) categories was obtained only when the speaker had a Dutch accent, whereas no significant interference was found for a speaker with a native (American) accent.
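As a rough illustration of the modeling approach, the sketch below estimates a single-channel temporal response function by ridge regression over time-lagged copies of a stimulus feature. It is a minimal sketch under assumed data shapes and parameters (lag count, regularization strength), not the authors' analysis pipeline.

```python
# Minimal, illustrative TRF estimation via ridge regression; variable names,
# lag count, and regularization are assumptions, not the authors' pipeline.
import numpy as np

def lagged_design(stim: np.ndarray, n_lags: int) -> np.ndarray:
    """Stack time-shifted copies of a 1-D stimulus feature (e.g., the
    acoustic envelope) into a [time x lags] design matrix."""
    n = len(stim)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = stim[:n - lag]
    return X

def fit_trf(stim: np.ndarray, eeg: np.ndarray, n_lags=40, alpha=1.0):
    """Ridge-regression TRF: weights mapping lagged stimulus to one EEG channel."""
    X = lagged_design(stim, n_lags)
    # Closed-form ridge solution: (X'X + alpha*I)^-1 X'y
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_lags), X.T @ eeg)

# Toy usage: recover a simulated response peaking ~20 samples post-stimulus.
rng = np.random.default_rng(0)
stim = rng.standard_normal(2000)
kernel = np.exp(-0.5 * ((np.arange(40) - 20) / 4) ** 2)
eeg = np.convolve(stim, kernel)[:2000] + 0.1 * rng.standard_normal(2000)
weights = fit_trf(stim, eeg)  # should peak near lag 20, mirroring the kernel
```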

     
  3. This study uses non-native perception data to examine the relationship between the perceived phonetic similarity of segments and their phonological patterning. Segments that are phonetically similar to one another are anticipated to pattern together phonologically, and segments that share articulatory or acoustic properties are also expected to be perceived as similar. What is not yet clear is whether segments that pattern together phonologically are perceived as similar. This study addresses this question by examining how L1 English listeners and L1 Guébie listeners perceive non-native implosive consonants compared with plosives and sonorants. English does not have contrastive implosives, whereas Guébie has a bilabial implosive, which phonologically patterns with sonorants in Guébie to the exclusion of obstruents. Two perception experiments show that English listeners make more perceptual categorization errors between implosives and voiced plosives than Guébie listeners do, but that both listener groups are more likely to classify implosives as similar to voiced plosives than to sonorants. The results also show that Guébie listeners are better than English listeners at categorizing non-native implosive consonants (i.e., alveolar implosives), indicating that listeners can extend features or gestures from their L1 to non-native implosive consonants. Together, these experiments suggest a cross-linguistic perceptual similarity hierarchy of implosives relative to other segments, one that is not affected by L1 phonological patterning.
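Categorization errors of the kind reported here are typically tallied from forced-choice trials. Below is a minimal sketch of that bookkeeping; the trial data and category labels are invented for illustration.

```python
# Hypothetical tally of categorization errors from a forced-choice perception
# task; the trial data and labels are illustrative, not the study's data.
from collections import Counter

# Each trial: (stimulus category, listener's response category)
trials = [("implosive", "plosive"), ("implosive", "implosive"),
          ("plosive", "plosive"), ("implosive", "plosive"),
          ("sonorant", "sonorant"), ("plosive", "implosive")]

confusions = Counter(trials)
n_by_stim = Counter(stim for stim, _ in trials)

# Error rate per stimulus category: proportion of mismatched responses.
for stim in n_by_stim:
    errors = sum(c for (s, r), c in confusions.items() if s == stim and r != stim)
    print(f"{stim}: {errors / n_by_stim[stim]:.2f} error rate")
```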

     
  4. Listeners attend to variation in segmental and prosodic cues when judging accent strength. The relative contributions of these cues to perceptions of accentedness in English remain open for investigation, although objective accent distance measures (such as Levenshtein distance) appear to be reliable tools for predicting perceptual distance. Levenshtein distance, however, only accounts for phonemic information in the signal. The purpose of the current study was to examine the relative contributions of phonemic (Levenshtein) and holistic acoustic (dynamic time warping) distances from the local accent to listeners' accent rankings for nine non-local native and non-native accents. Listeners (n = 52) ranked talkers on perceived distance from the local accent (Midland American English) using a ladder task for three sentence-length stimuli. Phonemic and holistic acoustic distances between Midland American English and the other accents were quantified using weighted and unweighted Levenshtein distance measures and dynamic time warping (DTW). Results reveal that all three metrics contribute to perceived accent distance, with the weighted Levenshtein measure slightly outperforming the others. Moreover, the relative contribution of phonemic and holistic acoustic cues was driven by the speaker's accent. Both non-native and non-local native accents were included in this study, and the benefits of considering both accent groups when studying the phonemic and acoustic cues used by listeners are discussed.
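The two distance measures named above are standard algorithms, sketched below: unweighted Levenshtein distance over phoneme strings and DTW over acoustic feature frames. The inputs (ARPABET-style transcriptions, a generic feature matrix) are assumptions for illustration; the study's weighted variant and feature extraction are not reproduced here.

```python
# Illustrative implementations of the two accent-distance measures:
# Levenshtein distance over phoneme sequences and DTW over feature frames.
import numpy as np

def levenshtein(a: list, b: list) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i, j] = min(d[i - 1, j] + 1,                          # deletion
                          d[i, j - 1] + 1,                          # insertion
                          d[i - 1, j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return int(d[-1, -1])

def dtw(x: np.ndarray, y: np.ndarray) -> float:
    """DTW cost between two [frames x features] sequences (Euclidean local cost)."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# Phonemic distance between two pronunciations of "water" (ARPABET-style):
print(levenshtein(["W", "AO", "T", "ER"], ["W", "AA", "DX", "ER"]))  # 2
```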
  5. Speech alignment is the phenomenon in which talkers subconsciously adopt the speech and language patterns of their interlocutor. Nowadays, people of all ages speak with voice-activated, artificially intelligent (voice-AI) digital assistants through phones or smart speakers. This study examines the effects of participants' age (older adults, 53–81 years old, vs. younger adults, 18–39 years old) and gender (female and male) on the degree of speech alignment during shadowing of (female and male) human and voice-AI (Apple's Siri) productions. Degree of alignment was assessed holistically via a perceptual AXB ratings task completed by a separate group of listeners. Results reveal that older and younger adults display distinct patterns of alignment based on the humanness and gender of the model talkers: older adults displayed greater alignment toward the female human and device voices, while younger adults aligned to a greater extent toward the male human voice. Additionally, other gender-mediated differences were observed, all of which interacted with model talker category (voice-AI vs. human) or shadower age category (older vs. younger adults). Taken together, these results suggest a complex interplay of social dynamics in alignment, which can inform models of speech production in both human-human and human-device interaction.
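AXB ratings of this kind reduce to a simple proportion. The sketch below shows one plausible scoring scheme, assuming each trial pairs a pre-shadowing token (A) and a post-shadowing token (B) against the model talker (X); the data layout is an assumption, not the study's design.

```python
# Minimal sketch of scoring an AXB alignment task: raters judge whether the
# pre- or post-shadowing token sounds more like the model talker.
# The trial data below are invented for illustration.
ratings = ["post", "post", "pre", "post", "pre", "post", "post", "post"]

# Degree of alignment: proportion of trials where the post-shadowing token
# was judged more similar to the model; > 0.5 indicates convergence.
alignment = ratings.count("post") / len(ratings)
print(f"alignment score: {alignment:.2f}")  # 0.75
```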