skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Individual variability in subcortical neural encoding shapes phonetic cue weighting
Abstract Recent studies have revealed great individual variability in cue weighting, and such variation is shown to be systematic across individuals and linked to differences in some general cognitive mechanism. The present study investigated the role of subcortical encoding as a source of individual variability in cue weighting by focusing on English listeners’ frequency following responses to the tense/lax English vowel contrast varying in spectral and durational cues. Listeners differed in early auditory encoding with some encoding the spectral cue more veridically than the durational one, while others exhibited the reverse pattern. These differences in cue encoding further correlate with behavioral variability in cue weighting, suggesting that specificity in cue encoding across individuals modulates how cues are weighted in downstream processes.  more » « less
Award ID(s):
1827409
PAR ID:
10440815
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Scientific Reports
Volume:
13
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Recent studies have documented substantial variability among typical listeners in how gradiently they categorize speech sounds, and this variability in categorization gradience may link to how listeners weight different cues in the incoming signal. The present study tested the relationship between categorization gradience and cue weighting across two sets of English contrasts, each varying orthogonally in two acoustic dimensions. Participants performed a four‐alternative forced‐choice identification task in a visual world paradigm while their eye movements were monitored. We found that (a) greater categorization gradience derived from behavioral identification responses corresponds to larger secondary cue weights derived from eye movements; (b) the relationship between categorization gradience and secondary cue weighting is observed across cues and contrasts, suggesting that categorization gradience may be a consistent within‐individual property in speech perception; and (c) listeners who showed greater categorization gradience tend to adopt a buffered processing strategy, especially when cues arrive asynchronously in time. 
    more » « less
  2. This study examines apparent-time variation in the use of multiple acoustic cues present on coarticulatorily nasalized vowels in California English. Eighty-nine listeners ranging in age from 18-58 (grouped into 3 apparent-time categories based on year of birth) performed lexical identifications on syllables excised from words with oral and nasal codas from six speakers who produced either minimal (n=3) or extensive (n=3) anticipatory nasal coarticulation (realized by greater vowel nasalization, F1 bandwidth, and diphthongization on vowels in CVN contexts). Results showed no differences across listeners’ identification for Extensively coarticulated vowels, as well as oral vowels by both types of speakers (all at-ceiling). Yet, performance for the Minimal Coarticulators’ nasalized vowels was lowest for the older listener group and increased over apparent-time. Perceptual cue-weighting analyses revealed that older listeners rely more on F1 bandwidth, while younger listeners rely more on acoustic nasality, as coarticulatory cues providing information about lexical identity. Thus, there is evidence for variation in apparent- time in the use of the different coarticulatory cues present on vowels. Younger listeners’ cue weighting allows them flexibility to identify lexical items given a range of coarticulatory variation across (here, younger) speakers, while older listeners’ cue weighting leads to reduced performance for talkers producing innovative phonetic forms. This study contributes to our understanding of the relationship between multidimensional acoustic features resulting from coarticulation and the perceptual re-weighting of cues that can lead to sound change over time. 
    more » « less
  3. Speech categories are defined by multiple acoustic dimensions and their boundaries are generally fuzzy and ambiguous in part because listeners often give differential weighting to these cue dimensions during phonetic categorization. This study explored how a listener's perception of a speaker's socio-indexical and personality characteristics influences the listener's perceptual cue weighting. In a matched-guise study, three groups of listeners classified a series of gender-neutral /b/-/p/ continua that vary in VOT and F0 at the onset of the following vowel. Listeners were assigned to one of three prompt conditions (i.e., a visually male talker, a visually female talker, or audio-only) and rated the talker in terms of vocal (and facial, in the visual prompt conditions) gender prototypicality, attractiveness, friendliness, confidence, trustworthiness, and gayness. Male listeners and listeners who saw a male face showed less reliance on VOT compared to listeners in the other conditions. Listeners' visual evaluation of the talker also affected their weighting of VOT and onset F0 cues, although the effects of facial impressions differ depending on the gender of the listener. The results demonstrate that individual differences in perceptual cue weighting are modulated by the listener's gender and his/her subjective evaluation of the talker. These findings lend support for exemplar-based models of speech perception and production where socio-indexical features are encoded as a part of the episodic traces in the listeners' mental lexicon. This study also shed light on the relationship between individual variation in cue weighting and community-level sound change by demonstrating that VOT and onset F0 co-variation in North American English has acquired a certain degree of socio-indexical significance. 
    more » « less
  4. Anticipatory coarticulation is a highly informative cue to upcoming linguistic information: listeners can identify that the word is ben and not bed by hearing the vowel alone. The present study compares the relative performances of human listeners and a self-supervised pre-trained speech model (wav2vec 2.0) in the use of nasal coarticulation to classify vowels. Stimuli consisted of nasalized (from CVN words) and non-nasalized (from CVCs) American English vowels produced by 60 humans and generated in 36 TTS voices. wav2vec 2.0 performance is similar to human listener performance, in aggregate. Broken down by vowel type: both wav2vec 2.0 and listeners perform higher for non-nasalized vowels produced naturally by humans. However, wav2vec 2.0 shows higher correct classification performance for nasalized vowels, than for non-nasalized vowels, for TTS voices. Speaker-level patterns reveal that listeners' use of coarticulation is highly variable across talkers. wav2vec 2.0 also shows cross-talker variability in performance. Analyses also reveal differences in the use of multiple acoustic cues in nasalized vowel classifications across listeners and the wav2vec 2.0. Findings have implications for understanding how coarticulatory variation is used in speech perception. Results also can provide insight into how neural systems learn to attend to the unique acoustic features of coarticulation. 
    more » « less
  5. This study examines the acoustic realizations of American English intervocalic flaps in the TIMIT corpus, using the landmark-critical feature-cue-based framework. Three different acoustic patterns of flaps are described: (i) both closure and release landmarks present, (ii) only the closure landmark present, and (iii) both landmarks deleted. The patterns occur consistently across several phonological and morphological conditions but vary with sociolinguistic factors, including speaker dialect and gender. This method of analysing speech at the level of acoustic landmarks and other individual cues to distinctive features contributes to a deeper understanding of how speakers and listeners employ systematic variation in phonetic detail in speech processing. 
    more » « less