Title: Individual differences in the perception of phonetic category structure predict speech-in-noise performance
Speech sounds exist in a complex acoustic–phonetic space, and listeners vary in the extent to which they are sensitive to variability within a speech sound category (“gradience”) and the degree to which they show stable, consistent responses to phonetic stimuli. Here, we investigate the hypothesis that individual differences in the perception of the sound categories of one's language may aid speech-in-noise performance across the adult lifespan. Declines in speech-in-noise performance are well documented in healthy aging and are, unsurprisingly, associated with differences in hearing ability. Nonetheless, hearing status and age are incomplete predictors of speech-in-noise performance, and long-standing research suggests that this ability draws on more complex cognitive and perceptual factors. In this study, a group of adults ranging in age from 18 to 67 years completed online assessments designed to measure phonetic category sensitivity, questionnaires querying recent noise exposure history and demographic factors, and, crucially, a test of speech-in-noise perception. Results show that individual differences in the perception of two consonant contrasts significantly predict speech-in-noise performance, even after accounting for age and recent noise exposure history. This finding supports the hypothesis that individual differences in sensitivity to phonetic categories mediate speech perception in challenging listening situations.
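To make the analysis logic concrete, here is a minimal, hypothetical sketch of the kind of model comparison the abstract describes: testing whether a phonetic category sensitivity measure predicts speech-in-noise scores over and above age and recent noise exposure. All column names and the data file are illustrative stand-ins, not the authors' materials.

# Hypothetical sketch, assuming a per-listener table of scores.
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative columns: sin_score (a speech-in-noise score),
# category_sensitivity (e.g., consistency of identification responses
# along a consonant continuum), age in years, and a self-reported
# recent noise exposure index.
data = pd.read_csv("listeners.csv")  # placeholder file name

# Baseline model: demographic and exposure predictors only.
base = smf.ols("sin_score ~ age + noise_exposure", data=data).fit()

# Full model adds the phonetic measure of interest.
full = smf.ols("sin_score ~ age + noise_exposure + category_sensitivity",
               data=data).fit()

# A reliable coefficient for category_sensitivity, plus improved fit
# over the baseline model, would mirror the abstract's central claim.
print(full.summary())
print("Delta R^2:", full.rsquared - base.rsquared)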
Feldman, Naomi H.; Goldwater, Sharon; Dupoux, Emmanuel; Schatz, Thomas
(Open Mind)
Abstract: Early changes in infants’ ability to perceive native and nonnative speech sound contrasts are typically attributed to their developing knowledge of phonetic categories. We critically examine this hypothesis and argue that there is little direct evidence of category knowledge in infancy. We then propose an alternative account in which infants’ perception changes because they are learning a perceptual space that is appropriate to represent speech, without yet carving up that space into phonetic categories. If correct, this new account has substantial implications for understanding early language development.
Luthra, Sahil; Correia, João M.; Kleinschmidt, Dave F.; Mesite, Laura; Myers, Emily B.
(Journal of Cognitive Neuroscience)
Abstract: A listener's interpretation of a given speech sound can vary probabilistically from moment to moment. Previous experience (i.e., the contexts in which one has encountered an ambiguous sound) can further influence the interpretation of speech, a phenomenon known as perceptual learning for speech. This study used multivoxel pattern analysis to query how neural patterns reflect perceptual learning, leveraging archival fMRI data from a lexically guided perceptual learning study conducted by Myers and Mesite [Myers, E. B., & Mesite, L. M. Neural systems underlying perceptual adjustment to non-standard speech tokens. Journal of Memory and Language, 76, 80–93, 2014]. In that study, participants first heard ambiguous /s/–/∫/ blends in either /s/-biased lexical contexts (epi_ode) or /∫/-biased contexts (refre_ing); subsequently, they performed a phonetic categorization task on tokens from an /asi/–/a∫i/ continuum. In the current work, a classifier was trained to distinguish between phonetic categorization trials in which participants heard unambiguous productions of /s/ and those in which they heard unambiguous productions of /∫/. The classifier was able to generalize this training to ambiguous tokens from the middle of the continuum on the basis of individual participants' trial-by-trial perception. We take these findings as evidence that perceptual learning for speech involves neural recalibration, such that the pattern of activation approximates the perceived category. Exploratory analyses showed that left parietal regions (supramarginal and angular gyri) and right temporal regions (superior, middle, and transverse temporal gyri) were most informative for categorization. Overall, our results inform an understanding of how moment-to-moment variability in speech perception is encoded in the brain.
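As a rough illustration of the decoding approach this abstract describes (not the study's actual pipeline), the sketch below trains a linear classifier on voxel patterns from unambiguous /s/ and /∫/ trials, then asks whether its labels for ambiguous mid-continuum trials agree with the listener's own trial-by-trial responses. All arrays are random placeholders with illustrative shapes.

# Hypothetical sketch of train-on-unambiguous, test-on-ambiguous decoding.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_voxels = 500  # illustrative

# Stand-ins for one participant's trial-by-voxel activation patterns.
X_unambig = rng.normal(size=(80, n_voxels))   # clear /s/ and /∫/ trials
y_unambig = rng.integers(0, 2, size=80)       # 0 = /s/, 1 = /∫/
X_ambig = rng.normal(size=(40, n_voxels))     # mid-continuum tokens
resp_ambig = rng.integers(0, 2, size=40)      # listener's reported percepts

# Train on unambiguous trials only.
clf = make_pipeline(StandardScaler(), LinearSVC())
clf.fit(X_unambig, y_unambig)

# Generalization test: do the classifier's labels for ambiguous trials
# track what the listener reported hearing on those same trials?
pred = clf.predict(X_ambig)
agreement = np.mean(pred == resp_ambig)
print(f"Neural-behavioral agreement on ambiguous trials: {agreement:.2f}")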
Baese-Berk, Melissa M.; Chandrasekaran, Bharath; Roark, Casey L.
(The Journal of the Acoustical Society of America)
Most current theories and models of second language speech perception are grounded in the notion that learners acquire speech sound categories in their target language. In this paper, this classic idea in speech perception is revisited, given that clear evidence for formation of such categories is lacking in previous research. To understand the debate on the nature of speech sound representations in a second language, an operational definition of “category” is presented, and the issues of categorical perception and current theories of second language learning are reviewed. Following this, behavioral and neuroimaging evidence for and against acquisition of categorical representations is described. Finally, recommendations for future work are discussed. The paper concludes with a recommendation for integration of behavioral and neuroimaging work and theory in this area.
Saltzman, David I.; Myers, Emily B.
(Neurobiology of Language)
The extent to which articulatory information embedded in incoming speech contributes to the formation of new perceptual categories for speech sounds has been a matter of debate for decades. It has been theorized that the acquisition of new speech sound categories requires a network of sensory and speech motor cortical areas (the “dorsal stream”) to successfully integrate auditory and articulatory information. However, it is possible that these brain regions are not sensitive specifically to articulatory information, but instead are sensitive to the abstract phonological categories being learned. We tested this hypothesis by training participants over the course of several days on an articulable non-native speech contrast and on acoustically matched, inarticulable nonspeech analogues. After participants reached comparable levels of proficiency with the two sets of stimuli, activation was measured with fMRI as they passively listened to both sound types. Decoding of category membership for the articulable speech contrast alone revealed a series of left and right hemisphere regions outside of the dorsal stream that have previously been implicated in the emergence of non-native speech sound categories, while no regions could successfully decode the inarticulable nonspeech contrast. Activation patterns in the left inferior frontal gyrus (IFG), the middle temporal gyrus (MTG), and the supplementary motor area (SMA) provided better information for decoding the articulable (speech) sounds than the inarticulable (sine-wave) sounds. Nonetheless, the finding that dorsal stream regions did not emerge as good decoders of the articulable contrast suggests that other factors, including the strength and structure of the emerging speech categories, are more likely drivers of dorsal stream activation during novel sound learning.
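For orientation only, here is a hypothetical sketch of the region-by-region comparison this abstract implies: cross-validated decoding of category membership, run separately for the speech contrast and the sine-wave analogues within each region of interest. The ROI labels, array shapes, and data are placeholders, and this is not the study's analysis code.

# Hypothetical sketch of per-ROI decoding for two stimulus types.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
rois = ["left IFG", "left MTG", "SMA"]  # illustrative ROI labels

for roi in rois:
    # Stand-ins for trial-by-voxel patterns within one ROI.
    X_speech = rng.normal(size=(60, 200))
    y_speech = rng.integers(0, 2, size=60)
    X_sine = rng.normal(size=(60, 200))
    y_sine = rng.integers(0, 2, size=60)

    acc_speech = cross_val_score(LinearSVC(), X_speech, y_speech, cv=5).mean()
    acc_sine = cross_val_score(LinearSVC(), X_sine, y_sine, cv=5).mean()

    # Regions where speech decoding is reliably above chance (0.5) but
    # sine-wave decoding is not would parallel the pattern reported above.
    print(f"{roi}: speech={acc_speech:.2f}, sine={acc_sine:.2f}")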
Johns, Michael A.; Calloway, Regina C.; Phillips, Ian; Karuzis, Valerie P.; Dutta, Kelsey; Smith, Ed; Shamma, Shihab A.; Goupell, Matthew J.; Kuchinsky, Stefanie E.
(The Journal of the Acoustical Society of America)
Speech recognition in noisy environments can be challenging and requires listeners to accurately segregate a target speaker from irrelevant background noise. Stochastic figure-ground (SFG) tasks, in which temporally coherent inharmonic pure tones must be identified against a background, have been used to probe the non-linguistic auditory stream segregation processes important for speech-in-noise processing. However, little is known about the relationship between performance on SFG tasks and speech-in-noise tasks, or about the individual differences that may modulate such relationships. In this study, 37 younger normal-hearing adults performed an SFG task with target figure chords consisting of four, six, eight, or ten temporally coherent tones amongst a background of randomly varying tones. Stimuli were designed to be spectrally and temporally flat. An increased number of temporally coherent tones resulted in higher accuracy and faster reaction times (RTs). For ten target tones, faster RTs were associated with better scores on the Quick Speech-in-Noise task. Individual differences in working memory capacity and self-reported musicianship further modulated these relationships. Overall, the results demonstrate that the SFG task could serve as an assessment of auditory stream segregation accuracy and RT that is sensitive to individual differences in cognitive and auditory abilities, even among younger normal-hearing adults.
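As a rough illustration of how an SFG stimulus of this kind can be constructed (parameters here are illustrative, not the study's exact values), the sketch below strings together short chords of randomly drawn pure tones and embeds a "figure" whose tone frequencies repeat, i.e., remain temporally coherent, across several consecutive chords.

# Hypothetical sketch of stochastic figure-ground (SFG) stimulus synthesis.
import numpy as np

fs = 44100                                # sample rate (Hz)
chord_dur = 0.05                          # 50 ms chords (illustrative)
n_chords = 20
freq_pool = np.geomspace(200, 7000, 60)   # candidate tone frequencies
rng = np.random.default_rng(2)

def chord(freqs):
    # One chord: the average of pure tones at the given frequencies.
    t = np.arange(int(fs * chord_dur)) / fs
    return sum(np.sin(2 * np.pi * f * t) for f in freqs) / len(freqs)

def sfg_stimulus(n_figure=8, n_background=10, figure_onset=8, figure_len=8):
    # The figure's frequencies are drawn once and repeated across chords,
    # making them temporally coherent against the re-drawn background.
    figure_freqs = rng.choice(freq_pool, n_figure, replace=False)
    chords = []
    for i in range(n_chords):
        freqs = list(rng.choice(freq_pool, n_background, replace=False))
        if figure_onset <= i < figure_onset + figure_len:
            freqs += list(figure_freqs)
        chords.append(chord(freqs))
    return np.concatenate(chords)

stim = sfg_stimulus(n_figure=8)  # e.g., an eight-tone figure condition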
Myers, Emily, Phillips, Matthew, and Skoe, Erika. "Individual differences in the perception of phonetic category structure predict speech-in-noise performance." The Journal of the Acoustical Society of America 156(3). AIP Publishing. https://doi.org/10.1121/10.0028583. https://par.nsf.gov/biblio/10618493.
@article{osti_10618493,
title = {Individual differences in the perception of phonetic category structure predict speech-in-noise performance},
url = {https://par.nsf.gov/biblio/10618493},
DOI = {10.1121/10.0028583},
journal = {The Journal of the Acoustical Society of America},
volume = {156},
number = {3},
publisher = {AIP Publishing},
author = {Myers, Emily and Phillips, Matthew and Skoe, Erika},
}