Title: Neural Representation of Articulable and Inarticulable Novel Sound Contrasts: The Role of the Dorsal Stream
The extent to which articulatory information embedded in incoming speech contributes to the formation of new perceptual categories for speech sounds has been a matter of debate for decades. It has been theorized that the acquisition of new speech sound categories requires a network of sensory and speech motor cortical areas (the “dorsal stream”) to successfully integrate auditory and articulatory information. However, it is possible that these brain regions are not sensitive specifically to articulatory information, but instead are sensitive to the abstract phonological categories being learned. We tested this hypothesis by training participants over the course of several days on an articulable non-native speech contrast and acoustically matched inarticulable nonspeech analogues. After participants reached comparable levels of proficiency with the two sets of stimuli, activation was measured with fMRI as they passively listened to both sound types. Decoding of category membership for the articulable speech contrast alone revealed a series of left and right hemisphere regions outside of the dorsal stream that have previously been implicated in the emergence of non-native speech sound categories, while no regions could successfully decode the inarticulable nonspeech contrast. Although activation patterns in the left inferior frontal gyrus (IFG), the middle temporal gyrus (MTG), and the supplementary motor area (SMA) provided better information for decoding the articulable (speech) sounds than the inarticulable (sine wave) sounds, the finding that dorsal stream regions do not emerge as good decoders of the articulable contrast alone suggests that other factors, including the strength and structure of the emerging speech categories, are more likely drivers of dorsal stream activation for novel sound learning.
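The abstract does not specify how category membership was decoded, so the following is only a minimal sketch of a common setup for this kind of multivoxel pattern analysis: a linear SVM with leave-one-run-out cross-validation, where `X`, `y`, and `runs` are placeholders standing in for real single-trial voxel patterns, category labels, and scanner-run indices.

```python
# Minimal decoding sketch (assumed pipeline, not the paper's actual one).
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels, n_runs = 48, 200, 6
X = rng.standard_normal((n_trials, n_voxels))   # stand-in voxel patterns
y = np.tile([0, 1], n_trials // 2)              # two sound categories
runs = np.repeat(np.arange(n_runs), n_trials // n_runs)

# Leaving out whole runs prevents information leaking across scanner runs.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, y, groups=runs, cv=LeaveOneGroupOut())
print(f"mean decoding accuracy: {scores.mean():.2f}")  # chance = 0.50
```

Above-chance cross-validated accuracy in a region is what licenses the claim that its activation patterns carry category information.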
Award ID(s):
1735225
NSF-PAR ID:
10177581
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Neurobiology of Language
ISSN:
2641-4368
Page Range / eLocation ID:
1 to 26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Humans are born as “universal listeners” without a bias toward any particular language. However, over the first year of life, infants’ perception is shaped by learning native speech categories. Acoustically different sounds—such as the same word produced by different speakers—come to be treated as functionally equivalent. In natural environments, these categories often emerge incidentally without overt categorization or explicit feedback. However, the neural substrates of category learning have been investigated almost exclusively using overt categorization tasks with explicit feedback about categorization decisions. Here, we examined whether the striatum, previously implicated in category learning, contributes to incidental acquisition of sound categories. In the fMRI scanner, participants played a videogame in which sound category exemplars aligned with game actions and events, allowing sound categories to incidentally support successful game play. An experimental group heard nonspeech sound exemplars drawn from coherent category spaces, whereas a control group heard acoustically similar sounds drawn from a less structured space. Although the groups exhibited similar in-game performance, generalization of sound category learning and activation of the posterior striatum were significantly greater in the experimental than control group. Moreover, the experimental group showed brain–behavior relationships related to the generalization of all categories, while in the control group these relationships were restricted to the categories with structured sound distributions. Together, these results demonstrate that the striatum, through its interactions with the left superior temporal sulcus, contributes to incidental acquisition of sound category representations emerging from naturalistic learning environments.
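The key stimulus manipulation here, coherent category distributions for the experimental group versus a less structured acoustic space for the control group, can be illustrated with a toy sketch; the acoustic dimensions, ranges, and sample counts below are invented for illustration and are not taken from the study.

```python
# Hypothetical illustration of structured vs. unstructured sound exemplars.
import numpy as np

rng = np.random.default_rng(1)
centers = np.array([[400.0, 1200.0], [800.0, 2400.0]])  # made-up acoustic dims

# Experimental group: exemplars cluster tightly around each category center.
coherent = np.concatenate(
    [c + rng.normal(scale=50.0, size=(20, 2)) for c in centers]
)

# Control group: same overall acoustic range, but no cluster structure.
lo, hi = coherent.min(axis=0), coherent.max(axis=0)
unstructured = rng.uniform(lo, hi, size=(40, 2))
```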

     
  2. Abstract: Lay summary

    Individuals with ASD and schizophrenia are more likely to perceive asynchronous auditory and visual events as occurring simultaneously even if they are well separated in time. We investigated whether similar difficulties in audiovisual temporal processing were present in subclinical populations with high autistic and schizotypal traits. We found that the ability to detect audiovisual asynchrony was not affected by different levels of autistic and schizotypal traits. We also found that connectivity of some brain regions engaging in multisensory and timing tasks might explain an individual's tendency to bind multisensory information within a wide or narrow time window. Autism Res 2021, 14: 668–680. © 2020 International Society for Autism Research and Wiley Periodicals LLC

     
  3. Abstract

    A longstanding debate has surrounded the role of the motor system in speech perception, but progress in this area has been limited by tasks that only examine isolated syllables and conflate decision-making with perception. Using an adaptive task that temporally isolates perception from decision-making, we examined an EEG signature of motor activity (sensorimotor μ/beta suppression) during the perception of auditory phonemes, auditory words, audiovisual words, and environmental sounds while holding difficulty constant at two levels (Easy/Hard). Results revealed left-lateralized sensorimotor μ/beta suppression that was related to perception of speech but not environmental sounds. Audiovisual word and phoneme stimuli showed enhanced left sensorimotor μ/beta suppression for correct relative to incorrect trials, while auditory word stimuli showed enhanced suppression for incorrect trials. Our results demonstrate that motor involvement in perception is left-lateralized, is specific to speech stimuli, and is not simply the result of domain-general processes. These results provide evidence for an interactive network for speech perception in which dorsal stream motor areas are dynamically engaged during the perception of speech depending on the characteristics of the speech signal. Crucially, this motor engagement has different effects on the perceptual outcome depending on the lexicality and modality of the speech stimulus.
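As a rough illustration of the dependent measure, sensorimotor μ/beta suppression is typically quantified as a drop in band-limited power relative to a pre-stimulus baseline. The sketch below uses made-up single-channel data, illustrative band edges, and Welch power estimates; it is not the paper's actual analysis pipeline.

```python
# Illustrative mu/beta suppression measure on synthetic EEG (assumed method).
import numpy as np
from scipy.signal import welch

def band_power(x, fs, lo, hi):
    """Mean spectral power within [lo, hi] Hz via Welch's method."""
    freqs, psd = welch(x, fs=fs, nperseg=min(len(x), 512))
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

fs = 500
rng = np.random.default_rng(2)
baseline = rng.standard_normal(fs)        # placeholder 1 s pre-stimulus window
epoch = 0.7 * rng.standard_normal(fs)     # placeholder perception window

# Negative values indicate suppression (event-related desynchronization).
for name, (lo, hi) in {"mu": (8, 13), "beta": (15, 25)}.items():
    erd = 10 * np.log10(band_power(epoch, fs, lo, hi) /
                        band_power(baseline, fs, lo, hi))
    print(f"{name} suppression: {erd:.1f} dB")
```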

     
  4. Bizley, Jennifer K. (Ed.)

    Hearing one’s own voice is critical for fluent speech production, as it allows for the detection and correction of vocalization errors in real time. This behavior, known as the auditory feedback control of speech, is impaired in various neurological disorders ranging from stuttering to aphasia; however, the underlying neural mechanisms are still poorly understood. Computational models of speech motor control suggest that, during speech production, the brain uses an efference copy of the motor command to generate an internal estimate of the speech output. When actual feedback differs from this internal estimate, an error signal is generated to correct the internal estimate and update the motor commands necessary to produce the intended speech. We localized the auditory error signal using electrocorticographic recordings from neurosurgical participants during a delayed auditory feedback (DAF) paradigm. In this task, participants heard their own voice with a time delay as they produced words and sentences (similar to an echo on a conference call), a manipulation well known to disrupt fluency by causing slow and stutter-like speech in humans. We observed a significant response enhancement in auditory cortex that scaled with the duration of the feedback delay, indicating an auditory speech error signal. Immediately following auditory cortex, the dorsal precentral gyrus (dPreCG), a region that had not previously been implicated in auditory feedback processing, exhibited a markedly similar response enhancement, suggesting a tight coupling between the two regions. Critically, response enhancement in dPreCG occurred only during articulation of long utterances, due to a continuous mismatch between produced speech and reafferent feedback. These results suggest that dPreCG plays an essential role in processing auditory error signals during speech production to maintain fluency.
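The DAF manipulation itself is simple to express in code: the produced signal is played back shifted by a fixed delay. Below is a toy, offline version with a synthetic signal; the function name and parameters are ours, and a real DAF rig would stream audio in real time at delays on the order of those used in such studies.

```python
# Toy offline illustration of delayed auditory feedback (assumed setup).
import numpy as np

def delayed_feedback(signal, fs, delay_ms):
    """Return the signal shifted by delay_ms, zero-padded at the start."""
    shift = int(fs * delay_ms / 1000)
    return np.concatenate([np.zeros(shift), signal])[: len(signal)]

fs = 16000
t = np.arange(fs) / fs
voice = np.sin(2 * np.pi * 150 * t)        # stand-in for a produced utterance
echo = delayed_feedback(voice, fs, delay_ms=200)
```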

     
  5. Adults struggle to learn non-native speech categories in many experimental settings (Goto, 1971), but learn efficiently in a video game paradigm where non-native speech sounds have functional significance (Lim and Holt, 2011). Behavioral and neural evidence from this and other paradigms point toward the involvement of reinforcement learning mechanisms in speech category learning. We formalize this hypothesis computationally and present two simulations. The first simulates the findings of Lim et al. (2019), providing proof in principle that a reinforcement learning algorithm can successfully capture human results in a video game where people are learning novel categories of noise tokens. Our second simulation extends this to speech sounds and demonstrates that our algorithm mimics second language learners’ improvement on discrimination of a non-native speech contrast. Together these two simulations show that reinforcement learning provides an accurate model of human learning in this paradigm and provide evidence supporting the hypothesis that this mechanism could play a key role in effective speech category learning in adults. Being able to identify the algorithms employed in this paradigm could provide many avenues for pedagogical changes in second language learning and let teachers harness the processes that allow for efficient learning and improvement of non-native perceptual ability. 
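The abstract does not give the model's details, but the general shape of a reinforcement-learning account of category learning can be sketched as follows: the agent hears an exemplar from one of two categories, selects a game action, and updates action values from reward alone. All parameters, the binning of the acoustic cue, and the delta-rule update below are illustrative assumptions, not the authors' algorithm.

```python
# Minimal sketch of reward-driven sound category learning (assumed model).
import numpy as np

rng = np.random.default_rng(3)
n_bins, alpha, eps = 10, 0.1, 0.1
Q = np.zeros((n_bins, 2))                  # action values per acoustic bin

def sample_trial():
    cat = rng.integers(2)                  # true category of this exemplar
    cue = np.clip(rng.normal(0.3 + 0.4 * cat, 0.15), 0, 0.999)
    return cat, int(cue * n_bins)          # discretize the noisy acoustic cue

acc = []
for trial in range(2000):
    cat, s = sample_trial()
    a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
    r = 1.0 if a == cat else 0.0           # reward for category-consistent action
    Q[s, a] += alpha * (r - Q[s, a])       # delta-rule update from reward alone
    acc.append(r)

print(f"accuracy, last 200 trials: {np.mean(acc[-200:]):.2f}")
```

The point of the sketch is that no explicit category labels or feedback about categorization decisions are needed; reward from category-consistent actions suffices to carve the cue space into functional categories.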