Humans are born as “universal listeners” without a bias toward any particular language. However, over the first year of life, infants’ perception is shaped by learning native speech categories. Acoustically different sounds—such as the same word produced by different speakers—come to be treated as functionally equivalent. In natural environments, these categories often emerge incidentally without overt categorization or explicit feedback. However, the neural substrates of category learning have been investigated almost exclusively using overt categorization tasks with explicit feedback about categorization decisions. Here, we examined whether the striatum, previously implicated in category learning, contributes to incidental acquisition of sound categories. In the fMRI scanner, participants played a videogame in which sound category exemplars aligned with game actions and events, allowing sound categories to incidentally support successful game play. An experimental group heard nonspeech sound exemplars drawn from coherent category spaces, whereas a control group heard acoustically similar sounds drawn from a less structured space. Although the groups exhibited similar in-game performance, generalization of sound category learning and activation of the posterior striatum were significantly greater in the experimental than control group. Moreover, the experimental group showed brain–behavior relationships related to the generalization of all categories, while in the control group these relationships were restricted to the categories with structured sound distributions. Together, these results demonstrate that the striatum, through its interactions with the left superior temporal sulcus, contributes to incidental acquisition of sound category representations emerging from naturalistic learning environments.
- Award ID(s):
- 1735225
- NSF-PAR ID:
- 10177581
- Date Published:
- Journal Name:
- Neurobiology of Language
- ISSN:
- 2641-4368
- Page Range / eLocation ID:
- 1 to 26
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Temporal proximity is an important clue for multisensory integration. Previous evidence indicates that individuals with autism and schizophrenia are more likely to integrate multisensory inputs over a longer temporal binding window (TBW). However, whether such deficits in audiovisual temporal integration extend to subclinical populations with high schizotypal and autistic traits are unclear. Using audiovisual simultaneity judgment (SJ) tasks for nonspeech and speech stimuli, our results suggested that the width of the audiovisual TBW was not significantly correlated with self‐reported schizotypal and autistic traits in a group of young adults. Functional magnetic resonance imaging (fMRI) resting‐state activity was also acquired to explore the neural correlates underlying inter‐individual variability of TBW width. Across the entire sample, stronger resting‐state functional connectivity (rsFC) between the left superior temporal cortex and the left precuneus, and weaker rsFC between the left cerebellum and the right dorsal lateral prefrontal cortex were correlated with a narrower TBW for speech stimuli. Meanwhile, stronger rsFC between the left anterior superior temporal gyrus and the right inferior temporal gyrus was correlated with a wider audiovisual TBW for non‐speech stimuli. The TBW‐related rsFC was not affected by levels of subclinical traits. In conclusion, this study indicates that audiovisual temporal processing may not be affected by autistic and schizotypal traits and rsFC between brain regions responding to multisensory information and timing may account for the inter‐individual difference in TBW width.
Lay summary Individuals with ASD and schizophrenia are more likely to perceive asynchronous auditory and visual events as occurring simultaneously even if they are well separated in time. We investigated whether similar difficulties in audiovisual temporal processing were present in subclinical populations with high autistic and schizotypal traits. We found that the ability to detect audiovisual asynchrony was not affected by different levels of autistic and schizotypal traits. We also found that connectivity of some brain regions engaging in multisensory and timing tasks might explain an individual's tendency to bind multisensory information within a wide or narrow time window.
. © 2020 International Society for Autism Research and Wiley Periodicals LLCAutism Res 2021, 14: 668–680 -
Abstract A longstanding debate has surrounded the role of the motor system in speech perception, but progress in this area has been limited by tasks that only examine isolated syllables and conflate decision-making with perception. Using an adaptive task that temporally isolates perception from decision-making, we examined an EEG signature of motor activity (sensorimotor μ/beta suppression) during the perception of auditory phonemes, auditory words, audiovisual words, and environmental sounds while holding difficulty constant at two levels (Easy/Hard). Results revealed left-lateralized sensorimotor μ/beta suppression that was related to perception of speech but not environmental sounds. Audiovisual word and phoneme stimuli showed enhanced left sensorimotor μ/beta suppression for correct relative to incorrect trials, while auditory word stimuli showed enhanced suppression for incorrect trials. Our results demonstrate that motor involvement in perception is left-lateralized, is specific to speech stimuli, and it not simply the result of domain-general processes. These results provide evidence for an interactive network for speech perception in which dorsal stream motor areas are dynamically engaged during the perception of speech depending on the characteristics of the speech signal. Crucially, this motor engagement has different effects on the perceptual outcome depending on the lexicality and modality of the speech stimulus.
-
Bizley, Jennifer K. (Ed.)
Hearing one’s own voice is critical for fluent speech production as it allows for the detection and correction of vocalization errors in real time. This behavior known as the auditory feedback control of speech is impaired in various neurological disorders ranging from stuttering to aphasia; however, the underlying neural mechanisms are still poorly understood. Computational models of speech motor control suggest that, during speech production, the brain uses an efference copy of the motor command to generate an internal estimate of the speech output. When actual feedback differs from this internal estimate, an error signal is generated to correct the internal estimate and update necessary motor commands to produce intended speech. We were able to localize the auditory error signal using electrocorticographic recordings from neurosurgical participants during a delayed auditory feedback (DAF) paradigm. In this task, participants hear their voice with a time delay as they produced words and sentences (similar to an echo on a conference call), which is well known to disrupt fluency by causing slow and stutter-like speech in humans. We observed a significant response enhancement in auditory cortex that scaled with the duration of feedback delay, indicating an auditory speech error signal. Immediately following auditory cortex, dorsal precentral gyrus (dPreCG), a region that has not been implicated in auditory feedback processing before, exhibited a markedly similar response enhancement, suggesting a tight coupling between the 2 regions. Critically, response enhancement in dPreCG occurred only during articulation of long utterances due to a continuous mismatch between produced speech and reafferent feedback. These results suggest that dPreCG plays an essential role in processing auditory error signals during speech production to maintain fluency.
-
Adults struggle to learn non-native speech categories in many experimental settings (Goto, 1971), but learn efficiently in a video game paradigm where non-native speech sounds have functional significance (Lim and Holt, 2011). Behavioral and neural evidence from this and other paradigms point toward the involvement of reinforcement learning mechanisms in speech category learning. We formalize this hypothesis computationally and present two simulations. The first simulates the findings of Lim et al. (2019), providing proof in principle that a reinforcement learning algorithm can successfully capture human results in a video game where people are learning novel categories of noise tokens. Our second simulation extends this to speech sounds and demonstrates that our algorithm mimics second language learners’ improvement on discrimination of a non-native speech contrast. Together these two simulations show that reinforcement learning provides an accurate model of human learning in this paradigm and provide evidence supporting the hypothesis that this mechanism could play a key role in effective speech category learning in adults. Being able to identify the algorithms employed in this paradigm could provide many avenues for pedagogical changes in second language learning and let teachers harness the processes that allow for efficient learning and improvement of non-native perceptual ability.more » « less