Humans are born as “universal listeners” without a bias toward any particular language. However, over the first year of life, infants’ perception is shaped by learning native speech categories. Acoustically different sounds—such as the same word produced by different speakers—come to be treated as functionally equivalent. In natural environments, these categories often emerge incidentally without overt categorization or explicit feedback. However, the neural substrates of category learning have been investigated almost exclusively using overt categorization tasks with explicit feedback about categorization decisions. Here, we examined whether the striatum, previously implicated in category learning, contributes to incidental acquisition of sound categories. In the fMRI scanner, participants played a videogame in which sound category exemplars aligned with game actions and events, allowing sound categories to incidentally support successful game play. An experimental group heard nonspeech sound exemplars drawn from coherent category spaces, whereas a control group heard acoustically similar sounds drawn from a less structured space. Although the groups exhibited similar in-game performance, generalization of sound category learning and activation of the posterior striatum were significantly greater in the experimental than control group. Moreover, the experimental group showed brain–behavior relationships related to the generalization of all categories, while in the control group these relationships were restricted to the categories with structured sound distributions. Together, these results demonstrate that the striatum, through its interactions with the left superior temporal sulcus, contributes to incidental acquisition of sound category representations emerging from naturalistic learning environments.
- Award ID(s):
- NSF-PAR ID:
- Date Published:
- Journal Name:
- Neurobiology of Language
- Page Range / eLocation ID:
- 1 to 26
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Temporal proximity is an important clue for multisensory integration. Previous evidence indicates that individuals with autism and schizophrenia are more likely to integrate multisensory inputs over a longer temporal binding window (TBW). However, whether such deficits in audiovisual temporal integration extend to subclinical populations with high schizotypal and autistic traits are unclear. Using audiovisual simultaneity judgment (SJ) tasks for nonspeech and speech stimuli, our results suggested that the width of the audiovisual TBW was not significantly correlated with self‐reported schizotypal and autistic traits in a group of young adults. Functional magnetic resonance imaging (fMRI) resting‐state activity was also acquired to explore the neural correlates underlying inter‐individual variability of TBW width. Across the entire sample, stronger resting‐state functional connectivity (rsFC) between the left superior temporal cortex and the left precuneus, and weaker rsFC between the left cerebellum and the right dorsal lateral prefrontal cortex were correlated with a narrower TBW for speech stimuli. Meanwhile, stronger rsFC between the left anterior superior temporal gyrus and the right inferior temporal gyrus was correlated with a wider audiovisual TBW for non‐speech stimuli. The TBW‐related rsFC was not affected by levels of subclinical traits. In conclusion, this study indicates that audiovisual temporal processing may not be affected by autistic and schizotypal traits and rsFC between brain regions responding to multisensory information and timing may account for the inter‐individual difference in TBW width.
Individuals with ASD and schizophrenia are more likely to perceive asynchronous auditory and visual events as occurring simultaneously even if they are well separated in time. We investigated whether similar difficulties in audiovisual temporal processing were present in subclinical populations with high autistic and schizotypal traits. We found that the ability to detect audiovisual asynchrony was not affected by different levels of autistic and schizotypal traits. We also found that connectivity of some brain regions engaging in multisensory and timing tasks might explain an individual's tendency to bind multisensory information within a wide or narrow time window.
. © 2020 International Society for Autism Research and Wiley Periodicals LLC Autism Res2021, 14: 668–680
A longstanding debate has surrounded the role of the motor system in speech perception, but progress in this area has been limited by tasks that only examine isolated syllables and conflate decision-making with perception. Using an adaptive task that temporally isolates perception from decision-making, we examined an EEG signature of motor activity (sensorimotor μ/beta suppression) during the perception of auditory phonemes, auditory words, audiovisual words, and environmental sounds while holding difficulty constant at two levels (Easy/Hard). Results revealed left-lateralized sensorimotor μ/beta suppression that was related to perception of speech but not environmental sounds. Audiovisual word and phoneme stimuli showed enhanced left sensorimotor μ/beta suppression for correct relative to incorrect trials, while auditory word stimuli showed enhanced suppression for incorrect trials. Our results demonstrate that motor involvement in perception is left-lateralized, is specific to speech stimuli, and it not simply the result of domain-general processes. These results provide evidence for an interactive network for speech perception in which dorsal stream motor areas are dynamically engaged during the perception of speech depending on the characteristics of the speech signal. Crucially, this motor engagement has different effects on the perceptual outcome depending on the lexicality and modality of the speech stimulus.
Adults struggle to learn non-native speech categories in many experimental settings (Goto, 1971), but learn efficiently in a video game paradigm where non-native speech sounds have functional significance (Lim and Holt, 2011). Behavioral and neural evidence from this and other paradigms point toward the involvement of reinforcement learning mechanisms in speech category learning. We formalize this hypothesis computationally and present two simulations. The first simulates the findings of Lim et al. (2019), providing proof in principle that a reinforcement learning algorithm can successfully capture human results in a video game where people are learning novel categories of noise tokens. Our second simulation extends this to speech sounds and demonstrates that our algorithm mimics second language learners’ improvement on discrimination of a non-native speech contrast. Together these two simulations show that reinforcement learning provides an accurate model of human learning in this paradigm and provide evidence supporting the hypothesis that this mechanism could play a key role in effective speech category learning in adults. Being able to identify the algorithms employed in this paradigm could provide many avenues for pedagogical changes in second language learning and let teachers harness the processes that allow for efficient learning and improvement of non-native perceptual ability.more » « less
null (Ed.)Abstract Neurobiological models of speech perception posit that both left and right posterior temporal brain regions are involved in the early auditory analysis of speech sounds. However, frank deficits in speech perception are not readily observed in individuals with right hemisphere damage. Instead, damage to the right hemisphere is often associated with impairments in vocal identity processing. Herein lies an apparent paradox: The mapping between acoustics and speech sound categories can vary substantially across talkers, so why might right hemisphere damage selectively impair vocal identity processing without obvious effects on speech perception? In this review, I attempt to clarify the role of the right hemisphere in speech perception through a careful consideration of its role in processing vocal identity. I review evidence showing that right posterior superior temporal, right anterior superior temporal, and right inferior / middle frontal regions all play distinct roles in vocal identity processing. In considering the implications of these findings for neurobiological accounts of speech perception, I argue that the recruitment of right posterior superior temporal cortex during speech perception may specifically reflect the process of conditioning phonetic identity on talker information. I suggest that the relative lack of involvement of other right hemisphere regions in speech perception may be because speech perception does not necessarily place a high burden on talker processing systems, and I argue that the extant literature hints at potential subclinical impairments in the speech perception abilities of individuals with right hemisphere damage.more » « less