

Title: Differences in temporal processing speeds between the right and left auditory cortex reflect the strength of recurrent synaptic connectivity
Brain asymmetry in the sensitivity to spectrotemporal modulation is an established functional feature that underlies the perception of speech and music. The left auditory cortex (ACx) is believed to specialize in processing fast temporal components of speech sounds, and the right ACx in slower components. However, the circuit features and neural computations behind these lateralized spectrotemporal processes are poorly understood. To answer these mechanistic questions, we used mice, an animal model that captures some relevant features of human communication systems. In this study, we screened for circuit features that could subserve temporal integration differences between the left and right ACx. We mapped excitatory input to principal neurons in all cortical layers and found significantly stronger recurrent connections in the superficial layers of the right ACx compared to the left. We hypothesized that the underlying recurrent neural dynamics would exhibit differential characteristic timescales corresponding to their hemispheric specialization. To investigate, we recorded spike trains from awake mice and estimated the network time constants using a statistical method that combines evidence from multiple neurons with weak signal-to-noise ratios. We found longer temporal integration windows in the superficial layers of the right ACx compared to the left, as predicted by stronger recurrent excitation. Our study provides substantial evidence linking stronger recurrent synaptic connections to longer network timescales. These findings support speech-processing theories positing that asymmetry in temporal integration is a crucial feature of lateralization in auditory processing.
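
The abstract summarizes the timescale estimation only at a high level. As a rough, generic illustration of the idea, and explicitly not the authors' statistical method for pooling evidence across low signal-to-noise neurons, a network time constant can be read off by fitting an exponential decay to the autocorrelation of binned spike counts. The sketch below (Python, with hypothetical synthetic data) shows that approach.

    # Generic sketch: estimate a timescale tau by fitting a*exp(-lag/tau) + b to
    # the spike-count autocorrelation. Illustrative only; not the paper's method.
    import numpy as np
    from scipy.optimize import curve_fit

    def spike_count_autocorr(counts, max_lag):
        """Autocorrelation of binned spike counts at lags 1..max_lag (in bins)."""
        x = counts - counts.mean()
        denom = np.dot(x, x)
        return np.array([np.dot(x[:-lag], x[lag:]) / denom
                         for lag in range(1, max_lag + 1)])

    def fit_timescale(counts, bin_ms=50.0, max_lag=10):
        """Fit an exponential decay to the autocorrelation; return tau in ms."""
        lags_ms = np.arange(1, max_lag + 1) * bin_ms
        ac = spike_count_autocorr(counts, max_lag)
        decay = lambda t, a, tau, b: a * np.exp(-t / tau) + b
        (a, tau, b), _ = curve_fit(decay, lags_ms, ac, p0=[0.5, 200.0, 0.0], maxfev=20000)
        return tau

    # Hypothetical data: Poisson counts driven by an AR(1) latent rate
    # (latent timescale about 474 ms for 50 ms bins and coefficient 0.9).
    rng = np.random.default_rng(0)
    latent = np.zeros(4000)
    for t in range(1, latent.size):
        latent[t] = 0.9 * latent[t - 1] + rng.normal()
    counts = rng.poisson(3.0 * np.exp(0.1 * latent)).astype(float)
    print(f"estimated tau ~ {fit_timescale(counts):.0f} ms")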
Award ID(s):
1652774
NSF-PAR ID:
10453120
Author(s) / Creator(s):
; ; ; ; ;
Editor(s):
Bizley, Jennifer K.
Date Published:
Journal Name:
PLOS Biology
Volume:
20
Issue:
10
ISSN:
1545-7885
Page Range / eLocation ID:
e3001803
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract (Lay summary)

    Individuals with ASD and schizophrenia are more likely to perceive asynchronous auditory and visual events as occurring simultaneously even if they are well separated in time. We investigated whether similar difficulties in audiovisual temporal processing were present in subclinical populations with high autistic and schizotypal traits. We found that the ability to detect audiovisual asynchrony was not affected by different levels of autistic and schizotypal traits. We also found that connectivity of some brain regions engaging in multisensory and timing tasks might explain an individual's tendency to bind multisensory information within a wide or narrow time window. Autism Res 2021, 14: 668–680. © 2020 International Society for Autism Research and Wiley Periodicals LLC.

     
  2. Abstract

    Modulation of vocal pitch is a key speech feature that conveys important linguistic and affective information. Auditory feedback is used to monitor and maintain pitch. We examined induced neural high gamma power (HGP; 65–150 Hz) using magnetoencephalography during pitch feedback control. Participants phonated into a microphone while hearing their auditory feedback through headphones. During each phonation, a single real‐time 400 ms pitch shift was applied to the auditory feedback. Participants compensated by rapidly changing their pitch to oppose the pitch shifts. This behavioral change required coordination of the neural speech motor control network, including integration of auditory and somatosensory feedback to initiate change in motor plans. We found increases in HGP across both hemispheres within 200 ms of pitch shifts, covering left sensory and right premotor, parietal, temporal, and frontal regions involved in sensory detection and processing of the pitch shift. Later responses to pitch shifts (200–300 ms) were right dominant, in parietal, frontal, and temporal regions. The timing of activity in these regions indicates their role in coordinating the motor change and in detecting and processing the sensory consequences of this change. Subtracting out cortical responses during passive listening to recordings of the phonations isolated HGP increases specific to speech production, highlighting the involvement of right parietal and premotor cortex and left posterior temporal cortex in the motor response. Correlation of HGP with behavioral compensation demonstrated right frontal region involvement in modulating participants' compensatory responses. This study highlights the bihemispheric sensorimotor cortical network involvement in auditory feedback‐based control of vocal pitch. Hum Brain Mapp 37:1474–1485, 2016. © 2016 Wiley Periodicals, Inc.

     
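    A minimal, hypothetical sketch of how a high-gamma power envelope can be computed from a single sensor or source time series (band-pass filter plus Hilbert envelope); this is a generic illustration, not the authors' MEG source-analysis pipeline.

        # Generic high-gamma (65-150 Hz) power envelope: band-pass filter, then
        # squared analytic amplitude. Illustrative only.
        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        def high_gamma_power(signal, fs, band=(65.0, 150.0)):
            """Return the squared Hilbert envelope of the band-passed signal."""
            b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
            return np.abs(hilbert(filtfilt(b, a, signal))) ** 2

        # Hypothetical usage on a 1-s trial sampled at 1 kHz:
        fs = 1000.0
        trial = np.random.default_rng(1).normal(size=int(fs))
        print(high_gamma_power(trial, fs).mean())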
  3. Abstract

    A fundamental question in neurolinguistics concerns the brain regions involved in syntactic and semantic processing during speech comprehension, both at the lexical (word processing) and supra-lexical levels (sentence and discourse processing). To what extent are these regions separated or intertwined? To address this question, we introduce a novel approach exploiting neural language models to generate high-dimensional feature sets that separately encode semantic and syntactic information. More precisely, we train a lexical language model, GloVe, and a supra-lexical language model, GPT-2, on a text corpus from which we selectively removed either syntactic or semantic information. We then assess to what extent the features derived from these information-restricted models are still able to predict the fMRI time courses of humans listening to naturalistic text. Furthermore, to determine the windows of integration of brain regions involved in supra-lexical processing, we manipulate the size of contextual information provided to GPT-2. The analyses show that, while most brain regions involved in language comprehension are sensitive to both syntactic and semantic features, the relative magnitudes of these effects vary across these regions. Moreover, regions that are best fitted by semantic or syntactic features are more spatially dissociated in the left hemisphere than in the right one, and the right hemisphere shows sensitivity to longer contexts than the left. The novelty of our approach lies in the ability to control for the information encoded in the models’ embeddings by manipulating the training set. These “information-restricted” models complement previous studies that used language models to probe the neural bases of language, and shed new light on its spatial organization.

     
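    A minimal sketch of the kind of encoding analysis described in item 3, assuming feature matrices have already been extracted from the (information-restricted) language models and aligned to the fMRI acquisition times; the ridge-regression setup and all names below are illustrative assumptions, not the authors' pipeline.

        # Generic fMRI encoding model: ridge regression from model features to
        # voxel time courses, scored by per-voxel correlation on held-out TRs.
        import numpy as np
        from sklearn.linear_model import RidgeCV
        from sklearn.model_selection import train_test_split

        def fit_encoding_model(features, bold):
            """features: (n_TRs, n_dims); bold: (n_TRs, n_voxels) -> per-voxel r."""
            X_tr, X_te, Y_tr, Y_te = train_test_split(features, bold,
                                                      test_size=0.2, shuffle=False)
            model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X_tr, Y_tr)
            pred = model.predict(X_te)
            num = ((pred - pred.mean(0)) * (Y_te - Y_te.mean(0))).sum(0)
            den = pred.std(0) * Y_te.std(0) * len(Y_te)
            return num / den

        # Hypothetical shapes: 300 TRs, 768-dim GPT-2 features, 1000 voxels.
        rng = np.random.default_rng(0)
        feats, bold = rng.normal(size=(300, 768)), rng.normal(size=(300, 1000))
        print(fit_encoding_model(feats, bold).shape)  # (1000,)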
  4. Speech activity detection (SAD) is a key pre-processing step for a speech-based system. The performance of conventional audio-only SAD (A-SAD) systems is impaired by acoustic noise when they are used in practical applications. An alternative approach to address this problem is to include visual information, creating audiovisual speech activity detection (AV-SAD) solutions. In our previous work, we proposed to build an AV-SAD system using a bimodal recurrent neural network (BRNN). This framework was able to capture the task-related characteristics in the audio and visual inputs, and model the temporal information within and across modalities. The approach relied on long short-term memory (LSTM). Although LSTMs can model longer temporal dependencies through their memory cells, the effective memory of the units is limited to a few frames, since the recurrent connection only considers the previous frame. For SAD systems, it is important to model longer temporal dependencies to capture the semi-periodic nature of speech conveyed in acoustic and orofacial features. This study proposes to implement a BRNN-based AV-SAD system with advanced LSTMs (A-LSTMs), which overcome this limitation by including multiple connections to frames in the past. The results show that the proposed framework can significantly outperform the BRNN system trained with the original LSTM layers.
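    A toy sketch of the general idea behind letting a recurrent cell see several past frames rather than only the previous one; this is a hypothetical PyTorch illustration, not the A-LSTM formulation used in the paper.

        # Toy recurrent cell whose update sees the mean of the last k hidden states
        # instead of only the immediately preceding one. Illustrative only.
        import torch
        import torch.nn as nn

        class MultiLagLSTM(nn.Module):
            def __init__(self, input_size, hidden_size, k_lags=3):
                super().__init__()
                self.k = k_lags
                self.hidden_size = hidden_size
                self.cell = nn.LSTMCell(input_size, hidden_size)

            def forward(self, x):                      # x: (time, batch, input_size)
                T, B, _ = x.shape
                h = x.new_zeros(B, self.hidden_size)
                c = x.new_zeros(B, self.hidden_size)
                history, outputs = [], []
                for t in range(T):
                    # Pool the last k hidden states into the recurrent input.
                    h_ctx = torch.stack(history[-self.k:]).mean(0) if history else h
                    h, c = self.cell(x[t], (h_ctx, c))
                    history.append(h)
                    outputs.append(h)
                return torch.stack(outputs)            # (time, batch, hidden_size)

        # Hypothetical usage on a short audiovisual feature sequence:
        seq = torch.randn(20, 4, 16)                   # 20 frames, batch 4, 16 features
        print(MultiLagLSTM(16, 32, k_lags=5)(seq).shape)  # torch.Size([20, 4, 32])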