Modulation of vocal pitch is a key speech feature that conveys important linguistic and affective information. Auditory feedback is used to monitor and maintain pitch. We examined induced neural high gamma power (HGP; 65–150 Hz) using magnetoencephalography during pitch feedback control. Participants phonated into a microphone while hearing their auditory feedback through headphones. During each phonation, a single real-time 400 ms pitch shift was applied to the auditory feedback. Participants compensated by rapidly changing their pitch to oppose the pitch shifts. This behavioral change required coordination of the neural speech motor control network, including integration of auditory and somatosensory feedback to initiate changes in motor plans. We found increases in HGP across both hemispheres within 200 ms of pitch shifts, covering left sensory and right premotor, parietal, temporal, and frontal regions involved in sensory detection and processing of the pitch shift. Later responses to pitch shifts (200–300 ms) were right dominant, in parietal, frontal, and temporal regions. The timing of activity in these regions indicates their role in coordinating motor change and in detecting and processing the sensory consequences of this change. Subtracting out cortical responses during passive listening to recordings of the phonations isolated HGP increases specific to speech production, highlighting the involvement of right parietal and premotor cortex and left posterior temporal cortex in the motor response. Correlation of HGP with behavioral compensation demonstrated right frontal region involvement in modulating participants' compensatory responses. This study highlights the involvement of a bihemispheric sensorimotor cortical network in auditory feedback-based control of vocal pitch. Hum Brain Mapp 37:1474–1485, 2016. © 2016 Wiley Periodicals, Inc.
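As a rough illustration of the band-power analysis this abstract describes, the sketch below computes induced HGP (65–150 Hz) around pitch-shift onset from epoched MEG data using MNE-Python. The file name, event code, and baseline window are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch: induced high gamma power (HGP, 65-150 Hz) around pitch-shift onset,
# using MNE-Python. File name, event code, and baseline are hypothetical.
import numpy as np
import mne

raw = mne.io.read_raw_fif("subject01_phonation_raw.fif", preload=True)  # hypothetical file
events = mne.find_events(raw, stim_channel="STI 014")
PITCH_SHIFT = 1  # hypothetical event code marking pitch-shift onset

# Epoch around the 400 ms pitch shift, including a pre-shift baseline.
epochs = mne.Epochs(raw, events, event_id=PITCH_SHIFT,
                    tmin=-0.3, tmax=0.6, baseline=None, preload=True)

# Removing the evoked response first isolates induced (non-phase-locked) power.
epochs_induced = epochs.copy().subtract_evoked()

# Morlet wavelet decomposition over the high gamma band.
freqs = np.arange(65.0, 151.0, 5.0)
power = mne.time_frequency.tfr_morlet(epochs_induced, freqs=freqs,
                                      n_cycles=freqs / 4.0,
                                      return_itc=False, average=True)

# Express power as percent change from the pre-shift baseline, then
# average across the band to get one HGP time course per sensor.
power.apply_baseline(baseline=(-0.25, -0.05), mode="percent")
hgp = power.data.mean(axis=1)  # shape: (n_channels, n_times)
```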
Speech production is a complex human function requiring continuous feedforward commands together with reafferent feedback processing. These processes are carried out by distinct frontal and temporal cortical networks, but the degree and timing of their recruitment and dynamics remain poorly understood. We present a deep learning architecture that translates neural signals recorded directly from the cortex to an interpretable representational space that can reconstruct speech. We leverage learned decoding networks to disentangle feedforward vs. feedback processing. Unlike prevailing models, we find a mixed cortical architecture in which frontal and temporal networks each process both feedforward and feedback information in tandem. We elucidate the timing of feedforward- and feedback-related processing by quantifying the derived receptive fields. Our approach provides evidence for a surprisingly mixed cortical architecture of speech circuitry together with decoding advances that have important implications for neural prosthetics.
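The abstract does not specify the network, but a minimal sketch of the general idea (an encoder from cortical signals to an interpretable intermediate representation, and a decoder from that representation to a speech spectrogram) might look like the following. All dimensions and layer choices are illustrative assumptions, not the authors' architecture.

```python
# Minimal PyTorch sketch: cortical signals -> interpretable speech
# parameters -> reconstructed spectrogram. Sizes are illustrative.
import torch
import torch.nn as nn

class NeuralToSpeech(nn.Module):
    def __init__(self, n_electrodes=128, n_speech_params=18, n_mel=80):
        super().__init__()
        # Temporal convolutions over the electrode signals.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_electrodes, 256, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(256, n_speech_params, kernel_size=9, padding=4),
        )
        # Map the low-dimensional representation to a mel spectrogram.
        self.decoder = nn.Sequential(
            nn.Conv1d(n_speech_params, 256, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(256, n_mel, kernel_size=9, padding=4),
        )

    def forward(self, ecog):          # ecog: (batch, electrodes, time)
        params = self.encoder(ecog)   # interpretable bottleneck
        mel = self.decoder(params)    # reconstructed spectrogram
        return mel, params

model = NeuralToSpeech()
ecog = torch.randn(4, 128, 200)       # 4 trials, 128 electrodes, 200 samples
mel, params = model(ecog)
```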
- Award ID(s): 1912286
- NSF-PAR ID: 10483634
- Publisher / Repository: PNAS
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 120
- Issue: 42
- ISSN: 0027-8424
- Sponsoring Org: National Science Foundation
More Like this
-
Decoding auditory stimuli from neural activity can enable neuroprosthetics and direct communication with the brain. Some recent studies have shown successful speech decoding from intracranial recordings using deep learning models. However, scarcity of training data leads to low-quality speech reconstruction, which prevents a complete brain–computer interface (BCI) application. In this work, we propose a transfer learning approach with a pre-trained GAN to disentangle the representation and generation layers for decoding. We first pre-train a generator to produce spectrograms from a representation space using a large corpus of natural speech data. With a small amount of paired data containing the stimulus speech and corresponding ECoG signals, we then transfer the generator into a larger network by prepending an encoder that maps the neural signal to the representation space. To further improve the network's generalization ability, we introduce a Gaussian prior distribution regularizer on the latent representation during the transfer phase. With at most 150 training samples for each tested subject, we achieve state-of-the-art decoding performance. By visualizing the attention mask embedded in the encoder, we observe brain dynamics that are consistent with findings from previous studies investigating dynamics in the superior temporal gyrus (STG), pre-central gyrus (motor cortex), and inferior frontal gyrus (IFG). Our findings demonstrate high reconstruction accuracy using deep learning networks, together with the potential to elucidate interactions across different brain regions during a cognitive task.
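A minimal sketch of the transfer scheme this abstract describes: a generator pre-trained on natural speech is frozen, an encoder mapping ECoG to the generator's latent space is trained on the small paired dataset, and a Gaussian prior on the latents regularizes the transfer. All module shapes and the prior weight are illustrative stand-ins, not the paper's implementation.

```python
# Sketch: frozen pre-trained generator + trainable ECoG encoder,
# with a Gaussian prior penalty on the latent representation.
import torch
import torch.nn as nn

latent_dim = 64

encoder = nn.Sequential(                  # ECoG -> latent representation
    nn.Flatten(),
    nn.Linear(128 * 100, 512), nn.ReLU(),
    nn.Linear(512, latent_dim),
)
generator = nn.Sequential(                # stand-in for the pre-trained GAN generator
    nn.Linear(latent_dim, 512), nn.ReLU(),
    nn.Linear(512, 80 * 100),             # flattened 80-bin spectrogram
)
for p in generator.parameters():          # keep pre-trained weights fixed
    p.requires_grad = False

opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
lam = 0.01                                # weight of the Gaussian prior term

def step(ecog, target_spec):
    z = encoder(ecog)
    recon = generator(z)
    loss = nn.functional.mse_loss(recon, target_spec)
    loss = loss + lam * z.pow(2).mean()   # pull latents toward a standard Gaussian
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# One illustrative update with random stand-in data.
step(torch.randn(8, 128, 100), torch.randn(8, 80 * 100))
```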
-
Abstract We build on the existing biased competition view to argue that attention is an emergent property of neural computations within and across hierarchically embedded and structurally connected cortical pathways. Critically then, one must ask: what is attention emergent from? Within this framework, developmental changes in the quality of sensory input and feedforward-feedback information flow shape the emergence and efficiency of attention. Several gradients of developing structural and functional cortical architecture across the caudal-to-rostral axis provide the substrate for attention to emerge. Neural activity within visual areas depends on neuronal density, receptive field size, tuning properties of neurons, and the location of and competition between features and objects in the visual field. These visual cortical properties highlight the information-processing bottleneck attention needs to resolve. Recurrent feedforward and feedback connections convey sensory information through a series of steps at each level of the cortical hierarchy, integrating sensory information across the entire extent of the cortical hierarchy and linking sensory processing to higher-order brain regions. Higher-order regions concurrently provide input conveying behavioral context and goals. Thus, attention reflects the output of a series of complex biased competition neural computations that occur within and across hierarchically embedded cortical regions (a toy sketch of biased competition follows this item). Cortical development proceeds along the caudal-to-rostral axis, mirroring the flow of sensory information from caudal to rostral regions, and visual processing continues to develop into childhood. Examining both typical and atypical development will offer critical mechanistic insight not otherwise available in the adult stable state. This article is categorized under:
Psychology > Attention
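Biased competition is often formalized as divisive normalization with a top-down bias. The toy sketch below follows that assumption (in the spirit of normalization models of attention); the drive values and attention gain are illustrative, not from the article.

```python
# Toy sketch: biased competition via divisive normalization.
import numpy as np

drive = np.array([1.0, 0.6, 0.3])      # excitatory drive of three competing stimuli
attn_gain = np.array([2.0, 1.0, 1.0])  # top-down bias toward the first stimulus
sigma = 0.5                            # semi-saturation constant

excitation = attn_gain * drive
suppression = excitation.sum()         # pooled competitive suppression
response = excitation / (sigma + suppression)

print(response)  # the attended stimulus wins the competition
```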
-
Network features found in the brain may help implement more efficient and robust neural networks. Spiking neural networks (SNNs) process spikes in the spatiotemporal domain and can offer better energy efficiency than deep neural networks. However, most SNN implementations rely on simple point neurons that neglect rich neuronal and dendritic dynamics. Herein, a bio-inspired columnar learning network (CLN) structure is proposed that employs feedforward, lateral, and feedback connections to perform robust classification with sparse data. CLN is inspired by the mammalian neocortex, comprising cortical columns that each contain multiple minicolumns formed by interacting pyramidal neurons. A column continuously processes spatiotemporal signals from its sensor while learning spatial and temporal correlations between features in different regions of an object along with the sensor's movement through sensorimotor interaction. CLN can be implemented using memristor crossbars with a local learning rule, spike timing-dependent plasticity (STDP), which can be obtained natively in second-order memristors. CLN allows inputs from multiple sensors to be processed simultaneously by different columns, resulting in higher classification accuracy and better noise tolerance. Analysis of networks implemented on memristor crossbars shows that the system can operate at very low power and high throughput, with high accuracy and robustness to noise.
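A minimal sketch of the pair-based STDP rule mentioned above: causal pre-before-post spike pairs potentiate a synapse, and the reverse order depresses it. Amplitudes and time constants are illustrative, and in the described hardware the rule arises natively from second-order memristor dynamics rather than explicit code.

```python
# Sketch: pair-based spike timing-dependent plasticity (STDP).
import numpy as np

A_PLUS, A_MINUS = 0.05, 0.055     # potentiation / depression amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0  # time constants (ms)

def stdp_dw(t_pre, t_post):
    """Weight change for a single pre/post spike pair."""
    dt = t_post - t_pre
    if dt > 0:   # pre fires before post -> potentiate
        return A_PLUS * np.exp(-dt / TAU_PLUS)
    else:        # post fires before (or with) pre -> depress
        return -A_MINUS * np.exp(dt / TAU_MINUS)

w = 0.5
w += stdp_dw(t_pre=10.0, t_post=15.0)  # causal pairing strengthens the synapse
w = float(np.clip(w, 0.0, 1.0))        # keep the weight in a bounded range
print(w)
```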
-
Visual scene category representations emerge very rapidly, yet the computational transformations that enable such invariant categorizations remain elusive. Deep convolutional neural networks (CNNs) perform visual categorization at near human-level accuracy using a feedforward architecture, providing neuroscientists with the opportunity to assess one successful series of representational transformations that enable categorization in silico. The goal of the current study is to assess the extent to which sequential scene category representations built by a CNN map onto those built in the human brain, as assessed by high-density, time-resolved event-related potentials (ERPs). We found correspondence both over time and across the scalp: earlier (0–200 ms) ERP activity was best explained by early CNN layers at all electrodes. Although later activity at most electrode sites corresponded to earlier CNN layers, activity in right occipito-temporal electrodes was best explained by the later, fully connected layers of the CNN around 225 ms post-stimulus, along with similar patterns in frontal electrodes. Taken together, these results suggest that scene category representations emerge through a dynamic interplay between early activity over occipital electrodes and later activity over temporal and frontal electrodes.
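One common way to relate CNN layers to time-resolved ERPs is representational similarity analysis: at each time point, the ERP-pattern dissimilarity matrix across images is correlated with each layer's dissimilarity matrix. The sketch below illustrates that idea with random stand-in data; shapes and layer sizes are assumptions, not necessarily the study's exact method.

```python
# Sketch: layer-by-layer RSA between CNN activations and ERP time courses.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

n_images, n_electrodes, n_times = 48, 64, 300
erp = np.random.randn(n_images, n_electrodes, n_times)           # stand-in ERP data
layer_feats = [np.random.randn(n_images, d) for d in (64, 256)]  # stand-in CNN activations

# One condensed representational dissimilarity matrix (RDM) per layer.
layer_rdms = [pdist(f, metric="correlation") for f in layer_feats]

# Time course of ERP-to-layer correspondence, one curve per layer.
fit = np.zeros((len(layer_rdms), n_times))
for t in range(n_times):
    erp_rdm = pdist(erp[:, :, t], metric="correlation")
    for i, layer_rdm in enumerate(layer_rdms):
        fit[i, t] = spearmanr(erp_rdm, layer_rdm)[0]

best_layer_per_time = fit.argmax(axis=0)  # which layer best explains each time point
```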