Primary auditory cortex is a critical stage in the human auditory pathway, a gateway between subcortical and higher-level cortical areas. Receiving the output of all subcortical processing, it sends its output on to higher-level cortex. Non-invasive physiological recordings of primary auditory cortex using electroencephalography (EEG) and magnetoencephalography (MEG), however, may not have sufficient specificity to separate responses generated in primary auditory cortex from those generated in underlying subcortical areas or neighboring cortical areas. This limitation is important for investigations of effects of top-down processing (e.g., selective-attention-based) on primary auditory cortex: higher-level areas are known to be strongly influenced by top-down processes, but subcortical areas are often assumed to perform strictly bottom-up processing. Fortunately, recent advances have made it easier to isolate the neural activity of primary auditory cortex from other areas. In this perspective, we focus on time-locked responses to stimulus features in the high gamma band (70–150 Hz) and with early cortical latency (∼40 ms), intermediate between subcortical and higher-level areas. We review recent findings from physiological studies employing either repeated simple sounds or continuous speech, obtaining either a frequency following response (FFR) or temporal response function (TRF). The potential roles of top-down processing are underscored, and comparisons with invasive intracranial EEG (iEEG) and animal model recordings are made. We argue that MEG studies employing continuous speech stimuli may offer particular benefits, in that only a few minutes of speech generates robust high gamma responses from bilateral primary auditory cortex, and without measurable interference from subcortical or higher-level areas.
more »
« less
Subcortical responses to music and speech are alike while cortical responses diverge
Music and speech are encountered daily and are unique to human beings. Both are transformed by the auditory pathway from an initial acoustical encoding to higher level cognition. Studies of cortex have revealed distinct brain responses to music and speech, but differences may emerge in the cortex or may be inherited from different subcortical encoding. In the first part of this study, we derived the human auditory brainstem response (ABR), a measure of subcortical encoding, to recorded music and speech using two analysis methods. The first method, described previously and acoustically based, yielded very different ABRs between the two sound classes. The second method, however, developed here and based on a physiological model of the auditory periphery, gave highly correlated responses to music and speech. We determined the superiority of the second method through several metrics, suggesting there is no appreciable impact of stimulus class (i.e., music vs speech) on the way stimulus acoustics are encoded subcortically. In this study’s second part, we considered the cortex. Our new analysis method resulted in cortical music and speech responses becoming more similar but with remaining differences. The subcortical and cortical results taken together suggest that there is evidence for stimulus-class dependent processing of music and speech at the cortical but not subcortical level.
more »
« less
- Award ID(s):
- 2142612
- PAR ID:
- 10523884
- Publisher / Repository:
- Scientific Reports
- Date Published:
- Journal Name:
- Scientific Reports
- Volume:
- 14
- Issue:
- 1
- ISSN:
- 2045-2322
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Little is known about how neural representations of natural sounds differ across species. For example, speech and music play a unique role in human hearing, yet it is unclear how auditory representations of speech and music differ between humans and other animals. Using functional ultrasound imaging, we measured responses in ferrets to a set of natural and spectrotemporally matched synthetic sounds previously tested in humans. Ferrets showed similar lower-level frequency and modulation tuning to that observed in humans. But while humans showed substantially larger responses to natural vs. synthetic speech and music in non-primary regions, ferret responses to natural and synthetic sounds were closely matched throughout primary and non-primary auditory cortex, even when tested with ferret vocalizations. This finding reveals that auditory representations in humans and ferrets diverge sharply at late stages of cortical processing, potentially driven by higher-order processing demands in speech and music.more » « less
-
Kumar, Arvind (Ed.)Characterizing neuronal responses to natural stimuli remains a central goal in sensory neuroscience. In auditory cortical neurons, the stimulus selectivity of elicited spiking activity is summarized by a spectrotemporal receptive field (STRF) that relates neuronal responses to the stimulus spectrogram. Though effective in characterizing primary auditory cortical responses, STRFs of non-primary auditory neurons can be quite intricate, reflecting their mixed selectivity. The complexity of non-primary STRFs hence impedes understanding how acoustic stimulus representations are transformed along the auditory pathway. Here, we focus on the relationship between ferret primary auditory cortex (A1) and a secondary region, dorsal posterior ectosylvian gyrus (PEG). We propose estimating receptive fields in PEG with respect to a well-established high-dimensional computational model of primary-cortical stimulus representations. These “cortical receptive fields” (CortRF) are estimated greedily to identify the salient primary-cortical features modulating spiking responses and in turn related to corresponding spectrotemporal features. Hence, they provide biologically plausible hierarchical decompositions of STRFs in PEG. Such CortRF analysis was applied to PEG neuronal responses to speech and temporally orthogonal ripple combination (TORC) stimuli and, for comparison, to A1 neuronal responses. CortRFs of PEG neurons captured their selectivity to more complex spectrotemporal features than A1 neurons; moreover, CortRF models were more predictive of PEG (but not A1) responses to speech. Our results thus suggest that secondary-cortical stimulus representations can be computed as sparse combinations of primary-cortical features that facilitate encoding natural stimuli. Thus, by adding the primary-cortical representation, we can account for PEG single-unit responses to natural sounds better than bypassing it and considering as input the auditory spectrogram. These results confirm with explicit details the presumed hierarchical organization of the auditory cortex.more » « less
-
null (Ed.)Aging is associated with an exaggerated representation of the speech envelope in auditory cortex. The relationship between this age-related exaggerated response and a listener’s ability to understand speech in noise remains an open question. Here, information-theory-based analysis methods are applied to magnetoencephalography recordings of human listeners, investigating their cortical responses to continuous speech, using the novel nonlinear measure of phase-locked mutual information between the speech stimuli and cortical responses. The cortex of older listeners shows an exaggerated level of mutual information, compared with younger listeners, for both attended and unattended speakers. The mutual information peaks for several distinct latencies: early (∼50 ms), middle (∼100 ms), and late (∼200 ms). For the late component, the neural enhancement of attended over unattended speech is affected by stimulus signal-to-noise ratio, but the direction of this dependency is reversed by aging. Critically, in older listeners and for the same late component, greater cortical exaggeration is correlated with decreased behavioral inhibitory control. This negative correlation also carries over to speech intelligibility in noise, where greater cortical exaggeration in older listeners is correlated with worse speech intelligibility scores. Finally, an age-related lateralization difference is also seen for the ∼100 ms latency peaks, where older listeners show a bilateral response compared with younger listeners’ right lateralization. Thus, this information-theory-based analysis provides new, and less coarse-grained, results regarding age-related change in auditory cortical speech processing, and its correlation with cognitive measures, compared with related linear measures. NEW & NOTEWORTHY Cortical representations of natural speech are investigated using a novel nonlinear approach based on mutual information. Cortical responses, phase-locked to the speech envelope, show an exaggerated level of mutual information associated with aging, appearing at several distinct latencies (∼50, ∼100, and ∼200 ms). Critically, for older listeners only, the ∼200 ms latency response components are correlated with specific behavioral measures, including behavioral inhibition and speech comprehension.more » « less
-
Perception can be highly dependent on stimulus context, but whether and how sensory areas encode the context remains uncertain. We used an ambiguous auditory stimulus – a tritone pair – to investigate the neural activity associated with a preceding contextual stimulus that strongly influenced the tritone pair’s perception: either as an ascending or a descending step in pitch. We recorded single-unit responses from a population of auditory cortical cells in awake ferrets listening to the tritone pairs preceded by the contextual stimulus. We find that the responses adapt locally to the contextual stimulus, consistent with human MEG recordings from the auditory cortex under the same conditions. Decoding the population responses demonstrates that cells responding to pitch-changes are able to predict well the context-sensitive percept of the tritone pairs. Conversely, decoding the individual pitch representations and taking their distance in the circular Shepard tone space predicts theoppositeof the percept. The various percepts can be readily captured and explained by a neural model of cortical activity based on populations of adapting, pitch and pitch-direction cells, aligned with the neurophysiological responses. Together, these decoding and model results suggest that contextual influences on perception may well be already encoded at the level of the primary sensory cortices, reflecting basic neural response properties commonly found in these areas.more » « less
An official website of the United States government

