Title: Binding the Acoustic Features of an Auditory Source through Temporal Coherence
Abstract Numerous studies have suggested that a target sound stream (or source) can be perceptually segregated from a complex acoustic background mixture only if the acoustic features underlying its perceptual attributes (e.g., pitch, location, and timbre) induce temporally modulated responses that are mutually correlated (coherent), and that are uncorrelated with (incoherent with) those of other sources in the mixture. This "temporal coherence" hypothesis asserts that attentive listening to one acoustic feature of a target enhances brain responses to that feature and concomitantly (1) induces mutually excitatory influences with other coherently responding neurons, thus enhancing (or binding) them all as they respond to the attended source; by contrast, (2) suppressive interactions are hypothesized to build up among neurons driven by temporally incoherent sound features, relatively reducing their activity. In this study, we report EEG measurements in human subjects engaged in various sound segregation tasks that demonstrate rapid binding among the temporally coherent features of the attended source regardless of their identity (pure-tone components, tone complexes, or noise), harmonic relationship, or frequency separation, confirming the key role temporal coherence plays in the analysis and organization of auditory scenes.
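The binding criterion above is computational at heart: feature channels belong together when their slow temporal response envelopes covary. Below is a minimal Python sketch of that idea on synthetic data; `envelopes`, `coherence_matrix`, and `group_coherent` are illustrative names assumed here, not the paper's analysis code.

```python
# Minimal sketch: quantifying temporal coherence between feature channels.
# Assumes `envelopes` is an (n_channels, n_samples) array of slowly varying
# response envelopes; all names are illustrative, not the authors' pipeline.
import numpy as np

def coherence_matrix(envelopes: np.ndarray) -> np.ndarray:
    """Pairwise Pearson correlation of temporal envelopes.

    Channels whose envelopes are mutually correlated (coherent) are
    candidates for binding into one source; channels uncorrelated with
    them belong to the background.
    """
    return np.corrcoef(envelopes)

def group_coherent(envelopes: np.ndarray, seed: int, thresh: float = 0.5):
    """Indices of channels temporally coherent with a seed (attended) channel."""
    r = coherence_matrix(envelopes)[seed]
    return np.flatnonzero(r >= thresh)

# Toy example: two channels sharing one 4 Hz modulator bind together;
# a third, independently fluctuating channel does not.
rng = np.random.default_rng(0)
t = np.linspace(0, 2, 2000)
mod = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
env = np.vstack([mod + 0.05 * rng.standard_normal(t.size),
                 mod + 0.05 * rng.standard_normal(t.size),
                 rng.random(t.size)])
print(group_coherent(env, seed=0))  # -> [0 1]
```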
Award ID(s):
1764010
PAR ID:
10356204
Author(s) / Creator(s):
Date Published:
Journal Name:
Cerebral Cortex Communications
Volume:
2
Issue:
4
ISSN:
2632-7376
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract It has been suggested that the visual system samples attended information rhythmically. Does rhythmic sampling also apply to distracting information? How do attended and distracting information compete temporally for neural representation? We recorded electroencephalography from participants who detected instances of coherent motion in a random dot kinematogram (RDK; the target stimulus) overlaid on different categories (pleasant, neutral, and unpleasant) of affective images from the International Affective Picture System (IAPS) (the distractor). The moving dots were flickered at 4.29 Hz, whereas the IAPS pictures were flickered at 6 Hz. The time course of spectral power at 4.29 Hz (the dot response) was taken to index the temporal dynamics of target processing. The spatial pattern of power at 6 Hz was similarly extracted and subjected to an MVPA decoding analysis to index the temporal dynamics of processing pleasant, neutral, or unpleasant distractor pictures. We found that (1) both target processing and distractor processing exhibited rhythmicity at ∼1 Hz and (2) the phase difference between the two rhythmic time courses was related to task performance: a relative phase closer to π predicted a higher rate of coherent motion detection, whereas a relative phase closer to 0 predicted a lower rate. These results suggest that (1) in a target-distractor scenario, both attended and distracting information are sampled rhythmically and (2) the more target sampling and distractor sampling are separated in time within a sampling cycle, the weaker the distraction effects, at both the neural and the behavioral level.
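A minimal sketch of the frequency-tagging logic this abstract describes: recover the power time course of each tagged response, then compare their phases at the slow (~1 Hz) sampling rhythm. All names here (`eeg`, `tagged_power`, `relative_phase`) are assumed placeholders, not the authors' pipeline.

```python
# Sketch: extract a flicker-tagged power time course and the relative
# phase of two such time courses at a slow sampling rhythm.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def tagged_power(eeg: np.ndarray, sfreq: float, f_tag: float, bw: float = 0.5):
    """Time course of power at a tagged flicker frequency (Hilbert envelope)."""
    nyq = sfreq / 2
    b, a = butter(4, [(f_tag - bw) / nyq, (f_tag + bw) / nyq], btype="band")
    return np.abs(hilbert(filtfilt(b, a, eeg))) ** 2

def relative_phase(x: np.ndarray, y: np.ndarray, sfreq: float, f_slow: float = 1.0):
    """Phase difference between two equal-length power time courses at f_slow."""
    freqs = np.fft.rfftfreq(x.size, 1 / sfreq)
    k = np.argmin(np.abs(freqs - f_slow))
    return np.angle(np.fft.rfft(x)[k]) - np.angle(np.fft.rfft(y)[k])

# Usage idea: phi near pi -> target and distractor sampled in antiphase
# (better detection in the study); phi near 0 -> sampled together.
```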
  2. The concept of stimulus feature tuning is fundamental to neuroscience. Cortical neurons acquire their feature-tuning properties by learning from experience, using proxy signals of a tentative feature's potential usefulness that come from the spatial and/or temporal context in which the feature occurs. On this view, local but ultimately behaviorally useful features should be those that are predictably related to other such features, either preceding them in time or occurring side by side with them. Inspired by this idea, this paper combines deep neural networks with Canonical Correlation Analysis (CCA) for feature extraction and demonstrates the power of the extracted features on unsupervised cross-modal prediction tasks. CCA is a multi-view feature extraction method that finds correlated features across multiple datasets (usually referred to as views or modalities). CCA finds linear transformations of each view such that the extracted principal components, or features, have maximal mutual correlation. CCA is a linear method, and the features are computed as weighted sums of each view's variables. Once the weights are learned, CCA can be applied to new examples and used for cross-modal prediction by inferring the target-view features of an example from its given variables in a source (query) view. To test the proposed method, it was applied to the unstructured CIFAR-100 dataset of 60,000 images categorized into 100 classes, further grouped into 20 superclasses, to demonstrate the mining of image-tag correlations. CCA was performed on the outputs of three pre-trained CNNs: AlexNet, ResNet, and VGG. Taking advantage of the mutually correlated features extracted with CCA, a nearest-neighbor search was performed in the canonical subspace common to the query and target views to retrieve the most closely matching examples in the target view, which successfully predicted the superclass membership of the tested views without any supervised training.
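A hedged sketch of this CCA retrieval scheme using scikit-learn; the synthetic `X_img` and `X_tag` arrays stand in for CNN image features and tag features, and are assumptions rather than the paper's data or code.

```python
# Sketch: CCA-based cross-modal retrieval in the shared canonical subspace.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_img = rng.standard_normal((500, 256))  # stand-in for CNN image features
# Tag view built to share structure with the image view, plus noise.
X_tag = X_img[:, :64] @ rng.standard_normal((64, 32)) \
        + 0.1 * rng.standard_normal((500, 32))

cca = CCA(n_components=10).fit(X_img, X_tag)   # maximally correlated projections
Z_img, Z_tag = cca.transform(X_img, X_tag)     # both views in the canonical subspace

# Cross-modal prediction: given a query image, retrieve the nearest
# tag-view examples in the canonical subspace.
nn = NearestNeighbors(n_neighbors=5).fit(Z_tag)
query = cca.transform(X_img[:1])               # project the query view only
dist, idx = nn.kneighbors(query)
print(idx)  # indices of best-matching target-view examples
```

In the unsupervised setting the abstract describes, the retrieved neighbors' labels (e.g., superclass membership) serve as the prediction for the query.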
  3. Kumar, Arvind (Ed.)
    Characterizing neuronal responses to natural stimuli remains a central goal in sensory neuroscience. In auditory cortical neurons, the stimulus selectivity of elicited spiking activity is summarized by a spectrotemporal receptive field (STRF) that relates neuronal responses to the stimulus spectrogram. Though effective in characterizing primary auditory cortical responses, STRFs of non-primary auditory neurons can be quite intricate, reflecting their mixed selectivity. The complexity of non-primary STRFs hence impedes understanding of how acoustic stimulus representations are transformed along the auditory pathway. Here, we focus on the relationship between ferret primary auditory cortex (A1) and a secondary region, the dorsal posterior ectosylvian gyrus (PEG). We propose estimating receptive fields in PEG with respect to a well-established high-dimensional computational model of primary-cortical stimulus representations. These "cortical receptive fields" (CortRFs) are estimated greedily to identify the salient primary-cortical features modulating spiking responses, which are in turn related to corresponding spectrotemporal features. Hence, they provide biologically plausible hierarchical decompositions of STRFs in PEG. CortRF analysis was applied to PEG neuronal responses to speech and temporally orthogonal ripple combination (TORC) stimuli and, for comparison, to A1 neuronal responses. CortRFs of PEG neurons captured selectivity to more complex spectrotemporal features than those of A1 neurons; moreover, CortRF models were more predictive of PEG (but not A1) responses to speech. Our results thus suggest that secondary-cortical stimulus representations can be computed as sparse combinations of primary-cortical features that facilitate the encoding of natural stimuli. By including the primary-cortical representation, we account for PEG single-unit responses to natural sounds better than by bypassing it and taking the auditory spectrogram as input. These results confirm, in explicit detail, the presumed hierarchical organization of the auditory cortex.
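For orientation, here is a minimal sketch of baseline STRF estimation by ridge regression, the kind of spectrogram-based model the CortRF approach builds on; `spec` and `rate` are hypothetical inputs, and this is not the paper's estimation code. Substituting primary-cortical model features for the spectrogram columns would yield a CortRF-style fit.

```python
# Sketch: linear STRF fit by regularized regression on spectrogram history.
import numpy as np
from sklearn.linear_model import Ridge

def strf_ridge(spec: np.ndarray, rate: np.ndarray,
               n_lags: int = 20, alpha: float = 1.0) -> np.ndarray:
    """Fit w such that rate[t] ~ sum over (f, u) of w[f, u] * spec[f, t - u].

    spec: (n_freq, n_time) stimulus spectrogram
    rate: (n_time,) spike rate of one neuron
    Returns the STRF as a (n_freq, n_lags) array.
    """
    n_f, n_t = spec.shape
    X = np.zeros((n_t - n_lags, n_f * n_lags))
    for i, t in enumerate(range(n_lags, n_t)):
        X[i] = spec[:, t - n_lags:t].ravel()  # preceding n_lags spectrogram frames
    model = Ridge(alpha=alpha).fit(X, rate[n_lags:])
    return model.coef_.reshape(n_f, n_lags)
```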
  4. The discrimination of complex sounds is a fundamental function of the auditory system. This operation must be robust in the presence of noise and acoustic clutter. Echolocating bats are auditory specialists that discriminate sonar objects in acoustically complex environments. Bats produce brief signals, interrupted by periods of silence, rendering echo snapshots of sonar objects. Sonar object discrimination requires that bats process spatially and temporally overlapping echoes to make split-second decisions. The mechanisms that enable this discrimination are not well understood, particularly in complex environments. We explored the neural underpinnings of sonar object discrimination in the presence of acoustic scattering caused by physical clutter. We performed electrophysiological recordings in the inferior colliculus (IC) of awake big brown bats in response to broadcasts of prerecorded echoes from physical objects. We acquired single-unit responses to these echoes and discovered a subpopulation of IC neurons that encode acoustic features that can be used to discriminate between sonar objects. We further investigated the effects of environmental clutter on this population's encoding of acoustic features and found that the effect of background clutter on sonar object discrimination is highly variable, depending on object properties and the spatiotemporal separation of target and clutter. In many conditions, clutter impaired discrimination of sonar objects; in some instances, however, clutter enhanced acoustic features of the echo returns, enabling higher levels of discrimination. This finding suggests that environmental clutter may augment the acoustic cues used for sonar target discrimination, and it adds to a growing body of evidence that noise is not universally detrimental to sensory encoding.
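One conventional way to score the single-unit discrimination described above is a d-prime between per-trial spike-count distributions for two objects, computed separately for clean and cluttered conditions. The sketch below uses simulated Poisson counts; `counts_a`, `counts_b`, and the rates are assumptions for illustration, not the recorded data.

```python
# Sketch: spike-count discriminability (d') for two sonar objects,
# with and without clutter, on simulated data.
import numpy as np

def dprime(counts_a: np.ndarray, counts_b: np.ndarray) -> float:
    """Discriminability index between two spike-count distributions."""
    pooled_sd = np.sqrt(0.5 * (counts_a.var(ddof=1) + counts_b.var(ddof=1)))
    return (counts_a.mean() - counts_b.mean()) / pooled_sd

rng = np.random.default_rng(1)
clean_a, clean_b = rng.poisson(8, 50), rng.poisson(4, 50)      # well separated
clutter_a, clutter_b = rng.poisson(7, 50), rng.poisson(5, 50)  # separation shrinks
print(dprime(clean_a, clean_b), dprime(clutter_a, clutter_b))
# Clutter can lower d' (impaired discrimination) or, for some neurons
# and target-clutter geometries, raise it -- the variability reported above.
```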
  5. Little is known about the neural mechanisms that mediate differential action-selection responses to communication and echolocation calls in bats. In the big brown bat, for example, frequency-modulated (FM) food-claiming communication calls closely resemble FM echolocation calls; the two call types guide social and orienting behaviors, respectively. Using advanced signal processing methods, we identified fine differences in the temporal structure of these natural sounds that appear key to auditory discrimination and behavioral decisions. We recorded extracellular potentials from single neurons in the midbrain inferior colliculus (IC) of passively listening animals and compared responses to playbacks of acoustic signals used by bats for social communication and echolocation. Combining information from spike counts and spike-triggered averages (STAs) revealed a robust classification of neuron selectivity for communication or echolocation calls. These data highlight the importance of temporal acoustic structure for differentiating echolocation and food-claiming social calls and point to general mechanisms of natural sound processing across species.
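For reference, a short sketch of the spike-triggered average, the STA quantity combined with spike counts above: the mean stimulus window preceding each spike, which summarizes the temporal features a neuron responds to. `stim` and `spike_idx` are illustrative placeholders, not the study's data.

```python
# Sketch: spike-triggered average from a 1-D stimulus and spike sample indices.
import numpy as np

def spike_triggered_average(stim: np.ndarray, spike_idx: np.ndarray,
                            n_pre: int) -> np.ndarray:
    """Average the n_pre stimulus samples preceding each spike."""
    valid = spike_idx[spike_idx >= n_pre]           # drop spikes too early to window
    windows = np.stack([stim[i - n_pre:i] for i in valid])
    return windows.mean(axis=0)                     # shape: (n_pre,)
```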