Title: Binding the Acoustic Features of an Auditory Source through Temporal Coherence
Abstract: Numerous studies have suggested that a target sound stream (or source) can be perceptually segregated from a complex acoustic background mixture only if the acoustic features underlying its perceptual attributes (e.g., pitch, location, and timbre) induce temporally modulated responses that are mutually correlated (coherent) with one another and uncorrelated (incoherent) with those of other sources in the mixture. This “temporal coherence” hypothesis asserts that attentive listening to one acoustic feature of a target enhances brain responses to that feature and concomitantly (1) induces mutually excitatory influences among other coherently responding neurons, thus enhancing (or binding) them all as they respond to the attended source; by contrast, (2) suppressive interactions are hypothesized to build up among neurons driven by temporally incoherent sound features, relatively reducing their activity. Here we report EEG measurements in human subjects engaged in various sound segregation tasks that demonstrate rapid binding among the temporally coherent features of the attended source regardless of their identity (pure-tone components, tone complexes, or noise), harmonic relationship, or frequency separation, confirming the key role temporal coherence plays in the analysis and organization of auditory scenes.
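To make the coherence measure concrete, the sketch below computes the windowed correlation between two feature-response envelopes, the quantity the temporal coherence hypothesis treats as the binding cue. This is a minimal illustration, not the authors' analysis code: the envelope signals, sampling rate, and window length are all assumptions invented for the example.

```python
# Minimal sketch (illustrative assumptions only): windowed temporal coherence
# between two feature-response envelopes, e.g. the modulation envelopes of two
# frequency channels driven by the same mixture.
import numpy as np

def temporal_coherence(env_a, env_b, fs=100.0, win_s=0.5):
    """Pearson correlation of two envelopes in non-overlapping windows.

    env_a, env_b : equal-length 1-D arrays (feature-response envelopes)
    fs           : envelope sampling rate in Hz (assumed value)
    win_s        : window length in seconds (assumed value)
    Values near +1 mark temporally coherent features (candidates for binding);
    values near 0 mark incoherent features.
    """
    win = int(fs * win_s)
    n_win = min(len(env_a), len(env_b)) // win
    coh = np.empty(n_win)
    for i in range(n_win):
        a = env_a[i * win:(i + 1) * win]
        b = env_b[i * win:(i + 1) * win]
        coh[i] = np.corrcoef(a, b)[0, 1]
    return coh

# Toy usage: two channels sharing a 4 Hz modulation are coherent with each
# other; a third channel with its own 7 Hz modulation is not.
t = np.arange(0.0, 5.0, 1 / 100.0)
shared = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
chan1 = shared + 0.05 * np.random.randn(t.size)
chan2 = shared + 0.05 * np.random.randn(t.size)
chan3 = 0.5 * (1 + np.sin(2 * np.pi * 7 * t + 1.0)) + 0.05 * np.random.randn(t.size)
print(np.mean(temporal_coherence(chan1, chan2)))  # close to 1 (coherent)
print(np.mean(temporal_coherence(chan1, chan3)))  # much lower (incoherent)
```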
Award ID(s): 1764010
PAR ID: 10356204
Author(s) / Creator(s):
Date Published:
Journal Name: Cerebral Cortex Communications
Volume: 2
Issue: 4
ISSN: 2632-7376
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract: It has been suggested that the visual system samples attended information rhythmically. Does rhythmic sampling also apply to distracting information? How do attended and distracting information compete temporally for neural representations? We recorded electroencephalography from participants who detected instances of coherent motion in a random dot kinematogram (RDK; the target stimulus) overlaid on different categories (pleasant, neutral, and unpleasant) of affective images from the International Affective Picture System (IAPS; the distractor). The moving dots were flickered at 4.29 Hz, whereas the IAPS pictures were flickered at 6 Hz. The time course of spectral power at 4.29 Hz (dot response) was taken to index the temporal dynamics of target processing. The spatial pattern of the power at 6 Hz was similarly extracted and subjected to an MVPA decoding analysis to index the temporal dynamics of processing pleasant, neutral, or unpleasant distractor pictures. We found that (1) both target processing and distractor processing exhibited rhythmicity at ∼1 Hz and (2) the phase difference between the two rhythmic time courses was related to task performance: a relative phase closer to π predicted a higher rate of coherent motion detection, whereas a relative phase closer to 0 predicted a lower rate. These results suggest that (1) in a target-distractor scenario, both attended and distracting information were sampled rhythmically and (2) the more target sampling and distractor sampling were separated in time within a sampling cycle, the smaller the distraction effect, at both the neural and the behavioral level. (A minimal illustrative sketch of this relative-phase analysis appears after this list.)
  2. The concept of stimulus feature tuning is fundamental to neuroscience. Cortical neurons acquire their feature-tuning properties by learning from experience, using proxy signals of a tentative feature's potential usefulness that come from the spatial and/or temporal context in which the feature occurs. According to this idea, local but ultimately behaviorally useful features should be the ones that are predictably related to other such features, either preceding them in time or occurring side by side with them. Inspired by this idea, in this paper deep neural networks are combined with Canonical Correlation Analysis (CCA) for feature extraction, and the power of the features is demonstrated on unsupervised cross-modal prediction tasks. CCA is a multi-view feature extraction method that finds correlated features across multiple datasets (usually referred to as views or modalities). CCA finds linear transformations of each view such that the extracted principal components, or features, have maximal mutual correlation. CCA is a linear method, and the features are computed as weighted sums of each view's variables. Once the weights are learned, CCA can be applied to new examples and used for cross-modal prediction by inferring the target-view features of an example from its given variables in a source (query) view. To test the proposed method, it was applied to the unstructured CIFAR-100 dataset of 60,000 images categorized into 100 classes, which are further grouped into 20 superclasses, and used to demonstrate the mining of image-tag correlations. CCA was performed on the outputs of three pre-trained CNNs: AlexNet, ResNet, and VGG. Taking advantage of the mutually correlated features extracted with CCA, a nearest-neighbor search was performed in the canonical subspace common to the query and target views to retrieve the most closely matching examples in the target view, which successfully predicted the superclass membership of the tested views without any supervised training. (A minimal illustrative sketch of CCA-based cross-modal retrieval appears after this list.)
  3. The discrimination of complex sounds is a fundamental function of the auditory system. This operation must be robust in the presence of noise and acoustic clutter. Echolocating bats are auditory specialists that discriminate sonar objects in acoustically complex environments. Bats produce brief signals, interrupted by periods of silence, rendering echo snapshots of sonar objects. Sonar object discrimination requires that bats process spatially and temporally overlapping echoes to make split-second decisions. The mechanisms that enable this discrimination are not well understood, particularly in complex environments. We explored the neural underpinnings of sonar object discrimination in the presence of acoustic scattering caused by physical clutter. We performed electrophysiological recordings in the inferior colliculus (IC) of awake big brown bats in response to broadcasts of prerecorded echoes from physical objects. We acquired single-unit responses to these echoes and discovered a subpopulation of IC neurons that encode acoustic features that can be used to discriminate between sonar objects. We further investigated the effects of environmental clutter on this population's encoding of acoustic features and found that the effect of background clutter on sonar object discrimination is highly variable, depending on object properties and on the spatiotemporal separation between target and clutter. In many conditions, clutter impaired discrimination of sonar objects; in some instances, however, clutter enhanced acoustic features of echo returns, enabling higher levels of discrimination. This finding suggests that environmental clutter may augment acoustic cues used for sonar target discrimination and adds to a growing body of evidence that noise is not universally detrimental to sensory encoding.
  4. Little is known about the neural mechanisms that mediate differential action-selection responses to communication and echolocation calls in bats. In the big brown bat, for example, frequency-modulated (FM) food-claiming communication calls closely resemble FM echolocation calls; the two call types guide social and orienting behaviors, respectively. Using advanced signal-processing methods, we identified fine differences in the temporal structure of these natural sounds that appear key to auditory discrimination and behavioral decisions. We recorded extracellular potentials from single neurons in the midbrain inferior colliculus (IC) of passively listening animals and compared responses to playbacks of acoustic signals used by bats for social communication and echolocation. We combined information obtained from spike counts and spike-triggered averages (STAs) to reveal a robust classification of neuron selectivity for communication or echolocation calls. These data highlight the importance of temporal acoustic structure for differentiating echolocation and food-claiming social calls and point to general mechanisms of natural sound processing across species.
  5. Temporal analysis of sound is fundamental to auditory processing throughout the animal kingdom. Echolocating bats are powerful models for investigating the underlying mechanisms of auditory temporal processing, as they show microsecond precision in discriminating the timing of acoustic events. However, the neural basis for microsecond auditory discrimination in bats has eluded researchers for decades. Combining extracellular recordings in the midbrain inferior colliculus (IC) with mathematical modeling, we show that microsecond precision in registering stimulus events emerges from synchronous neural firing, revealed through the low variability in the latency of stimulus-evoked extracellular field potentials (EFPs, 200–600 Hz). The temporal precision of the EFP increases with the number of neurons firing in synchrony. Moreover, there is a functional relationship between the temporal precision of the EFP and the spectrotemporal features of the echolocation calls. In addition, the EFP can measure the time difference of simulated echolocation call–echo pairs with microsecond precision. We propose that synchronous firing of neuronal populations operates across diverse species to support temporal analysis for auditory localization and complex sound processing.
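As flagged in item 1 above, here is a minimal sketch of the relative-phase analysis that abstract describes: extract two frequency-tagged power time courses and estimate the phase difference of their ~1 Hz rhythms. It is not the study's pipeline; the filter settings, variable names, and the synthetic one-channel "EEG" signal are all assumptions made for the example.

```python
# Illustrative sketch (assumed settings, not the study's pipeline): relative
# phase between two frequency-tagged power time courses at a ~1 Hz rhythm.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def tagged_power(eeg, fs, f_tag, half_bw=1.5):
    """Band-pass the signal around the tagging frequency; return its power envelope."""
    sos = butter(4, [f_tag - half_bw, f_tag + half_bw], btype="band", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, eeg))) ** 2

def relative_phase(power_a, power_b, fs, f_rhythm=1.0):
    """Phase difference (radians) of the ~1 Hz component of two power time courses."""
    freqs = np.fft.rfftfreq(power_a.size, 1 / fs)
    k = np.argmin(np.abs(freqs - f_rhythm))
    phi_a = np.angle(np.fft.rfft(power_a - power_a.mean())[k])
    phi_b = np.angle(np.fft.rfft(power_b - power_b.mean())[k])
    return np.angle(np.exp(1j * (phi_a - phi_b)))  # wrapped to (-pi, pi]

# Toy data: target tagged at 4.29 Hz, distractor at 6 Hz, both amplitude-
# modulated at 1 Hz but half a cycle apart (an antiphase sampling scenario).
fs, dur = 250.0, 10.0
t = np.arange(0.0, dur, 1 / fs)
target = (1 + np.sin(2 * np.pi * 1.0 * t)) * np.sin(2 * np.pi * 4.29 * t)
distractor = (1 + np.sin(2 * np.pi * 1.0 * t + np.pi)) * np.sin(2 * np.pi * 6.0 * t)
eeg = target + distractor + 0.1 * np.random.randn(t.size)
dphi = relative_phase(tagged_power(eeg, fs, 4.29), tagged_power(eeg, fs, 6.0), fs)
print(dphi)  # magnitude expected to be near pi for this antiphase toy example
```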
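Similarly, as flagged in item 2 above, the following sketch illustrates CCA-based cross-modal retrieval. scikit-learn's CCA stands in for the paper's implementation, and random toy "image" and "tag" feature matrices replace the pretrained-CNN outputs, so every array name and size here is an assumption.

```python
# Illustrative sketch (toy data, assumed sizes): cross-modal retrieval in the
# canonical subspace learned by CCA.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n, d_img, d_tag, k = 500, 64, 32, 10

# Two views sharing a latent structure, standing in for image and tag features.
latent = rng.normal(size=(n, k))
X_img = latent @ rng.normal(size=(k, d_img)) + 0.1 * rng.normal(size=(n, d_img))
X_tag = latent @ rng.normal(size=(k, d_tag)) + 0.1 * rng.normal(size=(n, d_tag))

# Learn linear projections of each view that maximize cross-view correlation.
cca = CCA(n_components=k)
cca.fit(X_img[:400], X_tag[:400])            # train on the first 400 pairs
img_c, tag_c = cca.transform(X_img, X_tag)   # project both views into the canonical subspace

# Cross-modal prediction: for a held-out image, retrieve the nearest training
# tags in the canonical subspace (no supervised labels are used).
nn = NearestNeighbors(n_neighbors=5).fit(tag_c[:400])
_, idx = nn.kneighbors(img_c[400:401])
print("held-out image 400 -> nearest training tags:", idx[0])
```

The superclass membership of the retrieved tag examples could then be compared with that of the query, mirroring the unsupervised evaluation described in the abstract.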