Abstract: Speech processing often occurs amid competing inputs from other modalities, for example, listening to the radio while driving. We examined the extent to which dividing attention between auditory and visual modalities (bimodal divided attention) impacts neural processing of natural continuous speech from acoustic to linguistic levels of representation. We recorded electroencephalographic (EEG) responses when human participants performed a challenging primary visual task, imposing low or high cognitive load while listening to audiobook stories as a secondary task. The two dual-task conditions were contrasted with an auditory single-task condition in which participants attended to stories while ignoring visual stimuli. Behaviorally, the high-load dual-task condition was associated with lower speech comprehension accuracy relative to the other two conditions. We fitted multivariate temporal response function encoding models to predict EEG responses from acoustic and linguistic speech features at different representation levels, including auditory spectrograms and information-theoretic models of sublexical-, word-form-, and sentence-level representations. Neural tracking of most acoustic and linguistic features remained unchanged with increasing dual-task load, despite unambiguous behavioral and neural evidence of the high-load dual-task condition being more demanding. Compared to the auditory single-task condition, dual-task conditions selectively reduced neural tracking of only some acoustic and linguistic features, mainly at latencies >200 ms, while earlier latencies were surprisingly unaffected. These findings indicate that behavioral effects of bimodal divided attention on continuous speech processing occur not because of impaired early sensory representations but likely at later cognitive processing stages. Crossmodal attention-related mechanisms may not be uniform across different speech processing levels.
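The encoding-model idea in this abstract can be illustrated with a small, self-contained sketch: fit a temporal response function by ridge regression on time-lagged stimulus features and score how well it predicts held-out EEG. Everything below (sampling rate, lag range, feature count, synthetic data, variable names) is an assumption for illustration, not the study's actual multivariate TRF pipeline.

```python
# Illustrative mTRF-style encoding model: time-lagged ridge regression on synthetic data.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
fs = 64                                   # assumed EEG/stimulus sampling rate (Hz)
n_samples, n_features, n_channels = 6000, 8, 32
stimulus = rng.standard_normal((n_samples, n_features))  # stand-in for spectrogram/surprisal features
eeg = rng.standard_normal((n_samples, n_channels))       # stand-in for EEG responses

def lagged_design(X, lags):
    """Stack time-lagged copies of the stimulus (one block of columns per lag)."""
    cols = []
    for lag in lags:
        shifted = np.roll(X, lag, axis=0)
        if lag > 0:
            shifted[:lag, :] = 0
        elif lag < 0:
            shifted[lag:, :] = 0
        cols.append(shifted)
    return np.hstack(cols)

lags = range(0, int(0.5 * fs))            # 0-500 ms lags, a typical TRF window
X = lagged_design(stimulus, lags)

split = int(0.8 * n_samples)              # simple train/test split; real analyses cross-validate
model = Ridge(alpha=1e3)                  # regularization strength would normally be tuned
model.fit(X[:split], eeg[:split])
pred = model.predict(X[split:])
r = [np.corrcoef(pred[:, c], eeg[split:, c])[0, 1] for c in range(n_channels)]
print("mean prediction correlation:", float(np.mean(r)))
```

With real data, the per-channel prediction correlations (rather than the near-zero values this synthetic example yields) are what "neural tracking" of each feature set is typically quantified by.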
High-level cognition is supported by information-rich but compressible brain activity patterns
To efficiently yet reliably represent and process information, our brains need to produce information-rich signals that differentiate between moments or cognitive states, while also being robust to noise or corruption. For many, though not all, natural systems, these two properties are often inversely related: more information-rich signals are less robust, and vice versa. Here, we examined how these properties change with ongoing cognitive demands. To this end, we applied dimensionality reduction algorithms and pattern classifiers to functional neuroimaging data collected as participants listened to a story or to temporally scrambled versions of the story, or underwent a resting-state scanning session. We considered two primary aspects of the neural data recorded in these different experimental conditions. First, we treated the maximum achievable decoding accuracy across participants as an indicator of the “informativeness” of the recorded patterns. Second, we treated the number of features (components) required to achieve a threshold decoding accuracy as a proxy for the “compressibility” of the neural patterns (where fewer components indicate greater compression). Overall, we found that the peak decoding accuracy (achievable without restricting the number of features) was highest in the intact (unscrambled) story listening condition. However, the number of features required to achieve comparable classification accuracy was also lowest in the intact story listening condition. Taken together, our work suggests that our brain networks flexibly reconfigure according to ongoing task demands and that the activity patterns associated with higher-order cognition and high engagement are both more informative and more compressible than the activity patterns associated with lower-order tasks and lower engagement.
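As a rough illustration of the informativeness/compressibility contrast described above, the sketch below decodes condition labels from patterns reduced to k components and reports both the accuracy at each k and the smallest k that clears a threshold. The data, labels, component counts, and threshold are all synthetic assumptions.

```python
# Illustrative decoding sweep: accuracy as a function of the number of retained components.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n_obs, n_voxels = 200, 500
X = rng.standard_normal((n_obs, n_voxels))   # stand-in for neural patterns (observations x voxels)
y = rng.integers(0, 2, n_obs)                # stand-in for condition / timepoint labels

threshold = 0.70                             # assumed target decoding accuracy
for k in (2, 5, 10, 25, 50, 100):
    clf = make_pipeline(PCA(n_components=k), LogisticRegression(max_iter=1000))
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{k:>3} components: accuracy = {acc:.2f}")
    if acc >= threshold:
        print(f"threshold reached with {k} components (fewer components = more compressible)")
        break
```

In this framing, the peak accuracy across all component counts is the informativeness proxy, and the smallest component count reaching the threshold is the compressibility proxy.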
- Award ID(s):
- 2145172
- PAR ID:
- 10573074
- Editor(s):
- Breakspear, Michael
- Publisher / Repository:
- Proceedings of the National Academy of Sciences, USA
- Date Published:
- Journal Name:
- Proceedings of the National Academy of Sciences
- Edition / Version:
- 1.0
- Volume:
- 121
- Issue:
- 35
- ISSN:
- 0027-8424
- Page Range / eLocation ID:
- e2400082121
- Subject(s) / Keyword(s):
- information compression; temporal decoding; dimensionality reduction; neuroimaging
- Format(s):
- PDF
- Size(s):
- 19.4MB
- Sponsoring Org:
- National Science Foundation
More Like this
Abstract: Neural, physiological, and behavioral signals synchronize between human subjects in a variety of settings. Multiple hypotheses have been proposed to explain this interpersonal synchrony, but there is no clarity under which conditions it arises, for which signals, or whether there is a common underlying mechanism. We hypothesized that cognitive processing of a shared stimulus is the source of synchrony between subjects, measured here as intersubject correlation (ISC). To test this, we presented informative videos to participants in attentive and distracted conditions and subsequently measured information recall. ISC was observed for electro-encephalography, gaze position, pupil size, and heart rate, but not for respiration or head movements. The strength of correlation was co-modulated in the different signals, changed with attentional state, and predicted subsequent recall of information presented in the videos. There was robust within-subject coupling between brain, heart, and eyes, but not respiration or head movements. The results suggest that ISC is the result of effective cognitive processing, and thus emerges only for those signals that exhibit a robust brain–body connection. While physiological and behavioral fluctuations may be driven by multiple features of the stimulus, correlation with other individuals is co-modulated by the level of attentional engagement with the stimulus.
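A minimal sketch of the intersubject correlation (ISC) measure, assuming one preprocessed signal per subject (e.g., an EEG component, pupil size, or heart rate): each subject's time course is correlated with the leave-one-out average of the others. Array sizes and the choice of signal are illustrative assumptions.

```python
# Illustrative leave-one-out intersubject correlation (ISC) on synthetic signals.
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_samples = 10, 5000
signals = rng.standard_normal((n_subjects, n_samples))  # one time course per subject

isc = []
for s in range(n_subjects):
    others = np.delete(signals, s, axis=0).mean(axis=0)  # average of all other subjects
    isc.append(np.corrcoef(signals[s], others)[0, 1])
print("mean ISC:", float(np.mean(isc)))
```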
Human engagement in music rests on underlying elements such as the listeners’ cultural background and interest in music. These factors modulate how listeners anticipate musical events, a process inducing instantaneous neural responses as the music confronts these expectations. Measuring such neural correlates would represent a direct window into high-level brain processing. Here we recorded cortical signals as participants listened to Bach melodies. We assessed the relative contributions of acoustic versus melodic components of the music to the neural signal. Melodic features included information on pitch progressions and their tempo, which were extracted from a predictive model of musical structure based on Markov chains. We related the music to brain activity with temporal response functions, demonstrating, for the first time, distinct cortical encoding of pitch and note-onset expectations during naturalistic music listening. This encoding was most pronounced at response latencies up to 350 ms, and in both planum temporale and Heschl’s gyrus.
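The melodic-expectation features can be illustrated with a toy first-order Markov chain that assigns each note a surprisal value (negative log probability) given the preceding note; the study's predictive model of musical structure is richer, and the pitch sequence and smoothing constant below are assumptions for illustration only.

```python
# Illustrative note surprisal from a first-order Markov chain over pitches.
import numpy as np
from collections import defaultdict

melody = [60, 62, 64, 62, 60, 64, 65, 67, 65, 64, 62, 60]   # toy MIDI pitch sequence

# Count first-order transitions between successive pitches.
counts = defaultdict(lambda: defaultdict(float))
for prev, nxt in zip(melody[:-1], melody[1:]):
    counts[prev][nxt] += 1.0

vocab = sorted(set(melody))

def surprisal(prev, nxt, alpha=1.0):
    """-log2 P(next | previous), with add-alpha smoothing over the pitch vocabulary."""
    total = sum(counts[prev].values()) + alpha * len(vocab)
    p = (counts[prev][nxt] + alpha) / total
    return -np.log2(p)

# One surprisal value per note (given its predecessor); aligned to note onsets, such a
# vector can serve as a melodic regressor alongside acoustic features in a TRF model.
s = [surprisal(p, n) for p, n in zip(melody[:-1], melody[1:])]
print(np.round(s, 2))
```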
Listening to speech in noise can require substantial mental effort, even among younger normal-hearing adults. The task-evoked pupil response (TEPR) has been shown to track the increased effort exerted to recognize words or sentences in increasing noise. However, few studies have examined the trajectory of listening effort across longer, more natural, stretches of speech, or the extent to which expectations about upcoming listening difficulty modulate the TEPR. Seventeen younger normal-hearing adults listened to 60-s-long audiobook passages, repeated three times in a row, at two different signal-to-noise ratios (SNRs) while pupil size was recorded. There was a significant interaction between SNR, repetition, and baseline pupil size on sustained listening effort. At lower baseline pupil sizes, potentially reflecting lower attention mobilization, TEPRs were more sustained in the harder SNR condition, particularly when attention mobilization remained low by the third presentation. At intermediate baseline pupil sizes, differences between conditions were largely absent, suggesting these listeners had optimally mobilized their attention for both SNRs. Lastly, at higher baseline pupil sizes, potentially reflecting over-mobilization of attention, the effect of SNR was initially reversed for the second and third presentations: participants initially appeared to disengage in the harder SNR condition, resulting in reduced TEPRs that recovered in the second half of the story. Together, these findings suggest that the unfolding of listening effort over time depends critically on the extent to which individuals have successfully mobilized their attention in anticipation of difficult listening conditions.
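A minimal sketch of a TEPR-style analysis, under assumed parameters (eye-tracker sampling rate, 1-s baseline window, tertile split by baseline pupil size): baseline-correct each trial's pupil trace, then compare the sustained response between baseline-size groups.

```python
# Illustrative baseline-corrected task-evoked pupil response (TEPR) on synthetic traces.
import numpy as np

rng = np.random.default_rng(3)
fs = 60                                       # assumed eye-tracker sampling rate (Hz)
n_trials = 30
baseline_len, trial_len = 1 * fs, 60 * fs     # assumed 1-s baseline before each 60-s passage
pupil = rng.standard_normal((n_trials, baseline_len + trial_len)).cumsum(axis=1) * 0.01 + 3.0

baselines = pupil[:, :baseline_len].mean(axis=1)
tepr = pupil[:, baseline_len:] - baselines[:, None]   # baseline-corrected trace per trial

# Split trials by baseline pupil size (a proxy for attention mobilization) and compare
# the mean sustained response in the lowest and highest tertiles.
low, high = np.quantile(baselines, [1 / 3, 2 / 3])
for label, mask in [("low baseline", baselines <= low), ("high baseline", baselines >= high)]:
    print(f"{label}: mean TEPR = {tepr[mask].mean():.3f}")
```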
A reliable neural-machine interface is essential for humans to intuitively interact with advanced robotic hands in an unconstrained environment. Existing neural decoding approaches utilize either discrete hand gesture-based pattern recognition or continuous force decoding with one finger at a time. We developed a neural decoding technique that allowed continuous and concurrent prediction of forces of different fingers based on spinal motoneuron firing information. High-density skin-surface electromyogram (HD-EMG) signals of finger extensor muscle were recorded, while human participants produced isometric flexion forces in a dexterous manner (i.e. produced varying forces using either a single finger or multiple fingers concurrently). Motoneuron firing information was extracted from the EMG signals using a blind source separation technique, and each identified neuron was further classified to be associated with a given finger. The forces of individual fingers were then predicted concurrently by utilizing the corresponding motoneuron pool firing frequency of individual fingers. Compared with conventional approaches, our technique led to better prediction performances, i.e. a higher correlation ([Formula: see text] versus [Formula: see text]), a lower prediction error ([Formula: see text]% MVC versus [Formula: see text]% MVC), and a higher accuracy in finger state (rest/active) prediction ([Formula: see text]% versus [Formula: see text]%). Our decoding method demonstrated the possibility of classifying motoneurons for different fingers, which significantly alleviated the cross-talk issue of EMG recordings from neighboring hand muscles, and allowed the decoding of finger forces individually and concurrently. The outcomes offered a robust neural-machine interface that could allow users to intuitively control robotic hands in a dexterous manner.
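A rough sketch of the final decoding step described above: once motor units have been extracted from HD-EMG (blind source separation) and classified to fingers, per-finger force can be estimated from the pooled, smoothed firing rate of each finger's motoneuron population. All signals, labels, and parameters below are synthetic assumptions; the decomposition and classification stages are not shown, and in practice the rate-to-force mapping is fit by regression against measured force.

```python
# Illustrative per-finger pooled motoneuron firing rates from synthetic spike trains.
import numpy as np

rng = np.random.default_rng(4)
fs = 2048                                     # assumed HD-EMG sampling rate (Hz)
n_units, n_samples = 20, 10 * fs
spikes = rng.random((n_units, n_samples)) < 0.005   # binary spike trains, one row per motor unit
finger_of_unit = rng.integers(0, 4, n_units)        # finger label assigned to each unit

def smoothed_rate(pooled_train, win_s=0.2):
    """Pooled firing rate (Hz) smoothed with a moving-average window."""
    win = int(win_s * fs)
    kernel = np.ones(win) / win
    return np.convolve(pooled_train, kernel, mode="same") * fs

# Pool spikes within each finger's motoneuron population and smooth; this rate signal is
# the input that would be mapped to that finger's continuous force estimate.
for finger in range(4):
    pooled = spikes[finger_of_unit == finger].sum(axis=0).astype(float)
    rate = smoothed_rate(pooled)
    print(f"finger {finger}: mean pooled rate = {rate.mean():.1f} Hz")
```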

