Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
Understanding neural function often requires multiple modalities of data, including electrophysiogical data, imaging techniques, and demographic surveys. In this paper, we introduce a novel neurophysiological model to tackle major challenges in modeling multimodal data. First, we avoid non-alignment issues between raw signals and extracted, frequency-domain features by addressing the issue of variable sampling rates. Second, we encode modalities through “cross-attention” with other modalities. Lastly, we utilize properties of our parent transformer architecture to model long-range dependencies between segments across modalities and assess intermediary weights to better understand how source signals affect prediction. We apply our Multimodal Neurophysiological Transformer (MNT) to predict valence and arousal in an existing open-source dataset. Experiments on non-aligned multimodal time-series show that our model performs similarly and, in some cases, outperforms existing methods in classification tasks. In addition, qualitative analysis suggests that MNT is able to model neural influences on autonomic activity in predicting arousal. Our architecture has the potential to be fine-tuned to a variety of downstream tasks, including for BCI systems.Free, publicly-accessible full text available July 11, 2023
Abstract Objective.Reorienting is central to how humans direct attention to different stimuli in their environment. Previous studies typically employ well-controlled paradigms with limited eye and head movements to study the neural and physiological processes underlying attention reorienting. Here, we aim to better understand the relationship between gaze and attention reorienting using a naturalistic virtual reality (VR)-based target detection paradigm. Approach.Subjects were navigated through a city and instructed to count the number of targets that appeared on the street. Subjects performed the task in a fixed condition with no head movement and in a free condition where head movements were allowed. Electroencephalography (EEG), gaze and pupil data were collected. To investigate how neural and physiological reorienting signals are distributed across different gaze events, we used hierarchical discriminant component analysis (HDCA) to identify EEG and pupil-based discriminating components. Mixed-effects general linear models (GLM) were used to determine the correlation between these discriminating components and the different gaze events time. HDCA was also used to combine EEG, pupil and dwell time signals to classify reorienting events. Main results.In both EEG and pupil, dwell time contributes most significantly to the reorienting signals. However, when dwell times were orthogonalized against other gaze events, the distributions of themore »
There is increasing interest in how the pupil dynamics of the eye reflect underlying cognitive processes and brain states. Problematic, however, is that pupil changes can be due to non-cognitive factors, for example luminance changes in the environment, accommodation and movement. In this paper we consider how by modeling the response of the pupil in real-world environments we can capture the non-cognitive related changes and remove these to extract a residual signal which is a better index of cognition and performance. Specifically, we utilize sequence measures such as fixation position, duration, saccades, and blink-related information as inputs to a deep recurrent neural network (RNN) model for predicting subsequent pupil diameter. We build and evaluate the model for a task where subjects are watching educational videos and subsequently asked questions based on the content. Compared to commonly-used models for this task, the RNN had the lowest errors rates in predicting subsequent pupil dilation given sequence data. Most importantly was how the model output related to subjects' cognitive performance as assessed by a post-viewing test. Consistent with our hypothesis that the model captures non-cognitive pupil dynamics, we found (1) the model's root-mean square error was less for lower performing subjects than formore »