Title: fNIRSNET: A multi-view spatio-temporal convolutional neural network fusion for functional near-infrared spectroscopy-based auditory event classification
Multi-view learning is a rapidly evolving research area focused on developing diverse learning representations. In neural data analysis, this approach holds immense potential by capturing spatial, temporal, and frequency features. Despite its promise, the application of multi-view learning to functional near-infrared spectroscopy (fNIRS) has remained largely unexplored. This study addresses this gap by introducing fNIRSNET, a novel framework that generates and fuses multi-view spatio-temporal representations using convolutional neural networks. It investigates the combined informational strength of oxygenated (HbO2) and deoxygenated (HbR) hemoglobin signals and further extends these capabilities by integrating with electroencephalography (EEG) networks to achieve robust multimodal classification. Experiments involved classifying neural responses to auditory stimuli in nine healthy participants. fNIRS signals were decomposed into HbO2/HbR concentration changes, yielding Parallel and Merged input types. We evaluated four input types across three data compositions: balanced, subject, and complete datasets. We compared fNIRSNET's performance with eight baseline classification models and merged it with four common EEG networks to assess the efficacy of combined features for multimodal classification. Compared to the baselines, fNIRSNET using the Merged input type achieved the highest accuracies of 83.22%, 81.18%, and 91.58% for the balanced, subject, and complete datasets, respectively. On the complete set, the approach effectively mitigated class imbalance, achieving a sensitivity of 83.58% and a specificity of 95.42%. Multimodal fusion of EEG networks and fNIRSNET outperformed single-modality performance, with the highest accuracy of 87.15% on balanced data. Overall, this study introduces an innovative fusion approach for decoding fNIRS data and illustrates its integration with established EEG networks to enhance performance.
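A minimal PyTorch sketch of the general idea, not the authors' published architecture: two small spatio-temporal CNN branches encode the HbO2 and HbR views of an fNIRS epoch, and their features are concatenated before classification. All layer sizes, channel counts, and epoch lengths below are illustrative assumptions.

import torch
import torch.nn as nn

class MultiViewFNIRSNet(nn.Module):
    def __init__(self, n_channels=20, n_times=128, n_classes=2):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=(1, 7), padding=(0, 3)),   # temporal view
                nn.BatchNorm2d(8),
                nn.ELU(),
                nn.Conv2d(8, 16, kernel_size=(n_channels, 1)),          # spatial view
                nn.BatchNorm2d(16),
                nn.ELU(),
                nn.AdaptiveAvgPool2d((1, 16)),
                nn.Flatten(),
            )
        self.hbo_branch = branch()   # oxygenated-hemoglobin view
        self.hbr_branch = branch()   # deoxygenated-hemoglobin view
        self.classifier = nn.Linear(2 * 16 * 16, n_classes)

    def forward(self, hbo, hbr):
        # hbo, hbr: (batch, 1, channels, time) concentration-change epochs
        fused = torch.cat([self.hbo_branch(hbo), self.hbr_branch(hbr)], dim=1)
        return self.classifier(fused)

model = MultiViewFNIRSNet()
logits = model(torch.randn(4, 1, 20, 128), torch.randn(4, 1, 20, 128))
print(logits.shape)   # torch.Size([4, 2])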
Award ID(s):
2024418
PAR ID:
10544929
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
Engineering Applications of Artificial Intelligence
Volume:
137
Issue:
PB
ISSN:
0952-1976
Page Range / eLocation ID:
109256
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The potential of combining electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) with topological information about participants is often left unexplored in most brain-computer interface (BCI) systems. Additionally, the joint use of these modalities in multimodal analysis, where multiple brain signals support one another to improve BCI performance, has not been fully examined. This study first presents a multimodal data fusion framework to exploit and decode the complementary, synergistic properties of multimodal neural signals. Moreover, the relations among different subjects and their observations also play critical roles in classifying unknown subjects. We developed a context-aware graph neural network (GNN) model that utilizes the pairwise relationships among participants to investigate performance on an auditory task classification. We explored standard and deviant auditory EEG and fNIRS data in which each subject performed an auditory oddball task and contributed multiple trials, regarded as context-aware nodes in our graph construction. In experiments, our multimodal data fusion strategy showed an improvement of up to 8.40% via SVM and 2.02% via GNN, compared to single-modal EEG or fNIRS. In addition, our context-aware GNN achieved 5.3%, 4.07%, and 4.53% higher accuracy for EEG-, fNIRS-, and multimodal-data-based experiments, respectively, compared to the baseline models.
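A minimal sketch of the context-aware graph idea described in the abstract above (illustrative assumptions, not that paper's exact model): trials are graph nodes, edges connect related trials, and one normalized-adjacency propagation step precedes a linear classifier.

import torch
import torch.nn as nn

class SimpleGCN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, n_classes)

    def forward(self, x, adj):
        # x: (n_nodes, in_dim) trial features; adj: (n_nodes, n_nodes) with self-loops
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        a_norm = adj / deg                      # row-normalized adjacency
        h = torch.relu(self.lin1(a_norm @ x))   # aggregate neighbor context
        return self.lin2(a_norm @ h)

# toy usage: 30 trial nodes with 64-dim fused EEG+fNIRS features, binary labels
x = torch.randn(30, 64)
adj = (torch.rand(30, 30) > 0.8).float()
adj = ((adj + adj.T + torch.eye(30)) > 0).float()
logits = SimpleGCN(64, 32, 2)(x, adj)
print(logits.shape)   # torch.Size([30, 2])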
  2. Applications of multimodal neuroimaging techniques, including electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS), have gained prominence in recent years and are widely practiced in brain–computer interface (BCI) and neuropathological diagnosis applications. Most existing approaches assume observations are independent and identically distributed (i.i.d.) and ignore the differences among subjects. It has been challenging to model subject groups so as to maintain topological information (e.g., patient graphs) while fusing BCI signals for discriminant feature learning. In this article, we introduce a topology-aware graph-based multimodal fusion (TaGMF) framework to classify amyotrophic lateral sclerosis (ALS) and healthy subjects. Our framework is built on graph neural networks (GNNs) with two unique contributions. First, a novel topology-aware graph (TaG) is proposed to model subject groups by considering 1) intersubject, 2) intrasubject, and 3) intergroup relations. Second, the learned representations of each subject's EEG and fNIRS signals allow for exploration of different fusion strategies along with the TaGMF optimizations. Our analysis demonstrates the effectiveness of our graph-based fusion approach in multimodal classification, achieving a 22.6% performance improvement over classical approaches.
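A minimal sketch of how a topology-aware adjacency might be built over subject recordings (illustrative assumptions only, not the TaGMF implementation): intrasubject, intersubject (same group), and intergroup edges receive different weights.

import numpy as np

def topology_aware_adjacency(subject_ids, group_labels,
                             w_intra=1.0, w_inter=0.5, w_group=0.1):
    n = len(subject_ids)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                adj[i, j] = 1.0                   # self-loop
            elif subject_ids[i] == subject_ids[j]:
                adj[i, j] = w_intra               # same subject, different recording
            elif group_labels[i] == group_labels[j]:
                adj[i, j] = w_inter               # different subjects, same group
            else:
                adj[i, j] = w_group               # across groups (e.g., ALS vs. healthy)
    return adj

# toy usage: 3 subjects x 2 recordings, two groups (0 = healthy, 1 = ALS)
subjects = [0, 0, 1, 1, 2, 2]
groups   = [0, 0, 0, 0, 1, 1]
print(topology_aware_adjacency(subjects, groups).round(1))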
  3. Multimodal data fusion is one of the primary current neuroimaging research directions for overcoming the fundamental limitations of individual modalities by exploiting complementary information from different modalities. Electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) are especially compelling modalities due to their potentially complementary features reflecting the electro-hemodynamic characteristics of neural responses. However, current multimodal studies lack a comprehensive, systematic approach to properly merge the complementary features of their multimodal data. Identifying a systematic approach to properly fuse EEG-fNIRS data and exploit their complementary potential is crucial to improving performance. This paper proposes a framework for classifying fused EEG-fNIRS data at the feature level, relying on a mutual information-based feature selection approach that accounts for the complementarity between features. The goal is to optimize the complementarity, redundancy, and relevance of multimodal features with respect to the class labels, i.e., whether a sample belongs to a pathological condition or a healthy control. Nine amyotrophic lateral sclerosis (ALS) patients and nine controls underwent multimodal data recording during a visuo-mental task. Multiple spectral and temporal features were extracted and fed to a feature selection algorithm followed by a classifier, which selected the optimized subset of features through a cross-validation process. The results demonstrated considerably improved hybrid classification performance compared to the individual modalities and to conventional classification without feature selection, suggesting the potential efficacy of our proposed framework for wider neuro-clinical applications.
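A minimal scikit-learn sketch of mutual-information-based feature selection on fused EEG-fNIRS feature vectors (illustrative; the criterion described above also accounts for redundancy and complementarity between features, which is omitted here, and the data below are synthetic).

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(36, 120))      # 36 epochs x 120 fused EEG+fNIRS features (toy data)
y = rng.integers(0, 2, size=36)     # ALS vs. healthy-control labels (toy data)

clf = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_classif, k=20),   # keep the 20 most informative features
    SVC(kernel="linear"),
)
print(cross_val_score(clf, X, y, cv=5).mean())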
  4. In this paper, we propose a deep multimodal fusion network to fuse multiple modalities (face, iris, and fingerprint) for person identification. The proposed deep multimodal fusion algorithm consists of multiple streams of modality-specific Convolutional Neural Networks (CNNs), which are jointly optimized at multiple feature abstraction levels. Multiple features are extracted at several different convolutional layers from each modality-specific CNN for joint feature fusion, optimization, and classification. Features extracted at different convolutional layers of a modality-specific CNN represent the input at several different levels of abstraction. We demonstrate that efficient multimodal classification can be accomplished with a significant reduction in the number of network parameters by exploiting these multi-level abstract representations extracted from all the modality-specific CNNs. We demonstrate an increase in multimodal person identification performance by utilizing the proposed multi-level abstract representations in our multimodal fusion, rather than using only the features from the last layer of each modality-specific CNN. We show that our deep multimodal CNNs with fusion at several different feature abstraction levels can significantly outperform unimodal representation accuracy. We also demonstrate that jointly optimizing all the modality-specific CNNs outperforms score- and decision-level fusion of independently optimized CNNs.
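A minimal sketch of multi-level multimodal fusion (illustrative assumptions, not that paper's architecture): each modality gets its own small CNN, features are tapped at two convolutional depths per modality, pooled, concatenated, and jointly classified so all branches are optimized together.

import torch
import torch.nn as nn

class ModalityCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        f1 = self.block1(x)   # lower-level abstraction
        f2 = self.block2(f1)  # higher-level abstraction
        return torch.cat([self.pool(f1).flatten(1), self.pool(f2).flatten(1)], dim=1)

class MultiLevelFusionNet(nn.Module):
    def __init__(self, n_modalities=3, n_classes=10):
        super().__init__()
        self.branches = nn.ModuleList(ModalityCNN() for _ in range(n_modalities))
        self.classifier = nn.Linear(n_modalities * (8 + 16), n_classes)

    def forward(self, inputs):   # inputs: list of (batch, 1, H, W) tensors, one per modality
        feats = [b(x) for b, x in zip(self.branches, inputs)]
        return self.classifier(torch.cat(feats, dim=1))

# toy usage: face, iris, and fingerprint images as 1x64x64 tensors
imgs = [torch.randn(4, 1, 64, 64) for _ in range(3)]
print(MultiLevelFusionNet()(imgs).shape)   # torch.Size([4, 10])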
  5. Neural networks (NNs) have been adopted by brain-computer interfaces (BCIs) to encode brain signals acquired using electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS). However, NN models have been found to be vulnerable to adversarial examples, i.e., samples corrupted with imperceptible noise. Once attacked, such models could impact medical diagnosis and patients’ quality of life. While early work focused on interference using external devices at the time of signal acquisition, recent research has shifted to collected signals, features, and learning models under various attack modes (e.g., white-, grey-, and black-box). However, existing work only considers single-modality attacks and ignores the topological relationships among different observations, e.g., samples having strong similarities. In contrast to previous approaches, we introduce graph neural networks (GNNs) to multimodal BCI-based classification and explore their performance and robustness against adversarial attacks. This study will evaluate the robustness of NN models with and without graph knowledge on both single and multimodal data.
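A minimal FGSM sketch (one common white-box attack, shown here only to illustrate the kind of imperceptible perturbation discussed above, not that study's attack setup or model).

import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon=0.01):
    """Return an adversarially perturbed copy of x (fast gradient sign method)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    # step in the direction that increases the loss, bounded by epsilon
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

# toy usage: a linear "classifier" over flattened EEG/fNIRS feature vectors
model = nn.Linear(64, 2)
x = torch.randn(8, 64)
y = torch.randint(0, 2, (8,))
x_adv = fgsm_attack(model, x, y)
print((x_adv - x).abs().max())   # perturbation magnitude, at most epsilon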