
Title: Deep Markov Factor Analysis: Towards Concurrent Temporal and Spatial Analysis of fMRI Data
Factor analysis methods have been widely used in neuroimaging to transform high-dimensional imaging data into low-dimensional, ideally interpretable representations. However, most of these methods overlook the highly nonlinear and complex temporal dynamics of neural processes when factorizing imaging data. In this paper, we present deep Markov factor analysis (DMFA), a generative model that imposes the Markov property on a chain of low-dimensional temporal embeddings, together with spatial inductive assumptions, all related through neural networks, to capture temporal dynamics in functional magnetic resonance imaging (fMRI) data and to tackle its high spatial dimensionality, respectively. Augmented with a discrete latent variable, DMFA is able to cluster fMRI data in its low-dimensional temporal embedding with respect to subject and cognitive-state variability, and therefore enables validation of a variety of fMRI-driven neuroscientific hypotheses. Experimental results on both synthetic and real fMRI data demonstrate the capacity of DMFA to reveal interpretable clusters and to capture nonlinear temporal dependencies in these high-dimensional imaging data.
Journal Name:
Neural Information Processing Systems (NeurIPS)
Sponsoring Org:
National Science Foundation
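The generative structure described in the abstract above, a Markov chain of low-dimensional latents decoded to high-dimensional voxel observations through neural networks, can be illustrated with a minimal numpy sketch. The one-layer "networks", dimensions, and noise scales below are hypothetical stand-ins, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_z, d_x = 50, 3, 200  # time steps, latent dim, voxel dim (illustrative sizes)

# Hypothetical one-layer maps standing in for DMFA's learned transition/emission networks
W_trans = 0.9 * np.linalg.qr(rng.standard_normal((d_z, d_z)))[0]  # well-conditioned transition
W_emit = rng.standard_normal((d_x, d_z)) / np.sqrt(d_z)

def transition(z):
    # Mean of z_t given z_{t-1}: nonlinear Markov dynamics on the latent chain
    return np.tanh(W_trans @ z)

def emission(z):
    # Mean of x_t given z_t: decode the low-dimensional latent to voxel space
    return W_emit @ np.tanh(z)

z = rng.standard_normal(d_z)
X = np.empty((T, d_x))
for t in range(T):
    z = transition(z) + 0.1 * rng.standard_normal(d_z)   # Markov property: z_t | z_{t-1}
    X[t] = emission(z) + 0.05 * rng.standard_normal(d_x)
```

In the actual model these maps are learned jointly with a variational posterior; the sketch only shows the factorization of the joint distribution, p(x_{1:T}, z_{1:T}) = prod_t p(x_t | z_t) p(z_t | z_{t-1}).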
More Like this
  1. Understanding the intrinsic patterns of the human brain is important for making inferences about the mind and brain-behavior associations. Electrophysiological methods (i.e., MEG/EEG) provide direct measures of neural activity without the effect of vascular confounds. The blood-oxygen-level-dependent (BOLD) signal of functional MRI (fMRI) reveals spatial and temporal brain activity across different brain regions. However, it is unclear how to associate high-temporal-resolution electrophysiological measures with high-spatial-resolution fMRI signals. Here, we present a novel interpretable model for coupling the structural and functional activity of the brain based on heterogeneous contrastive graph representation. The proposed method is able to link manifest variables of the brain (i.e., MEG, MRI, fMRI, and behavioral performance) and quantify the intrinsic coupling strength of the different modal signals. The method learns heterogeneous node and graph representations by contrasting structural and temporal views of multimodal brain data. The first experiment, with 1200 subjects from the Human Connectome Project (HCP), shows that the proposed method outperforms existing approaches in predicting individual gender and enables localization of the brain regions most important to sex differences. The second experiment associates the structural and temporal views between low-level sensory regions and high-level cognitive ones. The experimental results demonstrate that the dependence between structural and temporal views varies spatially across different modal variants. The proposed method enables heterogeneous biomarker explanations for different brain measurements.
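The contrastive objective referred to above is, in spirit, of the InfoNCE family: matched views of the same node or graph are pulled together while all other pairs act as negatives. A generic numpy sketch follows; the temperature and cosine-similarity choices are assumptions, not the paper's exact loss:

```python
import numpy as np

def info_nce(z1, z2, tau=0.1):
    """Generic contrastive loss: row i of z1 and row i of z2 are two views
    of the same item (positive pair); all other rows serve as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                      # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # -log p(positive | row), averaged

rng = np.random.default_rng(0)
z = rng.standard_normal((32, 16))
loss_aligned = info_nce(z, z + 0.01 * rng.standard_normal((32, 16)))  # views agree
loss_random = info_nce(z, rng.standard_normal((32, 16)))              # views unrelated
```

As expected, the loss is far lower when the two views of each item actually agree than when they are unrelated, which is what drives the representations to encode view-invariant structure.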
  2. Analysis of time-evolving data is crucial to understanding the functioning of dynamic systems such as the brain. For instance, analysis of functional magnetic resonance imaging (fMRI) data collected during a task may reveal spatial regions of interest and how they evolve during the task. However, capturing underlying spatial patterns as well as their change in time is challenging. The traditional approach in fMRI data analysis is to assume that the underlying spatial regions of interest are static. In this article, using fractional amplitude of low-frequency fluctuations (fALFF) as an effective way to summarize the variability in fMRI data collected during a task, we arrange time-evolving fMRI data as a subjects-by-voxels-by-time-windows tensor and analyze the tensor using a factorization-based approach, the PARAFAC2 model, to reveal spatial dynamics. The PARAFAC2 model jointly analyzes data from multiple time windows, revealing subject-mode patterns, evolving spatial regions (also referred to as networks), and temporal patterns. We compare the PARAFAC2 model with matrix factorization-based approaches relying on independent components, namely joint independent component analysis (ICA) and independent vector analysis (IVA), both commonly used in neuroimaging data analysis. We assess the performance of the methods in terms of capturing evolving networks through extensive numerical experiments demonstrating their modeling assumptions. In particular, we show that (i) PARAFAC2 provides a compact representation in all modes, i.e., subjects, time, and voxels, revealing temporal patterns as well as evolving spatial networks; (ii) joint ICA is as effective as PARAFAC2 in terms of revealing evolving networks but does not reveal temporal patterns; (iii) IVA's performance depends on sample size, data distribution, and the covariance structure of the underlying networks, and when these assumptions are satisfied, IVA is as accurate as the other methods; and (iv) when subject-mode patterns differ from one time window to another, IVA is the most accurate. Furthermore, we analyze real fMRI data collected during a sensory motor task and demonstrate that a component indicating a statistically significant group difference between patients with schizophrenia and healthy controls is captured; it includes primary and secondary motor regions, cerebellum, and temporal lobe, revealing a meaningful spatial map and its temporal change.
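As a rough illustration of the tensor-factorization idea, here is a compact alternating-least-squares fit of the classic CP/PARAFAC model to a toy "subjects x voxels x time-windows" tensor. Note this is plain CP, not PARAFAC2, which additionally allows the voxel-mode factors to evolve across time windows; the dimensions and rank below are illustrative:

```python
import numpy as np

def khatri_rao(A, B):
    # Column-wise Kronecker product; row index is (row_of_A * rows_of_B + row_of_B)
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

def cp_als(X, rank, iters=200, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r] by alternating least squares."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
    for _ in range(iters):
        # Each update solves a linear least-squares problem with the other two factors fixed
        A = X.reshape(I, J * K) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.moveaxis(X, 1, 0).reshape(J, I * K) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.moveaxis(X, 2, 0).reshape(K, I * J) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Toy tensor with known rank-2 structure: 10 subjects x 50 voxels x 8 time windows
rng = np.random.default_rng(1)
A0, B0, C0 = rng.standard_normal((10, 2)), rng.standard_normal((50, 2)), rng.standard_normal((8, 2))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, rank=2)
err = np.linalg.norm(X - np.einsum('ir,jr,kr->ijk', A, B, C)) / np.linalg.norm(X)
```

On this noiseless rank-2 tensor, ALS recovers the factors up to scaling and permutation, and the relative reconstruction error drops to near zero; the factor matrices A, B, and C play the roles of the subject-mode, voxel-mode, and time-window-mode patterns discussed above.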
  3. Identifying the directed connectivity that underlies networked activity between different cortical areas is critical for understanding the neural mechanisms behind sensory processing. Granger causality (GC) is widely used for this purpose in functional magnetic resonance imaging analysis, but its temporal resolution is low, making it difficult to capture the millisecond-scale interactions underlying sensory processing. Magnetoencephalography (MEG) has millisecond resolution but only provides low-dimensional, sensor-level linear mixtures of neural sources, which makes GC inference challenging. Conventional methods proceed in two stages: first, cortical sources are estimated from MEG using a source localization technique, followed by GC inference among the estimated sources. However, the spatiotemporal biases in estimating sources propagate into the subsequent GC analysis stage and may result in both false alarms and missed true GC links. Here, we introduce the Network Localized Granger Causality (NLGC) inference paradigm, which models the source dynamics as latent sparse multivariate autoregressive processes, estimates their parameters directly from the MEG measurements, integrated with source localization, and employs the resulting parameter estimates to produce a precise statistical characterization of the detected GC links. We offer several theoretical and algorithmic innovations within NLGC and further examine its utility via comprehensive simulations and application to MEG data from an auditory task involving tone processing in both younger and older participants. Our simulation studies reveal that NLGC is markedly robust with respect to model mismatch, network size, and low signal-to-noise ratio, whereas the conventional two-stage methods result in high false-alarm and mis-detection rates. We also demonstrate the advantages of NLGC in revealing the cortical network-level characterization of neural activity during tone processing and resting state by delineating task- and age-related connectivity changes.
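For readers unfamiliar with GC, the core question is: does including the past of signal x reduce the error of predicting y beyond what y's own past achieves? A toy bivariate AR(1) example makes this concrete (the coupling coefficients and lag order below are illustrative, and this plain two-stage regression is exactly the kind of baseline NLGC improves on, not NLGC itself):

```python
import numpy as np

# Simulate two coupled AR(1) processes where x drives y but not vice versa
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()  # x -> y coupling

def gc(source, target):
    """Granger causality as a log variance ratio: restricted model
    (target's own past only) vs. full model (own past + source's past)."""
    Y = target[1:]
    own = target[:-1][:, None]
    full = np.column_stack([target[:-1], source[:-1]])
    def resid_var(Z):
        beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
        return np.var(Y - Z @ beta)
    return np.log(resid_var(own) / resid_var(full))

gc_x_to_y = gc(x, y)  # large: x's past helps predict y
gc_y_to_x = gc(y, x)  # near zero: y's past adds nothing for x
```

The asymmetry of the two statistics recovers the simulated direction of influence; the biases discussed in the abstract arise when this regression is run on imperfectly localized sources rather than the true signals.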
  4. Large-scale brain dynamics are constrained by the heterogeneity of the intrinsic anatomical substrate, yet little is known about how spatiotemporal dynamics adapt to heterogeneous structural connectivity (SC). Modern neuroimaging modalities make it possible to study intrinsic brain activity on the scale of seconds to minutes. Diffusion magnetic resonance imaging (dMRI) and functional MRI reveal the large-scale SC across different brain regions. Electrophysiological methods (i.e., MEG/EEG) provide direct measures of neural activity and exhibit complex neurobiological temporal dynamics that cannot be resolved by fMRI. However, most existing multimodal analytical methods collapse the brain measurements in either the space or the time domain and fail to capture spatio-temporal circuit dynamics. In this paper, we propose a novel spatio-temporal graph Transformer model to integrate structural and functional connectivity in both the spatial and temporal domains. The proposed method learns heterogeneous node and graph representations via contrastive learning and a multi-head-attention-based graph Transformer using multimodal brain data (i.e., fMRI, MRI, MEG, and behavioral performance). The proposed contrastive graph Transformer representation model incorporates a heterogeneity map constrained by the T1-to-T2-weighted (T1w/T2w) ratio to improve the model's fit to structure-function interactions. Experimental results with multimodal resting-state brain measurements demonstrate that the proposed method can highlight the local properties of large-scale brain spatio-temporal dynamics and capture the dependence strength between functional connectivity and behaviors. In summary, the proposed method enables the explanation of complex brain dynamics across different modal variants.
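The core operation of a graph Transformer like the one described above is self-attention restricted to graph edges: each node attends only to its structural neighbors. A minimal numpy sketch follows; the masking scheme, head count, and random weights are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def masked_multihead_attention(H, adj, n_heads=2, seed=0):
    """One multi-head self-attention layer over node features H (N x d),
    where node i attends only to its neighbors in the adjacency matrix adj."""
    N, d = H.shape
    dh = d // n_heads
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((n_heads, d, dh)) / np.sqrt(d) for _ in range(3))
    out = []
    for h in range(n_heads):
        Q, K, V = H @ Wq[h], H @ Wk[h], H @ Wv[h]
        scores = Q @ K.T / np.sqrt(dh)
        scores = np.where(adj > 0, scores, -np.inf)   # attend along graph edges only
        scores -= scores.max(axis=1, keepdims=True)   # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)       # rows sum to 1 over neighbors
        out.append(attn @ V)
    return np.concatenate(out, axis=1)

# Toy graph: 6 brain regions on a ring, with self-loops so every row attends somewhere
rng = np.random.default_rng(1)
H = rng.standard_normal((6, 8))
adj = np.eye(6) + np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)
Y = masked_multihead_attention(H, adj)
```

In the full model this layer is stacked, made trainable, and combined with the contrastive objective and the T1w/T2w heterogeneity constraint; the sketch shows only the attention mechanism itself.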
  5. Recurrent neural networks (RNNs) are a widely used tool for modeling sequential data, yet they are often treated as inscrutable black boxes. Given a trained recurrent network, we would like to reverse engineer it: to obtain a quantitative, interpretable description of how it solves a particular task. Even for simple tasks, a detailed understanding of how recurrent networks work, or a prescription for how to develop such an understanding, remains elusive. In this work, we use tools from dynamical systems analysis to reverse engineer recurrent networks trained to perform sentiment classification, a foundational natural language processing task. Given a trained network, we find fixed points of the recurrent dynamics and linearize the nonlinear system around these fixed points. Despite their theoretical capacity to implement complex, high-dimensional computations, we find that trained networks converge to highly interpretable, low-dimensional representations. In particular, the topological structure of the fixed points and the corresponding linearized dynamics reveal an approximate line attractor within the RNN, which we can use to quantitatively understand how the RNN solves the sentiment analysis task. Finally, we find this mechanism present across RNN architectures (including LSTMs, GRUs, and vanilla RNNs) trained on multiple datasets, suggesting that our findings are not unique to a particular architecture or dataset. Overall, these results demonstrate that surprisingly universal and human-interpretable computations can arise across a range of recurrent networks.
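The find-fixed-points-then-linearize recipe can be sketched for a vanilla RNN update h* = tanh(W h* + b). The weights below are random rather than trained on sentiment data, and plain fixed-point iteration (which works because small weights make the map contractive) stands in for the optimization-based search used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
W = 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)  # small weights -> contractive map
b = 0.1 * rng.standard_normal(n)

# Step 1: find a fixed point h* = tanh(W h* + b) by iterating the update map
h = np.zeros(n)
for _ in range(500):
    h = np.tanh(W @ h + b)
residual = np.linalg.norm(h - np.tanh(W @ h + b))

# Step 2: linearize around h*. The Jacobian of the update governs local dynamics;
# eigenvalues with |lambda| < 1 mean the fixed point is locally stable, and a
# single eigenvalue near 1 in a trained network signals a slow "line attractor" mode.
pre = W @ h + b
J = (1 - np.tanh(pre) ** 2)[:, None] * W   # diag(tanh'(pre)) @ W
eig = np.linalg.eigvals(J)
```

For this contractive toy network all eigenvalues are well inside the unit circle; the interesting structure reported in the abstract, an approximate line attractor, appears when training pushes one linearized mode close to unity so that hidden state can accumulate sentiment evidence along it.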