Title: A Biologically Interpretable Graph Convolutional Network to Link Genetic Risk Pathways and Imaging Phenotypes of Disease
We propose a novel end-to-end framework for whole-brain and whole-genome imaging-genetics. Our genetics network uses hierarchical graph convolution and pooling operations to embed subject-level data onto a low-dimensional latent space. The hierarchical network implicitly tracks the convergence of genetic risk across well-established biological pathways, while an attention mechanism automatically identifies the salient edges of this network at the subject level. In parallel, our imaging network projects multimodal data onto a set of latent embeddings. For interpretability, we implement a Bayesian feature selection strategy to extract the discriminative imaging biomarkers; these feature weights are optimized alongside the other model parameters. We couple the imaging and genetic embeddings with a predictor network to ensure that the learned representations are linked to phenotype. We evaluate our framework on a schizophrenia dataset that includes two functional MRI paradigms and gene scores derived from Single Nucleotide Polymorphism data. Using repeated 10-fold cross-validation, we show that our imaging-genetics fusion achieves better classification performance than state-of-the-art baselines. In an exploratory analysis, we further show that the biomarkers identified by our model are reproducible and closely associated with deficits in schizophrenia.
Award ID(s):
1822575
NSF-PAR ID:
10379197
Journal Name:
International Conference on Learning Representations
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose a joint dictionary learning framework that couples imaging and genetics data in a low-dimensional subspace as guided by clinical diagnosis. We use a graph regularization penalty to simultaneously capture inter-regional brain interactions and identify the representative set of anatomical basis vectors that span the low-dimensional space. We further employ group sparsity to find the representative set of genetic basis vectors that span the same latent space. Finally, the latent projection is used to classify patients versus controls. We have evaluated our model on two task fMRI paradigms and single nucleotide polymorphism (SNP) data from schizophrenic patients and matched neurotypical controls. We employ a ten-fold cross-validation technique to show the predictive power of our model. We compare our model with canonical correlation analysis of imaging and genetics data and random forest classification. Our approach shows better prediction accuracy on both task datasets. Moreover, the implicated brain regions and genetic variants underlie the well-documented deficits in schizophrenia.
  2. Abstract

    Functional magnetic resonance imaging (fMRI) studies have shown altered brain dynamic functional connectivity (DFC) in mental disorders. Here, we aim to explore DFC across a spectrum of symptomatically‐related disorders including bipolar disorder with psychosis (BPP), schizoaffective disorder (SAD), and schizophrenia (SZ). We introduce a group information guided independent component analysis procedure to estimate both group‐level and subject‐specific connectivity states from DFC. Using resting‐state fMRI data of 238 healthy controls (HCs), 140 BPP, 132 SAD, and 113 SZ patients, we identified measures differentiating groups from the whole‐brain DFC and traditional static functional connectivity (SFC), separately. Results show that DFC provided more informative measures than SFC. Diagnosis‐related connectivity states were evident using DFC analysis. For the dominant state consistent across groups, we found 22 instances of hypoconnectivity (with decreasing trends from HC to BPP to SAD to SZ) mainly involving post‐central, frontal, and cerebellar cortices as well as 34 examples of hyperconnectivity (with increasing trends HC through SZ) primarily involving thalamus and temporal cortices. Hypoconnectivities/hyperconnectivities also showed negative/positive correlations, respectively, with clinical symptom scores. Specifically, hypoconnectivities linking postcentral and frontal gyri were significantly negatively correlated with the PANSS positive/negative scores. For frontal connectivities, BPP resembled HC while SAD and SZ were more similar. Three connectivities involving the left cerebellar crus differentiated SZ from other groups and one connection linking frontal and fusiform cortices showed a SAD‐unique change. In summary, our method is promising for assessing DFC and may yield imaging biomarkers for quantifying the dimension of psychosis. Hum Brain Mapp 38:2683–2708, 2017. © 2017 Wiley Periodicals, Inc.

  3. Abstract

    Background

    In Alzheimer’s Disease (AD) research, multimodal imaging analysis can unveil complementary information from multiple imaging modalities and further our understanding of the disease. One application is to discover disease subtypes using unsupervised clustering. However, existing clustering methods are often applied to input features directly, and could suffer from the curse of dimensionality with high-dimensional multimodal data. The purpose of our study is to identify multimodal imaging-driven subtypes in Mild Cognitive Impairment (MCI) participants using a multiview learning framework based on Deep Generalized Canonical Correlation Analysis (DGCCA), which learns a shared low-dimensional latent representation from 3 neuroimaging modalities.

    Results

    DGCCA applies non-linear transformations to input views using neural networks and is able to learn correlated low-dimensional embeddings that capture more variance than its linear counterpart, generalized CCA (GCCA). We designed experiments to compare DGCCA embeddings with single-modality features and GCCA embeddings by generating 2 subtypes from each feature set using unsupervised clustering. In our validation studies, we found that amyloid PET imaging has the most discriminative features compared with structural MRI and FDG PET, and that DGCCA, unlike GCCA, learns from these features. DGCCA subtypes show differential measures in 5 cognitive assessments, 6 brain volume measures, and patterns of conversion to AD. In addition, DGCCA MCI subtypes confirmed AD genetic markers with strong signals that the existing late MCI group did not identify.

    Conclusion

    Overall, DGCCA is able to learn effective low-dimensional embeddings from multimodal data by learning non-linear projections. MCI subtypes generated from DGCCA embeddings differ from the existing early and late MCI groups and are most similar to those identified by amyloid PET features. In our validation studies, DGCCA subtypes show distinct patterns in cognitive measures and brain volumes, and are able to identify AD genetic markers. These findings indicate the promise of the imaging-driven subtypes and their power in revealing disease structures beyond early and late stage MCI.

  4. Abstract

    There is growing evidence that, rather than using a single brain imaging modality to study its association with physiological or symptomatic features, the field is paying more attention to fusion of multimodal information. However, most current multimodal fusion approaches that incorporate functional magnetic resonance imaging (fMRI) are restricted to second‐level 3D features, rather than the original 4D fMRI data. The trade‐off is that valuable temporal information is not utilized during the fusion step. Here we propose a novel approach called “parallel group ICA+ICA” that incorporates temporal fMRI information from group independent component analysis (GICA) into a parallel independent component analysis (ICA) framework, aiming to enable direct fusion of first‐level fMRI features with other modalities (e.g., structural MRI), which thus can detect linked functional network variability and structural covariations. Simulation results show that the proposed method yields accurate intermodality linkage detection regardless of whether the linkage is strong or weak. When applied to real data, we identified one pair of significantly associated fMRI‐sMRI components that show group differences between schizophrenia and controls in both modalities, and this linkage can be replicated in an independent cohort. Finally, multiple cognitive domain scores can be predicted by the features identified in the linked component pair by our proposed method. We also show these multimodal brain features can predict multiple cognitive scores in an independent cohort. Overall, results demonstrate the ability of parallel GICA+ICA to estimate joint information from 4D and 3D data without discarding much of the available information up front, and the potential for using this approach to identify imaging biomarkers to study brain disorders.

  5. Abstract

    Resting‐state functional network connectivity (rsFNC) has shown utility for identifying characteristic functional brain patterns in individuals with psychiatric and mood disorders, providing a promising avenue for biomarker development. However, several factors have precluded widespread clinical adoption of rsFNC diagnostics, namely a lack of standardized approaches for capturing comparable and reproducible imaging markers across individuals, as well as the disagreement on the amount of data required to robustly detect intrinsic connectivity networks (ICNs) and diagnostically relevant patterns of rsFNC at the individual subject level. Recently, spatially constrained independent component analysis (scICA) has been proposed as an automated method for extracting ICNs standardized to a chosen network template while still preserving individual variation. Leveraging the scICA methodology, which solves the former challenge of standardized neuroimaging markers, we investigate the latter challenge of identifying a minimally sufficient data length for clinical applications of resting‐state fMRI (rsfMRI). Using a dataset containing rsfMRI scans of individuals with schizophrenia and controls (M = 310) as well as simulated rsfMRI, we evaluated the robustness of ICN and rsFNC estimates at both the subject‐ and group‐level, as well as the performance of diagnostic classification, with respect to the length of the rsfMRI time course. We found individual estimates of ICNs and rsFNC from the full‐length (5 min) reference time course were sufficiently approximated with just 3–3.5 min of data (r = 0.85, 0.88, respectively), and significant differences in group‐average rsFNC could be sufficiently approximated with even less data, just 2 min (r = 0.86). These results from the shorter clinical data were largely consistent with the results from validation experiments using longer time series from both simulated (30 min) and real‐world (14 min) datasets, in which estimates of subject‐level FNC were reliably estimated with 3–5 min of data. Moreover, in the real‐world data we found rsFNC and ICN estimates generated across the full range of data lengths (0.5–14 min) more reliably matched those generated from the first 5 min of scan time than those generated from the last 5 min, suggesting increased influence of “late scan” noise factors such as fatigue or drowsiness may limit the reliability of FNC from data collected after 10+ min of scan time, further supporting the notion of shorter scans. Lastly, a diagnostic classification model trained on just 2 min of data retained 97%–98% classification accuracy relative to that of the full‐length reference model. Our results suggest that, when decomposed with scICA, rsfMRI scans of just 2–5 min show good clinical utility without significant loss of individual FNC information of longer scan lengths.
