skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 10:00 PM ET on Friday, December 8 until 2:00 AM ET on Saturday, December 9 due to maintenance. We apologize for the inconvenience.

Title: Task Balanced Multimodal Feature Selection to Predict the Progression of Alzheimer’s Disease
The social and financial costs associated with Alzheimer's disease (AD) result in significant burdens on our society. In order to understand the causes of this disease, public-private partnerships such as the Alzheimer's Disease Neuroimaging Initiative (ADNI) release data into the scientific community. These data are organized into various modalities (genetic, brain-imaging, cognitive scores, diagnoses, etc.) for analysis. Many statistical learning approaches used in medical image analysis do not explicitly take advantage of this multimodal data structure. In this work we propose a novel objective function and optimization algorithm that is designed to handle multimodal information for the prediction and analysis of AD. Our approach relies on robust matrix-factorization and row-wise sparsity provided by the ℓ2,1- norm in order to integrate multimodal data provided by the ADNI. These techniques are jointly optimized with a classification task to guide the feature selection in our proposed Task Balanced Multimodal Feature Selection method. Our results, when compared against some widely used machine learning algorithms, show improved balanced accuracies, precision, and Matthew's correlation coefficients for identifying cognitive decline. In addition to the improved prediction performance, our method is able to identify brain and genetic biomarkers that are of interest to the clinical research community. Our experiments validate existing brain biomarkers and single nucleotide polymorphisms located on chromosome 11 and detail novel polymorphisms on chromosome 10 that, to the best of the authors' knowledge, have not previously been reported. We anticipate that our method will be of interest to the greater research community and have released our method's code online.11Code is provided at:  more » « less
Award ID(s):
1849359 1652943 1932482 2029543
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2020 IEEE 20th International Conference on Bioinformatics and Bioengineering (BIBE)
Page Range / eLocation ID:
196 to 203
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: Machine learning is a promising tool for biomarker-based diagnosis of Alzheimer’s disease (AD). Performing multimodal feature selection and studying the interaction between biological and clinical AD can help to improve the performance of the diagnosis models. Objective: This study aims to formulate a feature ranking metric based on the mutual information index to assess the relevance and redundancy of regional biomarkers and improve the AD classification accuracy. Methods: From the Alzheimer’s Disease Neuroimaging Initiative (ADNI), 722 participants with three modalities, including florbetapir-PET, flortaucipir-PET, and MRI, were studied. The multivariate mutual information metric was utilized to capture the redundancy and complementarity of the predictors and develop a feature ranking approach. This was followed by evaluating the capability of single-modal and multimodal biomarkers in predicting the cognitive stage. Results: Although amyloid-β deposition is an earlier event in the disease trajectory, tau PET with feature selection yielded a higher early-stage classification F1-score (65.4%) compared to amyloid-β PET (63.3%) and MRI (63.2%). The SVC multimodal scenario with feature selection improved the F1-score to 70.0% and 71.8% for the early and late-stage, respectively. When age and risk factors were included, the scores improved by 2 to 4%. The Amyloid-Tau-Neurodegeneration [AT(N)] framework helped to interpret the classification results for different biomarker categories. Conclusion: The results underscore the utility of a novel feature selection approach to reduce the dimensionality of multimodal datasets and enhance model performance. The AT(N) biomarker framework can help to explore the misclassified cases by revealing the relationship between neuropathological biomarkers and cognition. 
    more » « less
  2. Alzheimer's Disease (AD) is a chronic neurodegenerative disease that severely impacts patients' thinking, memory and behavior. To aid automatic AD diagnoses, many longitudinal learning models have been proposed to predict clinical outcomes and/or disease status, which, though, often fail to consider missing temporal phenotypic records of the patients that can convey valuable information of AD progressions. Another challenge in AD studies is how to integrate heterogeneous genotypic and phenotypic biomarkers to improve diagnosis prediction. To cope with these challenges, in this paper we propose a longitudinal multi-modal method to learn enriched genotypic and phenotypic biomarker representations in the format of fixed-length vectors that can simultaneously capture the baseline neuroimaging measurements of the entire dataset and progressive variations of the varied counts of follow-up measurements over time of every participant from different biomarker sources. The learned global and local projections are aligned by a soft constraint and the structured-sparsity norm is used to uncover the multi-modal structure of heterogeneous biomarker measurements. While the proposed objective is clearly motivated to characterize the progressive information of AD developments, it is a nonsmooth objective that is difficult to efficiently optimize in general. Thus, we derive an efficient iterative algorithm, whose convergence is rigorously guaranteed in mathematics. We have conducted extensive experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) data using one genotypic and two phenotypic biomarkers. Empirical results have demonstrated that the learned enriched biomarker representations are more effective in predicting the outcomes of various cognitive assessments. Moreover, our model has successfully identified disease-relevant biomarkers supported by existing medical findings that additionally warrant the correctness of our method from the clinical perspective. 
    more » « less
  3. With rapid progress in high-throughput genotyping and neuroimaging, researches of complex brain disorders, such as Alzheimer’s Disease (AD), have gained significant attention in recent years. Many prediction models have been studied to relate neuroimaging measures to cognitive status over the progressions when these disease develops. Missing data is one of the biggest challenge in accurate cognitive score prediction of subjects in longitudinal neuroimaging studies. To tackle this problem, in this paper we propose a novel formulation to learn an enriched representation for imaging biomarkers that can simultaneously capture both the information conveyed by baseline neuroimaging records and that by progressive variations of varied counts of available follow-up records over time. While the numbers of the brain scans of the participants vary, the learned biomarker representation for every participant is a fixed-length vector, which enable us to use traditional learning models to study AD developments. Our new objective is formulated to maximize the ratio of the summations of a number of L1-norm distances for improved robustness, which, though, is difficult to efficiently solve in general. Thus we derive a new efficient iterative solution algorithm and rigorously prove its convergence. We have performed extensive experiments on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. A performance gain has been achieved to predict four different cognitive scores, when we compare the original baseline representations against the learned representations with enrichments. These promising empirical results have demonstrated improved performances of our new method that validate its effectiveness. 
    more » « less
  4. Abstract

    In this article, we focus on estimating the joint relationship between structural magnetic resonance imaging (sMRI) gray matter (GM), and multiple functional MRI (fMRI) intrinsic connectivity networks (ICNs). To achieve this, we propose a multilink joint independent component analysis (ml‐jICA) method using the same core algorithm as jICA. To relax the jICA assumption, we propose another extension called parallel multilink jICA (pml‐jICA) that allows for a more balanced weight distribution over ml‐jICA/jICA. We assume a shared mixing matrix for both the sMRI and fMRI modalities, while allowing for different mixing matrices linking the sMRI data to the different ICNs. We introduce the model and then apply this approach to study the differences in resting fMRI and sMRI data from patients with Alzheimer's disease (AD) versus controls. The results of the pml‐jICA yield significant differences with large effect sizes that include regions in overlapping portions of default mode network, and also hippocampus and thalamus. Importantly, we identify two joint components with partially overlapping regions which show opposite effects for AD versus controls, but were able to be separated due to being linked to distinct functional and structural patterns. This highlights the unique strength of our approach and multimodal fusion approaches generally in revealing potentially biomarkers of brain disorders that would likely be missed by a unimodal approach. These results represent the first work linking multiple fMRI ICNs to GM components within a multimodal data fusion model and challenges the typical view that brain structure is more sensitive to AD than fMRI.

    more » « less
  5. null (Ed.)
    Alzheimer’s Disease (AD) is a chronic neurodegenerative disease that causes severe problems in patients’ thinking, memory, and behavior. An early diagnosis is crucial to prevent AD progression; to this end, many algorithmic approaches have recently been proposed to predict cognitive decline. However, these predictive models often fail to integrate heterogeneous genetic and neuroimaging biomarkers and struggle to handle missing data. In this work we propose a novel objective function and an associated optimization algorithm to identify cognitive decline related to AD. Our approach is designed to incorporate dynamic neuroimaging data by way of a participant-specific augmentation combined with multimodal data integration aligned via a regression task. Our approach, in order to incorporate additional side-information, utilizes structured regularization techniques popularized in recent AD literature. Armed with the fixed-length vector representation learned from the multimodal dynamic and static modalities, conventional machine learning methods can be used to predict the clinical outcomes associated with AD. Our experimental results show that the proposed augmentation model improves the prediction performance on cognitive assessment scores for a collection of popular machine learning algorithms. The results of our approach are interpreted to validate existing genetic and neuroimaging biomarkers that have been shown to be predictive of cognitive decline. 
    more » « less