Title: Integrating Static and Dynamic Data for Improved Prediction of Cognitive Declines Using Augmented Genotype-Phenotype Representations
Alzheimer’s Disease (AD) is a chronic neurodegenerative disease that causes severe problems in patients’ thinking, memory, and behavior. An early diagnosis is crucial to prevent AD progression; to this end, many algorithmic approaches have recently been proposed to predict cognitive decline. However, these predictive models often fail to integrate heterogeneous genetic and neuroimaging biomarkers and struggle to handle missing data. In this work we propose a novel objective function and an associated optimization algorithm to identify cognitive decline related to AD. Our approach incorporates dynamic neuroimaging data through a participant-specific augmentation, combined with multimodal data integration aligned via a regression task. To incorporate additional side information, our approach uses structured regularization techniques popularized in recent AD literature. Armed with the fixed-length vector representation learned from the dynamic and static modalities, conventional machine learning methods can be used to predict the clinical outcomes associated with AD. Our experimental results show that the proposed augmentation model improves the prediction performance on cognitive assessment scores for a collection of popular machine learning algorithms. We interpret the results of our approach to validate existing genetic and neuroimaging biomarkers that have been shown to be predictive of cognitive decline.
Award ID(s):
1652943 1849359 1932482 2029543
NSF-PAR ID:
10294506
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
1
ISSN:
2374-3468
Page Range / eLocation ID:
522-530
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The social and financial costs associated with Alzheimer's disease (AD) place significant burdens on our society. In order to understand the causes of this disease, public-private partnerships such as the Alzheimer's Disease Neuroimaging Initiative (ADNI) release data to the scientific community. These data are organized into various modalities (genetic, brain-imaging, cognitive scores, diagnoses, etc.) for analysis. Many statistical learning approaches used in medical image analysis do not explicitly take advantage of this multimodal data structure. In this work we propose a novel objective function and optimization algorithm designed to handle multimodal information for the prediction and analysis of AD. Our approach relies on robust matrix factorization and the row-wise sparsity provided by the ℓ2,1-norm to integrate the multimodal data provided by the ADNI. These techniques are jointly optimized with a classification task to guide the feature selection in our proposed Task Balanced Multimodal Feature Selection method. Our results, when compared against some widely used machine learning algorithms, show improved balanced accuracies, precision, and Matthews correlation coefficients for identifying cognitive decline. In addition to the improved prediction performance, our method is able to identify brain and genetic biomarkers that are of interest to the clinical research community. Our experiments validate existing brain biomarkers and single nucleotide polymorphisms located on chromosome 11 and detail novel polymorphisms on chromosome 10 that, to the best of the authors' knowledge, have not previously been reported. We anticipate that our method will be of interest to the greater research community and have released our method's code online. Code is provided at: https://github.com/minds-mines/TBMFSjl
  2. Alzheimer's disease (AD) is a serious neurodegenerative condition that affects millions of people across the world. Recently, machine learning models have been used to predict the progression of AD, although they frequently do not take advantage of the longitudinal and structural components associated with multi-modal medical data. To address this, we present a new algorithm that uses the multi-block alternating direction method of multipliers to optimize a novel objective that combines longitudinal clinical data from multiple modalities to simultaneously predict the cognitive scores and diagnoses of the participants in the Alzheimer's Disease Neuroimaging Initiative cohort. Our new model is designed to leverage the structure associated with clinical data that is not incorporated into standard machine learning optimization algorithms. This new approach shows state-of-the-art predictive performance and validates a collection of brain and genetic biomarkers that have been reported previously in AD literature.
  3. Alzheimer's Disease (AD) is a progressive memory disorder that causes irreversible cognitive decline. Recently, many statistical learning methods have been presented to predict cognitive decline using longitudinal imaging data. However, missing records, which are widespread in longitudinal neuroimaging data, pose a critical challenge for effectively using these data in machine learning models. To tackle this difficulty, in this paper we propose a novel approach that integrates longitudinal (dynamic) phenotypic data and static genetic data to learn a fixed-length biomarker representation using the enrichment learned from the temporal data in multiple imaging modalities. Armed with this enriched biomarker representation, as a fixed-length vector per participant, conventional machine learning models can be used to predict clinical outcomes associated with AD. We have applied our new method to the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort and achieved promising experimental results that validate its effectiveness.
  4. In the Alzheimer’s disease (AD) continuum, the prodromal state of mild cognitive impairment (MCI) precedes AD dementia and identifying MCI individuals at risk of progression is important for clinical management. Our goal was to develop generalizable multivariate models that integrate high-dimensional data (multimodal neuroimaging and cerebrospinal fluid biomarkers, genetic factors, and measures of cognitive resilience) for identification of MCI individuals who progress to AD within 3 years. Our main findings were i) we were able to build generalizable models with clinically relevant accuracy (~93%) for identifying MCI individuals who progress to AD within 3 years; ii) markers of AD pathophysiology (amyloid, tau, neuronal injury) accounted for large shares of the variance in predicting progression; iii) our methodology allowed us to discover that expression of CR1 (complement receptor 1), an AD susceptibility gene involved in immune pathways, uniquely added independent predictive value. This work highlights the value of optimized machine learning approaches for analyzing multimodal patient information for making predictive assessments.
  5. Alzheimer's Disease (AD) is a chronic neurodegenerative disease that severely impacts patients' thinking, memory and behavior. To aid automatic AD diagnoses, many longitudinal learning models have been proposed to predict clinical outcomes and/or disease status. However, these models often fail to consider missing temporal phenotypic records of the patients, which can convey valuable information about AD progression. Another challenge in AD studies is how to integrate heterogeneous genotypic and phenotypic biomarkers to improve diagnosis prediction. To cope with these challenges, in this paper we propose a longitudinal multi-modal method to learn enriched genotypic and phenotypic biomarker representations in the form of fixed-length vectors. These representations simultaneously capture the baseline neuroimaging measurements of the entire dataset and, for every participant, the progressive variations across a varying number of follow-up measurements from different biomarker sources. The learned global and local projections are aligned by a soft constraint, and the structured-sparsity norm is used to uncover the multi-modal structure of heterogeneous biomarker measurements. While the proposed objective is clearly motivated to characterize the progressive information of AD development, it is nonsmooth and therefore difficult to optimize efficiently in general. Thus, we derive an efficient iterative algorithm whose convergence is rigorously guaranteed. We have conducted extensive experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) data using one genotypic and two phenotypic biomarkers. Empirical results demonstrate that the learned enriched biomarker representations are more effective in predicting the outcomes of various cognitive assessments. Moreover, our model has successfully identified disease-relevant biomarkers supported by existing medical findings, which further supports the correctness of our method from the clinical perspective.
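
Several of the abstracts above rely on the row-wise sparsity induced by the ℓ2,1-norm. As a rough illustration only, the sketch below computes the standard ℓ2,1-norm of a matrix (the sum of the Euclidean norms of its rows) and applies row-wise soft thresholding, a common proximal step for this norm. It is not taken from any of the listed papers or their released code; the function names `l21_norm` and `prox_l21` and the threshold `tau` are hypothetical names chosen for this example.

```python
import math


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (a list of lists)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in W)


def prox_l21(W, tau):
    """Row-wise soft thresholding (illustrative, not the papers' algorithm).

    Each row is shrunk toward zero by tau in Euclidean norm; rows whose
    norm is at most tau become exactly zero, which is how the l2,1 penalty
    discards whole features (rows) at once.
    """
    out = []
    for row in W:
        norm = math.sqrt(sum(x * x for x in row))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * x for x in row])
    return out


if __name__ == "__main__":
    # Hypothetical 3-feature, 2-task weight matrix.
    W = [[3.0, 4.0], [0.0, 0.0], [0.1, 0.0]]
    print(l21_norm(W))
    print(prox_l21(W, 1.0))
```

In this toy example the first row has a large norm and is only shrunk, while the weak third row is zeroed out entirely, which is the sense in which the penalty selects features jointly across all columns.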