skip to main content

Title: Empowering Variational Inference with Predictive Features: Application to Disease Suptypying
Probabilistic topic models, have been widely deployed for various applications such as learning disease or tissue subtypes. Yet, learning the parameters of such models is usually an ill-posed problem and may result in losing valuable information about disease severity. A common approach is to add a discriminative loss term to the generative model’s loss in order to learn a representation that is also predictive of disease severity. However, finding a balance between these two losses is not straightforward. We propose an alternative way in this paper. We develop a framework which allows for incorporating external covariates into the generative model’s approximate posterior. These covariates can have more discriminative power for disease severity compared to the representation that we extract from the posterior distribution. For instance, they can be features extracted from a neural network which predicts disease severity from CT images. Effectively, we enforce the generative model’s approximate posterior to reside in the subspace of these discriminative covariates. We illustrate our method’s application on a large-scale lung CT study of Chronic Obstructive Pulmonary Disease (COPD), a highly heterogeneous disease. We aim at identifying tissue subtypes by using a variant of topic model as a generative model. We quantitatively evaluate the predictive performance of the inferred subtypes and demonstrate that our method outperforms or performs on par with some reasonable baselines. We also show that some of the discovered subtypes are correlated with genetic measurements, suggesting that the identified subtypes may characterize the disease’s underlying etiology.  more » « less
Award ID(s):
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this study, we introduce a novel representation of patient data called Disease Severity Hierarchy (DSH) that explores specific diseases and their known treatment pathways in a nested fashion to create subpopulations in a clinically meaningful way. As the DSH tree is traversed from the root towards the leaves, we encounter subpopulations that share increasing richer amounts of clinical details such as similar disease severity, illness trajectories, and time to event that are discriminative, and suitable for learning risk stratification models. The proposed DSH risk scores effectively and accurately predict the age at which a patient may be at risk of dying or developing MCE significantly better than a traditional representation of disease conditions. DSH utilizes known relationships among various entities in EHR data to capture disease severity in a natural way and has the additional benefit of being expressive and interpretable. This novel patient representation can help support critical decision making, development of smart EBP guidelines, and enhance healthcare care and disease management by helping to identify and reduce disease burden among high-risk patients. 
    more » « less
  2. Deep neural networks are susceptible to shortcut learning, using simple features to achieve low training loss without discovering essential semantic structure. Contrary to prior belief, we show that generative models alone are not sufficient to prevent shortcut learning, despite an incentive to recover a more comprehensive representation of the data than discriminative approaches. However, we observe that shortcuts are preferentially encoded with minimal information, a fact that generative models can exploit to mitigate shortcut learning. In particular, we propose Chroma-VAE, a two-pronged approach where a VAE classifier is initially trained to isolate the shortcut in a small latent subspace, allowing a secondary classifier to be trained on the complementary, shortcut-free latent subspace. In addition to demonstrating the efficacy of Chroma-VAE on benchmark and real-world shortcut learning tasks, our work highlights the potential for manipulating the latent space of generative classifiers to isolate or interpret specific correlations. 
    more » « less
  3. null (Ed.)
    Purpose: To develop and evaluate a deep learning (DL) approach to extract rich information from high-resolution computed tomography (HRCT) of patients with chronic obstructive pulmonary disease (COPD). Methods: We develop a DL-based model to learn a compact representation of a subject, which is predictive of COPD physiologic severity and other outcomes. Our DL model learned: (a) to extract informative regional image features from HRCT; (b) to adaptively weight these features and form an aggregate patient representation; and finally, (c) to predict several COPD outcomes. The adaptive weights correspond to the regional lung contribution to the disease. We evaluate the model on 10 300 participants from the COPDGene cohort. Results: Our model was strongly predictive of spirometric obstruction ( r2 = 0.67) and grouped 65.4% of subjects correctly and 89.1% within one stage of their GOLD severity stage. Our model achieved an accuracy of 41.7% and 52.8% in stratifying the population-based on centrilobular (5-grade) and paraseptal (3-grade) emphysema severity score, respectively. For predicting future exacerbation, combining subjects' representations from our model with their past exacerbation histories achieved an accuracy of 80.8% (area under the ROC curve of 0.73). For all-cause mortality, in Cox regression analysis, we outperformed the BODE index improving the concordance metric (ours: 0.61 vs BODE: 0.56). Conclusions: Our model independently predicted spirometric obstruction, emphysema severity, exacerbation risk, and mortality from CT imaging alone. This method has potential applicability in both research and clinical practice. 
    more » « less
  4. Due to the potentially significant benefits for society, forecasting spatio-temporal societal events is currently attracting considerable attention from researchers. Beyond merely predicting the occurrence of future events, practitioners are now looking for information about specific subtypes of future events in order to allocate appropriate amounts and types of resources to manage such events and any associated social risks. However, forecasting event subtypes is far more complex than merely extending binary prediction to cover multiple classes, as 1) different locations require different models to handle their characteristic event subtype patterns due to spatial heterogeneity; 2) historically, many locations have only experienced a incomplete set of event subtypes, thus limiting the local model’s ability to predict previously “unseen” subtypes; and 3) the subtle discrepancy among different event subtypes requires more discriminative and profound representations of societal events. In order to address all these challenges concurrently, we propose a Spatial Incomplete Multi-task Deep leArning (SIMDA) framework that is capable of effectively forecasting the subtypes of future events. The new framework formulates spatial locations into tasks to handle spatial heterogeneity in event subtypes, and learns a joint deep representation of subtypes across tasks. Furthermore, based on the “first law of geography”, spatiallyclosed tasks share similar event subtype patterns such that adjacent tasks can share knowledge with each other effectively. Optimizing the proposed model amounts to a new nonconvex and strongly-coupled problem, we propose a new algorithm based on Alternating Direction Method of Multipliers (ADMM) that can decompose the complex problem into subproblems that can be solved efficiently. Extensive experiments on six real-world datasets demonstrate the effectiveness and efficiency of the proposed model. 
    more » « less
  5. Abstract Background

    In Alzheimer’s Diseases (AD) research, multimodal imaging analysis can unveil complementary information from multiple imaging modalities and further our understanding of the disease. One application is to discover disease subtypes using unsupervised clustering. However, existing clustering methods are often applied to input features directly, and could suffer from the curse of dimensionality with high-dimensional multimodal data. The purpose of our study is to identify multimodal imaging-driven subtypes in Mild Cognitive Impairment (MCI) participants using a multiview learning framework based on Deep Generalized Canonical Correlation Analysis (DGCCA), to learn shared latent representation with low dimensions from 3 neuroimaging modalities.


    DGCCA applies non-linear transformation to input views using neural networks and is able to learn correlated embeddings with low dimensions that capture more variance than its linear counterpart, generalized CCA (GCCA). We designed experiments to compare DGCCA embeddings with single modality features and GCCA embeddings by generating 2 subtypes from each feature set using unsupervised clustering. In our validation studies, we found that amyloid PET imaging has the most discriminative features compared with structural MRI and FDG PET which DGCCA learns from but not GCCA. DGCCA subtypes show differential measures in 5 cognitive assessments, 6 brain volume measures, and conversion to AD patterns. In addition, DGCCA MCI subtypes confirmed AD genetic markers with strong signals that existing late MCI group did not identify.


    Overall, DGCCA is able to learn effective low dimensional embeddings from multimodal data by learning non-linear projections. MCI subtypes generated from DGCCA embeddings are different from existing early and late MCI groups and show most similarity with those identified by amyloid PET features. In our validation studies, DGCCA subtypes show distinct patterns in cognitive measures, brain volumes, and are able to identify AD genetic markers. These findings indicate the promise of the imaging-driven subtypes and their power in revealing disease structures beyond early and late stage MCI.

    more » « less