Title: A new representation of disease conditions and treatment pathways accurately predicts mortality and chronic diseases
In this study, we introduce a novel representation of patient data called Disease Severity Hierarchy (DSH) that explores specific diseases and their known treatment pathways in a nested fashion to create clinically meaningful subpopulations. As the DSH tree is traversed from the root towards the leaves, we encounter subpopulations that share increasingly rich clinical detail, such as similar disease severity, illness trajectories, and time to event. These details are discriminative and suitable for learning risk stratification models. The proposed DSH risk scores predict the age at which a patient may be at risk of dying or developing MCE significantly better than a traditional representation of disease conditions. DSH uses known relationships among entities in EHR data to capture disease severity in a natural way and has the additional benefit of being expressive and interpretable. This novel patient representation can support critical decision making and the development of smart EBP guidelines, and can enhance healthcare and disease management by helping to identify and reduce disease burden among high-risk patients.
Award ID(s):
1602394
NSF-PAR ID:
10168838
Journal Name:
AMIA 2019 Annual Symposium
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Sickle cell disease (SCD) is the most prevalent inherited blood disorder in the world, but its clinical manifestations are highly variable. In particular, it is currently difficult to predict adverse outcomes in patients with SCD, such as vasculopathy, thrombosis, and stroke. A predictive analytic strategy is therefore desirable for effective and timely interventions. In this study, we evaluate the endothelial and prothrombotic characteristics of blood outgrowth endothelial cells (BOECs) generated from blood samples of SCD patients with known differences in clinical severity of the disease. We present a method to evaluate patient-specific vaso-occlusive risk by combining novel RNA-seq and organ-on-chip approaches. Through differential gene expression (DGE) and pathway analysis, we find that BOECs from SCD patients exhibit an activated state through cell adhesion molecule (CAM) and cytokine signaling pathways, among many others. In agreement with the patients' clinical symptoms, DGE analyses reveal that patients with severe SCD had a greater extent of endothelial activation than patients with milder symptoms. This difference is confirmed by qRT-PCR of endothelial adhesion markers such as E-selectin, P-selectin, tissue factor, and Von Willebrand factor. Finally, the differential regulation of the proinflammatory phenotype is confirmed through platelet adhesion readouts in our BOEC vessel-chip. Taken together, we hypothesize that these easily obtained, blood-derived endothelial cells, evaluated through RNA-seq and organ-on-chips, may serve as a biotechnique to predict vaso-occlusive episodes in SCD patients and will ultimately allow better therapeutic interventions.
  2. Through the COVID-19 pandemic, SARS-CoV-2 has gained and lost multiple mutations in novel or unexpected combinations. Predicting how complex mutations affect COVID-19 disease severity is critical in planning public health responses as the virus continues to evolve. This paper presents a novel computational framework to complement conventional lineage classification and applies it to predict the severe disease potential of viral genetic variation. The transformer-based neural network model architecture has additional layers that provide sample embeddings and sequence-wide attention for interpretation and visualization. First, training a model to predict SARS-CoV-2 taxonomy validates the architecture's interpretability. Second, an interpretable predictive model of disease severity is trained on spike protein sequence and patient metadata from GISAID. Confounding effects of changing patient demographics, increasing vaccination rates, and improving treatment over time are addressed by including demographics and case date as independent inputs to the neural network model. The resulting model can be interpreted to identify potentially significant virus mutations and proves to be a robust predictive tool. Although trained on sequence data obtained entirely before empirical data for Omicron became available, the model predicts Omicron's reduced risk of severe disease, in accord with epidemiological and experimental data.
  3. Purpose: To develop and evaluate a deep learning (DL) approach to extract rich information from high-resolution computed tomography (HRCT) of patients with chronic obstructive pulmonary disease (COPD). Methods: We develop a DL-based model to learn a compact representation of a subject, which is predictive of COPD physiologic severity and other outcomes. Our DL model learned: (a) to extract informative regional image features from HRCT; (b) to adaptively weight these features and form an aggregate patient representation; and finally, (c) to predict several COPD outcomes. The adaptive weights correspond to the regional lung contribution to the disease. We evaluate the model on 10 300 participants from the COPDGene cohort. Results: Our model was strongly predictive of spirometric obstruction (r² = 0.67) and grouped 65.4% of subjects correctly and 89.1% within one stage of their GOLD severity stage. Our model achieved an accuracy of 41.7% and 52.8% in stratifying the population based on centrilobular (5-grade) and paraseptal (3-grade) emphysema severity scores, respectively. For predicting future exacerbation, combining subjects' representations from our model with their past exacerbation histories achieved an accuracy of 80.8% (area under the ROC curve of 0.73). For all-cause mortality, in Cox regression analysis, our model outperformed the BODE index on the concordance metric (ours: 0.61 vs BODE: 0.56). Conclusions: Our model independently predicted spirometric obstruction, emphysema severity, exacerbation risk, and mortality from CT imaging alone. This method has potential applicability in both research and clinical practice.
  4. Age-related macular degeneration (AMD) is the leading cause of irreversible blindness in developed countries. Identifying patients at high risk of progression to late AMD, the sight-threatening stage, is critical for clinical actions, including medical interventions and timely monitoring. Recently, deep-learning-based models have been developed and have achieved superior performance for late AMD prediction. However, most existing methods are limited to the color fundus photography (CFP) from the last ophthalmic visit and do not include the longitudinal CFP history and AMD progression during the previous years' visits. Patients in different AMD subphenotypes might progress at different speeds in different stages of the disease, so capturing the progression information from previous years' visits might be useful for predicting AMD progression. In this work, we propose a Contrastive-Attention-based Time-aware Long Short-Term Memory network (CAT-LSTM) to predict AMD progression. First, we adopt a convolutional neural network (CNN) model with a contrastive attention module (CA) to extract abnormal features from CFPs. Then we utilize a time-aware LSTM (T-LSTM) to model the patients' history and consider the AMD progression information. The combination of disease progression, genotype information, demographics, and CFP features is fed to the T-LSTM. Moreover, we leverage an auto-encoder to represent temporal CFP sequences as fixed-size vectors and adopt k-means to cluster them into subphenotypes. We evaluate the proposed model on real-world datasets, and the results show that it achieves an area under the receiver operating characteristic curve (AUROC) of 0.925 for 5-year late-AMD prediction and outperforms the state-of-the-art methods by more than 3%, which demonstrates the effectiveness of the proposed CAT-LSTM. After analyzing the patient representation learned by the auto-encoder, we identify 3 novel subphenotypes of AMD patients with different characteristics and progression rates to late AMD, paving the way for improved personalization of AMD management. The code of CAT-LSTM is available on GitHub.
  5. Accurate prediction and monitoring of patient health in the intensive care unit can inform shared decisions regarding appropriateness of care delivery, risk-reduction strategies, and intensive care resource use. Traditionally, algorithmic solutions for patient outcome prediction rely solely on data available from electronic health records (EHR). In this pilot study, we explore the benefits of augmenting existing EHR data with novel measurements from wrist-worn activity sensors as part of a clinical environment known as the Intelligent ICU. We implemented temporal deep learning models based on two distinct sources of patient data: (1) routinely measured vital signs from electronic health records, and (2) activity data collected from wearable sensors. As a proxy for illness severity, our models predicted whether patients leaving the intensive care unit would be successfully or unsuccessfully discharged from the hospital. We overcome the challenge of small sample size in our prospective cohort by applying deep transfer learning using EHR data from a much larger cohort of traditional ICU patients. Our experiments quantify added utility of non-traditional measurements for predicting patient health, especially when applying a transfer learning procedure to small novel Intelligent ICU cohorts of critically ill patients. 