skip to main content


Title: Predicting Age-Related Macular Degeneration Progression with Contrastive Attention and Time-Aware LSTM
Age-related macular degeneration (AMD) is the leading cause of irreversible blindness in developed countries. Identifying patients at high risk of progression to late AMD, the sight-threatening stage, is critical for clinical actions, including medical interventions and timely monitoring. Recently, deep-learning-based models have been developed and achieved superior performance for late AMD pre- diction. However, most existing methods are limited to the color fundus photography (CFP) from the last ophthalmic visit and do not include the longitudinal CFP history and AMD progression during the previous years’ visits. Patients in different AMD subphenotypes might have various speeds of progression in different stages of AMD disease. Capturing the progression information during the previous years’ visits might be useful for the prediction of AMD pro- gression. In this work, we propose a Contrastive-Attention-based Time-aware Long Short-Term Memory network (CAT-LSTM) to predict AMD progression. First, we adopt a convolutional neural network (CNN) model with a contrastive attention module (CA) to extract abnormal features from CFPs. Then we utilize a time-aware LSTM (T-LSTM) to model the patients’ history and consider the AMD progression information. The combination of disease pro- gression, genotype information, demographics, and CFP features are sent to T-LSTM. Moreover, we leverage an auto-encoder to represent temporal CFP sequences as fixed-size vectors and adopt k-means to cluster them into subphenotypes. We evaluate the pro- posed model based on real-world datasets, and the results show that the proposed model could achieve 0.925 on area under the receiver operating characteristic (AUROC) for 5-year late-AMD prediction and outperforms the state-of-the-art methods by more than 3%, which demonstrates the effectiveness of the proposed CAT-LSTM. After analyzing patient representation learned by an auto-encoder, we identify 3 novel subphenotypes of AMD patients with different characteristics and progression rates to late AMD, paving the way for improved personalization of AMD management. The code of CAT-LSTM can be found at GitHub .  more » « less
Award ID(s):
2037398
PAR ID:
10403017
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Page Range / eLocation ID:
4402 to 4412
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Age-related macular degeneration (AMD) is the principal cause of blindness in developed countries, and its prevalence will increase to 288 million people in 2040. Therefore, automated grading and prediction methods can be highly beneficial for recognizing susceptible subjects to late-AMD and enabling clinicians to start preventive actions for them. Clinically, AMD severity is quantified by Color Fundus Photographs (CFP) of the retina, and many machine-learning-based methods are proposed for grading AMD severity. However, few models were developed to predict the longitudinal progression status, i.e. predicting future late-AMD risk based on the current CFP, which is more clinically interesting. In this paper, we propose a new deep-learning-based classification model (LONGL-Net) that can simultaneously grade the current CFP and predict the longitudinal outcome, i.e. whether the subject will be in late-AMD in the future time-point. We design a new temporal-correlation-structure-guided Generative Adversarial Network model that learns the interrelations of temporal changes in CFPs in consecutive time-points and provides interpretability for the classifier's decisions by forecasting AMD symptoms in the future CFPs. We used about 30,000 CFP images from 4,628 participants in the Age-Related Eye Disease Study. Our classifier showed average 0.905 (95% CI: 0.886–0.922) AUC and 0.762 (95% CI: 0.733–0.792) accuracy on the 3-class classification problem of simultaneously grading current time-point's AMD condition and predicting late AMD progression of subjects in the future time-point. We further validated our model on the UK Biobank dataset, where our model showed average 0.905 accuracy and 0.797 sensitivity in grading 300 CFP images.

     
    more » « less
  2. Patient-Reported Outcomes (PRO) are collected directly from the patients using symptom questionnaires. In the case of head and neck cancer patients, PRO surveys are recorded every week during treatment with each patient’s visit to the clinic and at different follow-up times after the treatment has concluded. PRO surveys can be very informative regarding the patient’s status and the effect of treatment on the patient’s quality of life (QoL). Processing PRO data is challenging for several reasons. First, missing data is frequent as patients might skip a question or a questionnaire altogether. Second, PROs are patient-dependent, a rating of 5 for one patient might be a rating of 10 for another patient. Finally, most patients experience severe symptoms during treatment which usually subside over time. However, for some patients, late toxicities persist negatively affecting the patient’s QoL. These long-term severe symptoms are hard to predict and are the focus of this study. In this work, we model PRO data collected from head and neck cancer patients treated at the MD Anderson Cancer Center using the MD Anderson Symptom Inventory (MDASI) questionnaire as time series. We impute missing values with a combination of K nearest neighbor (KNN) and Long Short-Term Memory (LSTM) neural networks, and finally, apply LSTM to predict late symptom severity 12 months after treatment. We compare performance against clinical and ARIMA models. We show that the LSTM model combined with KNN imputation is effective in predicting late-stage symptom ratings for occurrence and severity under the AUC and F1 score metrics. 
    more » « less
  3. Modeling patient disease progression using Electronic Health Records (EHRs) is critical to assist clinical decision making. Long-Short Term Memory (LSTM) is an effective model to handle sequential data, such as EHRs, but it encounters two major limitations when applied to EHRs: it is unable to interpret the prediction results and it ignores the irregular time intervals between consecutive events. To tackle these limitations, we propose an attention-based time-aware LSTM Networks (ATTAIN), to improve the interpretability of LSTM and to identify the critical previous events for current diagnosis by modeling the inherent time irregularity. We validate ATTAIN on modeling the progression of an extremely challenging disease, septic shock, by using real-world EHRs. Our results demonstrate that the proposed framework outperforms the state-of-the-art models such as RETAIN and T-LSTM. Also, the generated interpretative time-aware attention weights shed some lights on the progression behaviors of septic shock. 
    more » « less
  4. Deep neural network models, especially Long Short Term Memory (LSTM), have shown great success in analyzing Electronic Health Records (EHRs) due to their ability to capture temporal dependencies in time series data. When applying the deep learning models to EHRs, we are generally confronted with two major challenges: high rate of missingness and time irregularity. Motivated by the original PACIFIER framework which utilized matrix decomposition for data imputation, we applied and further extended it by including three components: forecasting future events, a time-aware mechanism, and a subgroup basis approach. We evaluated the proposed framework with real-world EHRs which consists of 52,919 visits and 4,224,567 events on a task of early prediction of septic shock. We compared our work against multiple baselines including the original PACIFIER using both LSTM and Time-aware LSTM (T-LSTM). Experimental results showed that our proposed framework significantly outperformed all competitive baseline approaches. More importantly, the extracted interpretative latent patterns from subgroups could shed some lights for clinicians to discover the progression of septic shock patients. 
    more » « less
  5. null (Ed.)
    Abstract COVID-19-associated respiratory failure offers the unprecedented opportunity to evaluate the differential host response to a uniform pathogenic insult. Understanding whether there are distinct subphenotypes of severe COVID-19 may offer insight into its pathophysiology. Sequential Organ Failure Assessment (SOFA) score is an objective and comprehensive measurement that measures dysfunction severity of six organ systems, i.e., cardiovascular, central nervous system, coagulation, liver, renal, and respiration. Our aim was to identify and characterize distinct subphenotypes of COVID-19 critical illness defined by the post-intubation trajectory of SOFA score. Intubated COVID-19 patients at two hospitals in New York city were leveraged as development and validation cohorts. Patients were grouped into mild, intermediate, and severe strata by their baseline post-intubation SOFA. Hierarchical agglomerative clustering was performed within each stratum to detect subphenotypes based on similarities amongst SOFA score trajectories evaluated by Dynamic Time Warping. Distinct worsening and recovering subphenotypes were identified within each stratum, which had distinct 7-day post-intubation SOFA progression trends. Patients in the worsening suphenotypes had a higher mortality than those in the recovering subphenotypes within each stratum (mild stratum, 29.7% vs. 10.3%, p = 0.033; intermediate stratum, 29.3% vs. 8.0%, p = 0.002; severe stratum, 53.7% vs. 22.2%, p < 0.001). Pathophysiologic biomarkers associated with progression were distinct at each stratum, including findings suggestive of inflammation in low baseline severity of illness versus hemophagocytic lymphohistiocytosis in higher baseline severity of illness. The findings suggest that there are clear worsening and recovering subphenotypes of COVID-19 respiratory failure after intubation, which are more predictive of outcomes than baseline severity of illness. Distinct progression biomarkers at differential baseline severity of illness suggests a heterogeneous pathobiology in the progression of COVID-19 respiratory failure. 
    more » « less