
Title: Enhancing heart failure treatment decisions: interpretable machine learning models for advanced therapy eligibility prediction using EHR data

Timely and accurate referral of end-stage heart failure patients for advanced therapies, including heart transplants and mechanical circulatory support, plays an important role in improving patient outcomes and saving costs. However, the decision-making process is complex, nuanced, and time-consuming, requiring cardiologists with specialized expertise and training in heart failure and transplantation.

In this study, we propose two models based on logistic tensor regression (LTR) to identify patients with heart failure who warrant evaluation for advanced heart failure therapies, using irregularly spaced sequential electronic health records (EHRs) at the population and individual levels. The clinical features were collected at the previous visit, and the predictions were made at the very beginning of the subsequent visit. Patient-wise ten-fold cross-validation experiments were performed. Standard LTR achieved an average F1 score of 0.708, an AUC of 0.903, and an AUPRC of 0.836. Personalized LTR obtained an F1 score of 0.670, an AUC of 0.869, and an AUPRC of 0.839. The two models not only outperformed all other machine learning models to which they were compared but also improved the performance and robustness of those models via weight transfer: the AUPRC scores of support vector machine, random forest, and Naive Bayes improved by 8.87%, 7.24%, and 11.38%, respectively.
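
The patient-wise cross-validation mentioned above keeps all visits from one patient in the same fold, so a model is never tested on a patient it saw during training. The following is a minimal sketch of that idea in plain Python, not the authors' implementation; the function name `patient_wise_folds`, the seed, and the toy patient IDs are assumptions made for illustration.

```python
import random
from collections import defaultdict


def patient_wise_folds(patient_ids, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs in which no patient spans both sets.

    patient_ids[i] is the (hypothetical) patient identifier of record i.
    """
    # Group record indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    patients = sorted(by_patient)
    if len(patients) < n_folds:
        raise ValueError("need at least one patient per fold")

    # Shuffle patients (not records) so folds are assigned at patient level.
    random.Random(seed).shuffle(patients)

    for k in range(n_folds):
        test_patients = set(patients[k::n_folds])
        test_idx = [i for p in patients if p in test_patients
                    for i in by_patient[p]]
        train_idx = [i for p in patients if p not in test_patients
                     for i in by_patient[p]]
        yield sorted(train_idx), sorted(test_idx)


# Toy data (made up): 100 visit records spread over 25 patients.
visit_patient_ids = [f"p{i % 25}" for i in range(100)]
folds = list(patient_wise_folds(visit_patient_ids, n_folds=10))
```

Splitting at the record level instead would let visits from the same patient leak between training and test sets, which tends to inflate the reported scores.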

The two models can evaluate the importance of clinical features associated with advanced therapy referral. The five most important medical codes, including chronic kidney disease, hypotension, pulmonary heart disease, mitral regurgitation, and atherosclerotic heart disease, were reviewed and validated against the literature and by heart failure cardiologists. Our proposed models effectively use EHRs to assess the potential need for advanced therapies in heart failure patients while explaining the importance of comorbidities and other clinical events. The information learned during model training could offer further insight into risk factors contributing to the progression of heart failure at both the population and individual levels.
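
For readers less familiar with the F1, AUC, and AUPRC scores reported in this abstract, the sketch below computes each one from scratch on a tiny made-up example. It assumes binary labels (1 = positive), uses the pairwise-ranking form of AUC, and uses average precision as the AUPRC estimate, which is one common choice and may differ from the estimator used in the study. The function names, the 0.5 threshold, and the toy labels and scores are illustrative assumptions.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: mean of precision at the rank of each positive.

    Assumes no tied scores; ties would be broken by input order here.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Toy labels and scores (made up for illustration only).
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
```

F1 depends on the chosen decision threshold, while AUC and AUPRC summarize ranking quality across all thresholds, which is why the three can order models differently.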

Publisher / Repository:
Springer Science + Business Media
Journal Name:
BMC Medical Informatics and Decision Making
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    Atrial fibrillation (AF) significantly reduces health‐related quality of life (HRQoL), previously measured in clinical trials using patient‐reported outcomes (PROs). We examined AF PROs in clinical practice and their association with subsequent clinical management.


    The Utah My Evaluation (mEVAL) program collects the Toronto AF Symptom Severity Scale (AFSS) in AF outpatients at the University of Utah. Baseline factors associated with worse AF symptom score (range 0–35, higher is worse) were identified in univariate and multivariable analyses. Secondary outcomes included AF burden and AF healthcare utilization. We also compared subsequent clinical management at 6 months between patients with better versus worse AF HRQoL.


    Overall, 1338 patients completed the AFSS symptom score, which varied by sex (mean 7.26 for males vs. 10.27 for females; p < .001), age (<65, 9.73; 65–74, 7.66; ≥75, 7.58; p < .001), heart failure (9.39 with HF vs. 7.67 without; p < .001), and prior ablation (7.28 with prior ablation vs. 8.84 without; p < .001). In multivariable analysis, younger age (mean difference 2.92 for <65 vs. ≥75; p < .001), female sex (mean difference 2.57; p < .001), pulmonary disease (mean difference 1.88; p < .001), and depression (mean difference 2.46; p < .001) were associated with higher scores. At 6 months, worse baseline symptom score was associated with the use of rhythm control (37.1% vs. 24.5%; p < .001). Similar cofactors and results were associated with increased AF burden and health care utilization scores.


    AF PROs in clinical practice identify highly symptomatic patients, corroborating findings from more controlled clinical trials. Increased AFSS score correlates with more aggressive clinical management, supporting the utility of disease‐specific PROs in guiding clinical practice.

  2. Abstract Background Predictive models utilizing social determinants of health (SDH), demographic data, and local weather data were trained to predict missed imaging appointments (MIA) among breast imaging patients at the Boston Medical Center (BMC). Patients were characterized by many different variables, including social needs, demographics, imaging utilization, appointment features, and weather conditions on the date of the appointment. Methods This HIPAA-compliant retrospective cohort study was IRB approved. Informed consent was waived. After data preprocessing, the dataset contained 9,970 patients and 36,606 appointments from 1/1/2015 to 12/31/2019. We identified 57 potentially impactful variables for use in the initial prediction model and assessed each patient for MIA. We then developed a parsimonious model via recursive feature elimination, which identified the 25 most predictive variables. We utilized linear and non-linear models, including support vector machines (SVM), logistic regression (LR), and random forest (RF), to predict MIA and compared their performance. Results The highest-performing full model was the nonlinear RF, which achieved an area under the ROC curve (AUC) of 76% and an average F1 score of 85%. Models limited to the most predictive variables attained AUC and F1 scores comparable to those of models with all variables included. The variables most predictive of missed appointments included timing, prior appointment history, referral department of origin, and socioeconomic factors such as household income and access to caregiving services. Conclusions Prediction of MIA with the data available is inherently limited by the complex, multifactorial nature of MIA. However, the algorithms presented achieved acceptable performance and demonstrated that socioeconomic factors were useful predictors of MIA. In contrast with non-modifiable demographic factors, SDH can be addressed to decrease the incidence of MIA.
  3. Abstract Objective

    To develop predictive models of coronavirus disease 2019 (COVID-19) outcomes, elucidate the influence of socioeconomic factors, and assess algorithmic racial fairness using a racially diverse patient population with high social needs.

    Materials and Methods

    Data included 7,102 patients with a positive RT-PCR test for severe acute respiratory syndrome coronavirus 2 at a safety-net system in Massachusetts. Linear and nonlinear classification methods were applied. A score based on a recurrent neural network and a transformer architecture was developed to capture the dynamic evolution of vital signs. Combined with patient characteristics, clinical variables, and hospital occupancy measures, this dynamic vital score was used to train predictive models.


    Hospitalizations can be predicted with an area under the receiver-operating characteristic curve (AUC) of 92% using symptoms, hospital occupancy, and patient characteristics, including social determinants of health. Parsimonious models to predict intensive care, mechanical ventilation, and mortality that used the most recent labs and vitals exhibited AUCs of 92.7%, 91.2%, and 94%, respectively. Early predictive models, using labs and vital signs closer to admission, had AUCs of 81.1%, 84.9%, and 92%, respectively.


    The most accurate models exhibit racial bias, being more likely to falsely predict that Black patients will be hospitalized. Models that are only based on the dynamic vital score exhibited accuracies close to the best parsimonious models, although the latter also used laboratories.


    This large study demonstrates that COVID-19 severity may accurately be predicted using a score that accounts for the dynamic evolution of vital signs. Further, race, social determinants of health, and hospital occupancy play an important role.

  4. Background Heart failure is a leading cause of mortality and morbidity worldwide. Acute heart failure, broadly defined as rapid onset of new or worsening signs and symptoms of heart failure, often requires hospitalization and admission to the intensive care unit (ICU). This acute condition is highly heterogeneous and less well-understood as compared to chronic heart failure. The ICU, through detailed and continuously monitored patient data, provides an opportunity to retrospectively analyze decompensation and heart failure to evaluate physiological states and patient outcomes. Objective The goal of this study is to examine the prevalence of cardiovascular risk factors among those admitted to ICUs and to evaluate combinations of clinical features that are predictive of decompensation events, such as the onset of acute heart failure, using machine learning techniques. To accomplish this objective, we leveraged tele-ICU data from over 200 hospitals across the United States. Methods We evaluated the feasibility of predicting decompensation soon after ICU admission for 26,534 patients admitted without a history of heart failure with specific heart failure risk factors (ie, coronary artery disease, hypertension, and myocardial infarction) and 96,350 patients admitted without risk factors using remotely monitored laboratory, vital signs, and discrete physiological measurements. Multivariate logistic regression and random forest models were applied to predict decompensation and highlight important features from combinations of model inputs from dissimilar data. Results The most prevalent risk factor in our data set was hypertension, although most patients diagnosed with heart failure were admitted to the ICU without a risk factor. 
    The highest heart failure prediction accuracy was 0.951, and the highest area under the receiver operating characteristic curve was 0.9503, with random forest applied to combined vital signs, laboratory values, and discrete physiological measurements. Random forest feature importance also highlighted combinations of several discrete physiological features and laboratory measures as most indicative of decompensation. Timeline analysis of aggregate vital signs revealed a point of diminishing returns beyond which additional vital signs data did not continue to improve results. Conclusions Heart failure risk factors are common in tele-ICU data, although most patients who are diagnosed with heart failure later in an ICU stay presented without risk factors, making prediction of decompensation critical. Decompensation was predicted with reasonable accuracy using tele-ICU data, and optimal data extraction for time series vital signs data was identified near a 200-minute window size. Overall, results suggest that combinations of laboratory measurements and vital signs are viable for early and continuous prediction of patient decompensation.
  5. Abstract

    Genodermatoses are inherited disorders with skin manifestations and can present with multisystem involvement, resulting in challenges in diagnosis and treatment. To address this, the expertise of dermatology and clinical genetics was combined in a multidisciplinary clinic (Genodermatoses Clinic). A retrospective cohort study of 45 children seen between March 2018 and February 2019 in the Genodermatoses Clinic at The Children's Hospital of Philadelphia was performed. Patient demographics, referral information, genetic testing modality, diagnoses, and patient satisfaction scores were evaluated to assess the clinic's impact. The largest share of patients (42.2%) was referred from Dermatology, and 86.7% were referred for diagnosis. Two‐thirds of the patients were recommended genetic testing, and subsequently 73.3% completed testing. Nearly three‐quarters, 26 out of 36 patients (72.2%), of our undiagnosed patients received a clinical and/or molecular diagnosis, which is imperative in managing their care. Twenty‐two individuals pursued genetic testing. In eight individuals (36%), molecular testing was diagnostic, although in two individuals the molecular diagnosis did not completely explain the phenotype. There are still obstacles to genetic testing, such as cost of testing and insurance barriers. Almost all (91.4%) rated the Genodermatoses Clinic as “Very Good,” the top Press Ganey score. High patient satisfaction scores suggest a positive impact of the Genodermatoses Clinic, emphasizing the importance of increasing support for the clinical and administrative time needed for patients with genodermatoses.
