Title: Interpretable machine learning models for hospital readmission prediction: a two-step extracted regression tree approach
Abstract

Background

Advanced machine learning models have received wide attention in assisting medical decision making because of the greater accuracy they can achieve. However, their limited interpretability creates barriers to adoption by practitioners. Recent advances in interpretable machine learning tools make it possible to look inside the black box of advanced prediction methods and extract interpretable models that maintain similar prediction accuracy, but few studies have investigated the hospital readmission prediction problem in this spirit.

Methods

Our goal is to develop a machine learning (ML) algorithm that predicts 30- and 90-day hospital readmissions as accurately as black box algorithms while providing medically interpretable insights into readmission risk factors. Leveraging a state-of-the-art interpretable ML model, we use a two-step Extracted Regression Tree approach to achieve this goal. In the first step, we train a black box prediction algorithm. In the second step, we extract a regression tree from the output of the black box algorithm, which allows direct interpretation of medically relevant risk factors. We use data from a large teaching hospital in Asia to train the ML model and verify our two-step approach.
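The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn is available, uses synthetic data in place of hospital records, and picks a gradient boosting classifier as a stand-in black box. The feature names, tree depth, and sample sizes are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic stand-in for patient features and a binary readmission label
# (assumption: the study's real data are not reproduced here).
X, y = make_classification(n_samples=2000, n_features=8,
                           n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Step 1: train a black box classifier on the true labels.
black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
risk_train = black_box.predict_proba(X_train)[:, 1]

# Step 2: fit a shallow regression tree to the black box's predicted risk,
# not to the labels, so the tree approximates the black box itself.
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X_train, risk_train)

# Fidelity: how closely the tree tracks the black box on held-out data.
risk_test = black_box.predict_proba(X_test)[:, 1]
tree_risk_test = tree.predict(X_test)
fidelity_r2 = tree.score(X_test, risk_test)
agreement = np.mean((tree_risk_test >= 0.5) == (risk_test >= 0.5))

# The extracted tree can be read as a small set of if/then rules.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
print(export_text(tree, feature_names=feature_names))
print(f"fidelity R^2: {fidelity_r2:.3f}, label agreement: {agreement:.3f}")
```

Fitting the tree to predicted risk rather than to the raw labels is what makes it an extraction of the black box. The depth limit trades fidelity for readability, and a deeper tree would typically track the black box more closely at the cost of longer rules.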

Results

The two-step method achieves prediction performance similar to that of the best black box model, such as neural networks, as measured by three metrics: accuracy, the area under the curve (AUC), and the area under the precision-recall curve (AUPRC), while maintaining interpretability. Further, to examine whether the prediction results match known medical insights (i.e., whether the model is truly interpretable and produces reasonable results), we show that the key readmission risk factors extracted by the two-step approach are consistent with those found in the medical literature.
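For readers less familiar with the three metrics named above, the following sketch computes each from first principles on a toy example. The function names and the four made-up scores are illustrative assumptions, not values from the study. AUPRC is approximated here by average precision, one common estimator, and the sketch assumes no tied scores.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(y_true, y_score))
    return correct / len(y_true)


def auc(y_true, y_score):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auprc(y_true, y_score):
    """Average precision: mean of the precision at each positive's rank."""
    ranked = sorted(zip(y_score, y_true), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy labels (1 = readmitted) and predicted risks; purely illustrative.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

print(f"accuracy: {accuracy(y_true, y_score):.3f}")  # 0.750
print(f"AUC:      {auc(y_true, y_score):.3f}")       # 0.750
print(f"AUPRC:    {auprc(y_true, y_score):.3f}")     # 0.833
```

Accuracy depends on the chosen threshold, whereas AUC and AUPRC summarize ranking quality across all thresholds. AUPRC is usually the more informative of the two when the positive class is rare.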

Conclusions

The proposed two-step approach yields meaningful prediction results that are both accurate and interpretable. This study suggests that the two-step approach is a viable means of improving trust in machine learning-based models for predicting readmissions in clinical practice.

 
NSF-PAR ID: 10419903
Publisher / Repository: Springer Science + Business Media
Journal Name: BMC Medical Informatics and Decision Making
Volume: 23
Issue: 1
ISSN: 1472-6947
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Machine learning models with explainable predictions are increasingly sought after, especially for real-world, mission-critical applications that require bias detection and risk mitigation. Inherent interpretability, where a model is designed from the ground up for interpretability, provides intuitive insights into and transparent explanations of model predictions and performance. In this paper, we present COLABEL, an approach to build interpretable models with explanations rooted in the ground truth. We demonstrate COLABEL in a vehicle feature extraction application in the context of vehicle make-model recognition (VMMR). By construction, COLABEL performs VMMR with a composite of interpretable features such as vehicle color, type, and make, all based on interpretable annotations of the ground truth labels. First, COLABEL performs corroborative integration to join multiple datasets that each have a subset of the desired annotations of color, type, and make. Then, COLABEL uses decomposable branches to extract complementary features corresponding to the desired annotations. Finally, COLABEL fuses them together for final predictions. During feature fusion, COLABEL harmonizes complementary branches so that VMMR features are compatible with each other and can be projected to the same semantic space for classification. With inherent interpretability, COLABEL achieves superior performance to state-of-the-art black-box models, with accuracy of 0.98, 0.95, and 0.94 on CompCars, Cars196, and BoxCars116K, respectively. COLABEL provides intuitive explanations due to constructive interpretability, and subsequently achieves high accuracy and usability in mission-critical situations.
  2. Abstract

    Background

    Unlike linear models, which are traditionally used to study all-cause mortality, complex machine learning models can capture non-linear interrelations and provide opportunities to identify unexplored risk factors. Explainable artificial intelligence (XAI) can improve prediction accuracy over linear models and reveal valuable insights into outcomes like mortality. This paper comprehensively analyzes all-cause mortality by explaining complex machine learning models.

    Methods

    We propose the IMPACT framework, which uses XAI techniques to explain a state-of-the-art tree ensemble mortality prediction model. We apply IMPACT to understand all-cause mortality for 1-, 3-, 5-, and 10-year follow-up times within the NHANES dataset, which contains 47,261 samples and 151 features.

    Results

    We show that IMPACT models achieve higher accuracy than linear models and neural networks. Using IMPACT, we identify several overlooked risk factors and interaction effects. Furthermore, we identify relationships between laboratory features and mortality that may suggest adjusting established reference intervals. Finally, we develop highly accurate, efficient and interpretable mortality risk scores that can be used by medical professionals and individuals without medical expertise. We ensure generalizability by performing temporal validation of the mortality risk scores and external validation of important findings with the UK Biobank dataset.

    Conclusions

    IMPACT’s unique strength is the explainable prediction, which provides insights into the complex, non-linear relationships between mortality and features, while maintaining high accuracy. Our explainable risk scores could help individuals improve self-awareness of their health status and help clinicians identify patients with high risk. IMPACT takes a consequential step towards bringing contemporary developments in XAI to epidemiology.

     
  3. Abstract

    Machine learning has been suggested as a means of identifying individuals at greatest risk for hospital readmission, including psychiatric readmission. We sought to compare the performance of predictive models that use interpretable representations derived via topic modeling to the performance of human experts and nonexperts. We examined all 5076 admissions to a general psychiatry inpatient unit between 2009 and 2016 using electronic health records. We developed multiple models to predict 180-day readmission for these admissions based on features derived from narrative discharge summaries, augmented by baseline sociodemographic and clinical features. We developed models using a training set comprising 70% of the cohort and evaluated them on the remaining 30%. Baseline models using demographic features for prediction achieved an area under the curve (AUC) of 0.675 [95% CI 0.674–0.676] on an independent testing set, while language-based models that also incorporated bag-of-words features, discharge summary topics identified by Latent Dirichlet allocation (LDA), and prior psychiatric admissions achieved an AUC of 0.726 [95% CI 0.725–0.727]. To characterize the difficulty of the task, we also compared the performance of these classifiers to both expert and nonexpert human raters, with and without feedback, on a subset of 75 test cases. These models outperformed humans on average, including predictions by experienced psychiatrists. Typical note tokens or topics associated with readmission risk were related to pregnancy/postpartum state, family relationships, and psychosis.

     
  4. Abstract

    Background

    Acute neurological complications are some of the leading causes of death and disability in the U.S. The medical professionals who treat patients in this setting are tasked with deciding where (e.g., home or facility), how, and when to discharge these patients. It is important to be able to predict potential patient discharge outcomes as early as possible during the patient’s hospital stay and to know what factors influence the development of discharge planning. This study carried out two parallel experiments: a multi-class outcome (patient discharge targets of ‘home’, ‘nursing facility’, ‘rehab’, ‘death’) and a binary-class outcome (‘home’ vs. ‘non-home’). The goal of this study is to develop early predictive models for each experiment, exploring which patient characteristics and clinical variables significantly influence discharge planning, based only on data available within 24 h of hospital admission.

    Method

    Our methodology centers around building and training five different machine learning models followed by testing and tuning those models to find the best-suited predictor for each experiment with a dataset of 5,245 adult patients with neurological conditions taken from the eICU-CRD database.

    Results

    The results of this study show XGBoost to be the most effective model for predicting among four common discharge outcomes of ‘home’, ‘nursing facility’, ‘rehab’, and ‘death’, with an average c-statistic of 71%. The XGBoost model was also the best performer in the binary outcome experiment, with a c-statistic of 76%. This article also explores the accuracy, reliability, and interpretability of the best-performing models in each experiment by identifying and analyzing the features that are most impactful to the predictions.

    Conclusions

    The acceptable accuracy and interpretability of the predictive models based on early admission data suggest that the models can be used in a suggestive context to help guide healthcare providers in planning effective and equitable discharge recommendations.

     
  5. This work presents SeizFt, a novel seizure detection framework that utilizes machine learning to automatically detect seizures using wearable SensorDot EEG data. Inspired by interpretable sleep staging, our novel approach employs a unique combination of data augmentation, meaningful feature extraction, and an ensemble of decision trees to improve resilience to variations in EEG and to increase the capacity to generalize to unseen data. Fourier Transform (FT) Surrogates were utilized to increase sample size and improve the class balance between labeled non-seizure and seizure epochs. To enhance model stability and accuracy, SeizFt utilizes an ensemble of decision trees through the CatBoost classifier to classify each second of EEG recording as seizure or non-seizure. The SeizIt1 dataset was used for training, and the SeizIt2 dataset for validation and testing. Model performance for seizure detection was evaluated using two primary metrics: sensitivity using the any-overlap method (OVLP) and False Alarm (FA) rate using epoch-based scoring (EPOCH). Notably, SeizFt placed first among an array of state-of-the-art seizure detection algorithms as part of the Seizure Detection Grand Challenge at the 2023 International Conference on Acoustics, Speech, and Signal Processing (ICASSP). SeizFt outperformed state-of-the-art black-box models in accurate seizure detection and minimized false alarms, obtaining a total score of 40.15, combining OVLP and EPOCH across two tasks and representing an improvement of ~30% from the next best approach. The interpretability of SeizFt is a key advantage, as it fosters trust and accountability among healthcare professionals. The most predictive seizure detection features extracted from SeizFt were: delta wave, interquartile range, standard deviation, total absolute power, theta wave, the ratio of delta to theta, binned entropy, Hjorth complexity, delta + theta, and Higuchi fractal dimension. In conclusion, the successful application of SeizFt to wearable SensorDot data suggests its potential for real-time, continuous monitoring to improve personalized medicine for epilepsy.
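As a rough illustration of the Fourier Transform surrogate idea mentioned in this abstract, the sketch below randomizes the phases of a signal's spectrum while keeping its amplitudes, producing a new signal with the same power spectrum. It assumes NumPy and a synthetic single-channel signal. The function name `ft_surrogate` and all parameters are hypothetical, and this is not the SeizFt authors' code.

```python
import numpy as np


def ft_surrogate(signal, rng):
    """Return a phase-randomized copy of a 1-D real signal.

    The amplitude spectrum is preserved; only the phases change.
    """
    n = signal.shape[-1]
    spectrum = np.fft.rfft(signal)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    phases[0] = 0.0            # leave the DC component untouched
    if n % 2 == 0:
        phases[-1] = 0.0       # leave the Nyquist component untouched
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)


# Synthetic stand-in for one EEG epoch: a 5 Hz sine plus noise.
rng = np.random.default_rng(0)
t = np.arange(256) / 256.0
x = np.sin(2.0 * np.pi * 5.0 * t) + 0.5 * rng.standard_normal(t.size)

surrogate = ft_surrogate(x, rng)

# Same amplitude spectrum, different waveform.
same_spectrum = np.allclose(np.abs(np.fft.rfft(surrogate)),
                            np.abs(np.fft.rfft(x)))
print("amplitude spectrum preserved:", same_spectrum)
```

Because each call draws new random phases, repeated calls on the same epoch yield distinct surrogates, which is how such a scheme can enlarge an under-represented class.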

     