Title: Learning Electronic Health Records through Hyperbolic Embedding of Medical Ontologies
Unplanned intensive care unit (ICU) readmissions and in-hospital mortality are two important metrics for evaluating the quality of hospital care. Identifying patients at higher risk of ICU readmission or mortality can not only protect those patients from potential harm but also reduce the high costs of healthcare. In this work, we propose a new method that incorporates information from patients' Electronic Health Records (EHRs) and uses hyperbolic embeddings of a medical ontology (i.e., ICD-9) in the prediction model. The results demonstrate the effectiveness of our method and show that hyperbolic embeddings of ontological concepts give promising performance.
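To make the idea of a hyperbolic embedding concrete, the sketch below computes the distance between two points in the Poincaré ball, one common model of hyperbolic space. The abstract does not say which hyperbolic model the authors use, so the choice of model, the function name `poincare_distance`, and the coordinates are all assumptions made for illustration.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Hypothetical 2-D coordinates, not learned embeddings of real ICD-9 codes.
root = (0.0, 0.0)    # stands in for a general concept near the origin
leaf_a = (0.0, 0.9)  # stands in for a specific concept near the boundary
leaf_b = (0.9, 0.0)  # another specific concept in a different branch

print(poincare_distance(root, leaf_a))
print(poincare_distance(leaf_a, leaf_b))
```

In this model, points near the boundary are far apart from one another even when their Euclidean distance is modest, which is why tree-like hierarchies such as a diagnosis ontology can fit in few dimensions.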
Award ID(s):
1747798
NSF-PAR ID:
10131156
Journal Name:
Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
Page Range / eLocation ID:
338 to 346
Sponsoring Org:
National Science Foundation
More Like this
  1. BACKGROUND:

    Classification of perioperative risk is important for patient care, resource allocation, and guiding shared decision-making. Using discriminative features from the electronic health record (EHR), machine-learning algorithms can create digital phenotypes among heterogeneous populations, representing distinct patient subpopulations grouped by shared characteristics, from which we can personalize care, anticipate clinical care trajectories, and explore therapies. We hypothesized that digital phenotypes in preoperative settings are associated with postoperative adverse events including in-hospital and 30-day mortality, 30-day surgical redo, intensive care unit (ICU) admission, and hospital length of stay (LOS).

    METHODS:

    We identified all laminectomies, colectomies, and thoracic surgeries performed over a 9-year period in a large hospital system. Seventy-seven readily extractable preoperative features were first selected by clinical consensus, including demographics, medical history, and lab results. Three surgery-specific datasets were built and split chronologically into derivation and validation cohorts. Consensus k-means clustering was performed independently on each derivation cohort, from which the phenotypes' characteristics were explored. Cluster assignments were used to train a random forest model to assign patient phenotypes in the validation cohorts. We repeated the descriptive analyses on the validation cohorts to confirm the similarity of patient characteristics with the derivation cohorts, and quantified the association of each phenotype with postoperative adverse events using the area under the receiver operating characteristic curve (AUROC). We compared our approach to the American Society of Anesthesiologists (ASA) score alone and investigated a combination of our phenotypes with the ASA score.

    RESULTS:

    A total of 7251 patients met inclusion criteria, of which 2770 were held out in a validation dataset based on chronological occurrence. Using segmentation metrics and clinical consensus, 3 distinct phenotypes were created for each surgery. The main features used for segmentation included urgency of the procedure, preoperative LOS, age, and comorbidities. The most relevant characteristics varied for each of the 3 surgeries. Low-risk phenotype alpha was the most common (2039 of 2770, 74%), while high-risk phenotype gamma was the rarest (302 of 2770, 11%). Adverse outcomes progressively increased from phenotypes alpha to gamma, including 30-day mortality (0.3%, 2.1%, and 6.0%, respectively), in-hospital mortality (0.2%, 2.3%, and 7.3%), and prolonged hospital LOS (3.4%, 22.1%, and 25.8%). When combined with the ASA score, digital phenotypes achieved higher AUROC than the ASA score alone (hospital mortality: 0.91 vs 0.84; prolonged hospitalization: 0.80 vs 0.71).

    CONCLUSIONS:

    For 3 frequently performed surgeries, we identified 3 digital phenotypes. The typical profiles of each phenotype were described and could be used to anticipate adverse postoperative events.

     
  2. Abstract

    Traditional methods for assessing illness severity and predicting in-hospital mortality among critically ill patients require time-consuming, error-prone calculations using static variable thresholds. These methods do not capitalize on the emerging availability of streaming electronic health record data or capture time-sensitive individual physiological patterns, a critical task in the intensive care unit. We propose a novel acuity score framework (DeepSOFA) that leverages temporal measurements and interpretable deep learning models to assess illness severity at any point during an ICU stay. We compare DeepSOFA with SOFA (Sequential Organ Failure Assessment) baseline models using the same model inputs and find that at any point during an ICU admission, DeepSOFA yields significantly more accurate predictions of in-hospital mortality. A DeepSOFA model developed in a public database and validated in a single institutional cohort had a mean AUC for the entire ICU stay of 0.90 (95% CI 0.90–0.91) compared with baseline SOFA models with mean AUC 0.79 (95% CI 0.79–0.80) and 0.85 (95% CI 0.85–0.86). Deep models are well-suited to identify ICU patients in need of life-saving interventions prior to the occurrence of an unexpected adverse event and inform shared decision-making processes among patients, providers, and families regarding goals of care and optimal resource utilization.

     
  3. Abstract Background: Sepsis is a heterogeneous syndrome, and the identification of clinical subphenotypes is essential. Although organ dysfunction is a defining element of sepsis, subphenotypes of differential trajectory are not well studied. We sought to identify distinct Sequential Organ Failure Assessment (SOFA) score trajectory-based subphenotypes in sepsis. Methods: We created 72-h SOFA score trajectories in patients with sepsis from four diverse intensive care unit (ICU) cohorts. We then used dynamic time warping (DTW) to compute heterogeneous SOFA trajectory similarities and hierarchical agglomerative clustering (HAC) to identify trajectory-based subphenotypes. Patient characteristics were compared between subphenotypes, and a random forest model was developed to predict subphenotype membership at 6 and 24 h after ICU admission. The model was tested on three validation cohorts. Sensitivity analyses were performed with alternative clustering methodologies. Results: A total of 4678, 3665, 12,282, and 4804 unique sepsis patients were included in the development cohort and three validation cohorts, respectively. Four subphenotypes were identified in the development cohort: Rapidly Worsening (n = 612, 13.1%), Delayed Worsening (n = 960, 20.5%), Rapidly Improving (n = 1932, 41.3%), and Delayed Improving (n = 1174, 25.1%). Baseline characteristics, including the pattern of organ dysfunction, varied between subphenotypes. Rapidly Worsening was defined by a higher comorbidity burden, acidosis, and visceral organ dysfunction. Rapidly Improving was defined by vasopressor use without acidosis. Outcomes differed across the subphenotypes: Rapidly Worsening had the highest in-hospital mortality (28.3%, P-value < 0.001), despite a lower SOFA score (mean: 4.5) at ICU admission compared to Rapidly Improving (mortality: 5.5%, mean SOFA: 5.5). An overall prediction accuracy of 0.78 (95% CI, [0.77, 0.80]) was obtained at 6 h after ICU admission, which increased to 0.87 (95% CI, [0.86, 0.88]) at 24 h. Similar subphenotypes were replicated in three validation cohorts. The majority of patients with sepsis have an improving phenotype with a lower mortality risk; however, they make up over 20% of all deaths due to their larger numbers. Conclusions: Four novel, clinically-defined, trajectory-based sepsis subphenotypes were identified and validated. Identifying trajectory-based subphenotypes has immediate implications for the powering and predictive enrichment of clinical trials. Understanding the pathophysiology of these differential trajectories may reveal unanticipated therapeutic targets and identify more precise populations and endpoints for clinical trials.
  4. Intensive care occupancy is an important indicator of health care stress that has been used to guide policy decisions during the COVID‐19 pandemic. Toward reliable decision‐making as a pandemic progresses, estimating the rates at which patients are admitted to and discharged from hospitals and intensive care units (ICUs) is crucial. Since individual‐level hospital data are rarely available to modelers in each geographic locality of interest, it is important to develop tools for inferring these rates from publicly available daily numbers of hospital and ICU beds occupied. We develop such an estimation approach based on an immigration‐death process that models fluctuations of ICU occupancy. Our flexible framework allows for immigration and death rates to depend on covariates, such as hospital bed occupancy and daily SARS‐CoV‐2 test positivity rate, which may drive changes in hospital ICU operations. We demonstrate via simulation studies that the proposed method performs well on noisy time series data and apply our statistical framework to hospitalization data from the University of California, Irvine (UCI) Health and Orange County, California. By introducing a likelihood‐based framework where immigration and death rates can vary with covariates, we find, through rigorous model selection, that hospitalization and positivity rates are crucial covariates for modeling ICU stay dynamics and validate our per‐patient ICU stay estimates using anonymized patient‐level UCI hospital data.

     
  5. Abstract Objective: The aim of this study was to investigate the performance of key hospital units associated with emergency care of both routine emergency and pandemic (COVID-19) patients under capacity-enhancing strategies. Methods: This investigation was conducted using whole-hospital, resource-constrained, patient-based, stochastic, discrete-event simulation models of a generic 200-bed urban U.S. tertiary hospital serving routine emergency and COVID-19 patients. Systematically designed numerical experiments were conducted to provide generalizable insights into how hospital functionality may be affected by the care of COVID-19 pandemic patients along specially designated care paths, under changing pandemic situations, from getting ready to turning all of its resources to pandemic care. Results: Several insights are presented. For example, each day of reduction in average ICU length of stay increases ICU patient throughput by up to 24% for high COVID-19 daily patient arrival levels. The potential of 5 specific interventions and 2 critical shifts in care strategies to significantly increase hospital capacity is also described. Conclusions: These estimates enable hospitals to repurpose space, modify operations, implement crisis standards of care, collaborate with other health care facilities, or request external support, thereby increasing the likelihood that arriving patients will find an open staffed bed when one is needed.
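Item 3 in the list above compares SOFA score trajectories with dynamic time warping (DTW). As a rough illustration of what that comparison involves, the sketch below implements textbook DTW on two numeric sequences. The absolute difference as local cost, the absence of a warping window, the function name `dtw_distance`, and the example trajectories are all assumptions; the abstract does not specify these details.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical SOFA trajectories, made up for illustration.
worsening = [4, 5, 7, 9]
worsening_late = [4, 4, 5, 7, 9]  # same shape, one time step later
improving = [9, 7, 5, 4]

print(dtw_distance(worsening, worsening_late))
print(dtw_distance(worsening, improving))
```

Because DTW lets one sequence pause while the other advances, the two worsening trajectories align at no cost despite their different lengths. A matrix of such pairwise costs is the kind of input a hierarchical agglomerative clustering step would take.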