Title: Data heterogeneity in federated learning with Electronic Health Records: Case studies of risk prediction for acute kidney injury and sepsis diseases in critical care
With the wider availability of healthcare data such as Electronic Health Records (EHR), a growing number of data-driven approaches have been proposed to improve the quality of care delivery. Predictive modeling, which aims at building computational models for predicting clinical risk, is a popular research topic in healthcare analytics. However, concerns about the privacy of healthcare data may hinder the development of effective, generalizable predictive models, because such models often require rich, diverse data from multiple clinical institutions. Recently, federated learning (FL) has demonstrated promise in addressing this concern. However, data heterogeneity across local participating sites may affect the prediction performance of federated models. Because acute kidney injury (AKI) and sepsis are highly prevalent among patients admitted to intensive care units (ICU), AI-based early prediction of these conditions is an important topic in critical care medicine. In this study, we take AKI and sepsis onset risk prediction in the ICU as two examples to explore the impact of data heterogeneity in the FL framework and to compare performance across frameworks. We built predictive models based on local, pooled, and FL frameworks using EHR data across multiple hospitals. The local framework used only data from each site itself. The pooled framework combined data from all sites. In the FL framework, each local site did not have access to other sites’ data. A model was updated locally, and its parameters were shared with a central aggregator, which updated the federated model’s parameters and then shared them back with each site. We found that models built within an FL framework outperformed their local counterparts. Then, we analyzed variable importance discrepancies across sites and frameworks. Finally, we explored potential sources of heterogeneity within the EHR data. The different distributions of demographic profiles, medication use, and site information contributed to data heterogeneity.
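The local-update and central-aggregation loop described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a logistic regression model, full-batch gradient descent at each site, and sample-size-weighted parameter averaging (the common FedAvg scheme), none of which the abstract specifies. The names `local_update`, `aggregate`, and `federated_round` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
# One site's data: (feature vector, binary label) pairs. Toy stand-in for EHR features.
Dataset = Sequence[Tuple[Sequence[float], int]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights: Vector, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Vector:
    """Assumed local step: a few epochs of full-batch gradient descent
    on the logistic loss, using only this site's data."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def aggregate(site_weights: Sequence[Vector], site_sizes: Sequence[int]) -> Vector:
    """Assumed aggregation rule: average of site parameters,
    weighted by each site's sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]


def federated_round(global_weights: Vector, sites: Sequence[Dataset]) -> Vector:
    """One round: every site trains locally from the shared parameters,
    then only the parameters (never the records) reach the aggregator."""
    updates = [local_update(global_weights, data) for data in sites]
    return aggregate(updates, [len(data) for data in sites])
```

Repeating `federated_round` with the returned parameters as the next starting point gives the iterative scheme. Data heterogeneity enters through the site updates: when feature or label distributions differ across sites, the locally updated parameters pull in different directions before they are averaged.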
Award ID(s):
1750326 2212175
PAR ID:
10438310
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Frasch, Martin G.
Date Published:
Journal Name:
PLOS Digital Health
Volume:
2
Issue:
3
ISSN:
2767-3170
Page Range / eLocation ID:
e0000117
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Accurate prediction and monitoring of patient health in the intensive care unit can inform shared decisions regarding appropriateness of care delivery, risk-reduction strategies, and intensive care resource use. Traditionally, algorithmic solutions for patient outcome prediction rely solely on data available from electronic health records (EHR). In this pilot study, we explore the benefits of augmenting existing EHR data with novel measurements from wrist-worn activity sensors as part of a clinical environment known as the Intelligent ICU. We implemented temporal deep learning models based on two distinct sources of patient data: (1) routinely measured vital signs from electronic health records, and (2) activity data collected from wearable sensors. As a proxy for illness severity, our models predicted whether patients leaving the intensive care unit would be successfully or unsuccessfully discharged from the hospital. We overcome the challenge of small sample size in our prospective cohort by applying deep transfer learning using EHR data from a much larger cohort of traditional ICU patients. Our experiments quantify added utility of non-traditional measurements for predicting patient health, especially when applying a transfer learning procedure to small novel Intelligent ICU cohorts of critically ill patients. 
  2. The predictive Intensive Care Unit (ICU) scoring system plays an important role in ICU management for its capability of predicting important outcomes, especially mortality. There are many scoring systems that have been developed and used in the ICU. These scoring systems are primarily based on the structured clinical data contained in the electronic health record (EHR), which may suffer the loss of the important clinical information contained in the narratives and images. In this work, we build a deep learning based survival prediction model with multimodality data to predict ICU-mortality. Four sets of features are investigated: (1) physiological measurements of Simplified Acute Physiology Score (SAPS) II, (2) common thorax diseases predefined by radiologists, (3) BERT-based text representations, and (4) chest X-ray image features. We use the Medical Information Mart for Intensive Care IV (MIMIC-IV) dataset to evaluate the proposed model. Our model achieves the average C-index of 0.7847 (95% confidence interval, 0.7625–0.8068), which substantially exceeds that of the baseline with SAPS-II features (0.7477 (0.7238–0.7716)). Ablation studies further demonstrate the contributions of pre-defined labels (2.12%), text features (2.68%), and image features (2.96%). Our model achieves a higher average C-index than the traditional machine learning methods under the same feature fusion setting, which suggests that the deep learning methods can outperform the traditional machine learning methods in ICU-mortality prediction. These results highlight the potential of deep learning models with multimodal information to enhance ICU-mortality prediction. We make our work publicly available at https://github.com/bionlplab/mimic-icu-mortality. 
  3. Background: Sepsis is a heterogeneous syndrome, and the identification of clinical subphenotypes is essential. Although organ dysfunction is a defining element of sepsis, subphenotypes of differential trajectory are not well studied. We sought to identify distinct Sequential Organ Failure Assessment (SOFA) score trajectory-based subphenotypes in sepsis. Methods: We created 72-h SOFA score trajectories in patients with sepsis from four diverse intensive care unit (ICU) cohorts. We then used dynamic time warping (DTW) to compute heterogeneous SOFA trajectory similarities and hierarchical agglomerative clustering (HAC) to identify trajectory-based subphenotypes. Patient characteristics were compared between subphenotypes, and a random forest model was developed to predict subphenotype membership at 6 and 24 h after ICU admission. The model was tested on three validation cohorts. Sensitivity analyses were performed with alternative clustering methodologies. Results: A total of 4678, 3665, 12,282, and 4804 unique sepsis patients were included in the development cohort and three validation cohorts, respectively. Four subphenotypes were identified in the development cohort: Rapidly Worsening (n = 612, 13.1%), Delayed Worsening (n = 960, 20.5%), Rapidly Improving (n = 1932, 41.3%), and Delayed Improving (n = 1174, 25.1%). Baseline characteristics, including the pattern of organ dysfunction, varied between subphenotypes. Rapidly Worsening was defined by a higher comorbidity burden, acidosis, and visceral organ dysfunction. Rapidly Improving was defined by vasopressor use without acidosis. Outcomes differed across the subphenotypes; Rapidly Worsening had the highest in-hospital mortality (28.3%, P-value < 0.001), despite a lower SOFA (mean: 4.5) at ICU admission compared to Rapidly Improving (mortality: 5.5%, mean SOFA: 5.5). An overall prediction accuracy of 0.78 (95% CI, [0.77, 0.8]) was obtained at 6 h after ICU admission, which increased to 0.87 (95% CI, [0.86, 0.88]) at 24 h. Similar subphenotypes were replicated in three validation cohorts. The majority of patients with sepsis have an improving phenotype with a lower mortality risk; however, they make up over 20% of all deaths due to their larger numbers. Conclusions: Four novel, clinically-defined, trajectory-based sepsis subphenotypes were identified and validated. Identifying trajectory-based subphenotypes has immediate implications for the powering and predictive enrichment of clinical trials. Understanding the pathophysiology of these differential trajectories may reveal unanticipated therapeutic targets and identify more precise populations and endpoints for clinical trials.
  4. Understanding clinical trajectories of sepsis patients is crucial for prognostication, for resource planning, and for informing digital twin models of critical illness. This study aims to identify common clinical trajectories based on dynamic assessment of cardiorespiratory support, using validated electronic health record data covering a retrospective cohort of 19,177 patients with sepsis admitted to intensive care units (ICUs) of Mayo Clinic Hospitals over an 8-year period. Patient trajectories were modeled from ICU admission up to 14 days using an unsupervised machine learning two-stage clustering method based on cardiorespiratory support in the ICU and hospital discharge status. Of 19,177 patients, 42% were female, with a median age of 65 (interquartile range [IQR], 55–76) years, an Acute Physiology, Age, and Chronic Health Evaluation III score of 70 (IQR, 56–87), a hospital length of stay (LOS) of 7 (IQR, 4–12) days, and an ICU LOS of 2 (IQR, 1–4) days. Four distinct trajectories were identified: fast recovery (27% with a mortality rate of 3.5% and median hospital LOS of 3 (IQR, 2–15) days), slow recovery (62% with a mortality rate of 3.6% and hospital LOS of 8 (IQR, 6–13) days), fast decline (4% with a mortality rate of 99.7% and hospital LOS of 1 (IQR, 0–1) day), and delayed decline (7% with a mortality rate of 97.9% and hospital LOS of 5 (IQR, 3–8) days). The distinct trajectories remained robust and were distinguished by the Charlson Comorbidity Index, Acute Physiology, Age, and Chronic Health Evaluation III scores, and day 1 and day 3 SOFA (P < 0.001, ANOVA). These findings provide a foundation for developing prediction models and digital twin decision support tools, improving both shared decision making and resource planning.
  5. Recent advances in deep learning have produced many success stories in smart healthcare applications, offering data-driven insight into improving clinical institutions’ quality of care. Deep learning models depend heavily on data: the more data a model is trained on, the more robust and generalizable its performance. However, pooling medical data into centralized storage to train a robust deep learning model faces privacy, ownership, and strict regulatory challenges. Federated learning addresses these challenges by training a shared global deep learning model through a central aggregator server, while patient data remain with the local party, maintaining data anonymity and security. In this study, first, we provide a comprehensive, up-to-date review of research employing federated learning in healthcare applications. Second, we evaluate a set of recent challenges from a data-centric perspective in federated learning, such as data partitioning characteristics, data distributions, data protection mechanisms, and benchmark datasets. Finally, we point out several potential challenges and future research directions in healthcare applications.