Title: Multi-Subset Approach to Early Sepsis Prediction
Sepsis is a life-threatening organ malfunction caused by the host's inability to fight infection, and it can lead to death without proper and immediate treatment. Early diagnosis and treatment are therefore vital in critically ill populations at high risk for sepsis and sepsis-associated mortality. Studies show that advancing sepsis detection by 6 hours leads to earlier administration of antibiotics, which is associated with reduced mortality. However, clinical scores such as the Sequential Organ Failure Assessment (SOFA) are not applicable to early prediction, whereas machine learning algorithms can help capture the progressing pattern needed for it. We therefore aim to develop a machine learning algorithm that predicts sepsis onset 6 hours before it is suspected clinically. Although some machine learning algorithms have been applied to sepsis prediction, many of them did not consider that six hours is not a small gap. To overcome this large-gap challenge, we explore a multi-subset approach in which the likelihood of sepsis occurring earlier than 6 hours is output from a previous subset and fed to the target subset as additional features. Moreover, we use hourly sampled data, such as vital signs, within an observation window to derive a temporal change trend as a further aid, something previous studies have often ignored. Our empirical study shows that both the multi-subset approach to alleviating the 6-hour gap and the added temporal trend features can help improve the performance of sepsis-related early prediction.
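The two ideas in the abstract, trend features derived from an hourly observation window and a likelihood passed from one subset's model to the next, can be illustrated with a minimal sketch. The abstract does not say which trend statistics or which classifier the authors use, so everything here is an assumption for illustration: the least-squares slope and first-to-last delta as trend features, the function names, the feature name p_earlier_subset, and the stub standing in for a model trained on the earlier subset.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of hourly readings against hour index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def trend_features(window):
    """window maps a vital-sign name to its hourly readings in the window."""
    feats = {}
    for name, values in window.items():
        feats[f"{name}_last"] = values[-1]               # most recent reading
        feats[f"{name}_delta"] = values[-1] - values[0]  # net change over window
        feats[f"{name}_slope"] = slope(values)           # average change per hour
    return feats


def target_features(window, earlier_model):
    """Trend features plus the earlier-subset model's sepsis likelihood.

    earlier_model is any callable returning a probability; it stands in
    for a classifier trained on the previous subset (hypothetical here).
    """
    feats = trend_features(window)
    feats["p_earlier_subset"] = earlier_model(feats)
    return feats


# Hypothetical 4-hour window and a stub in place of a trained model.
window = {"heart_rate": [80, 84, 88, 92], "temp_c": [37.0, 37.0, 37.0, 37.0]}
feats = target_features(window, lambda f: 0.25)
```

The resulting dictionary would then be the input row for the target-subset classifier, so that model sees both the trend of each vital sign and the earlier model's estimate.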
Award ID(s):
2104270
PAR ID:
10545014
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-2759-5
Page Range / eLocation ID:
1335 to 1341
Format(s):
Medium: X
Location:
Las Vegas, NV, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Keim-Malpass, Jessica (Ed.)
    During the early stages of hospital admission, clinicians use limited information to make decisions as patient acuity evolves. We hypothesized that clustering analysis of vital signs measured within six hours of hospital admission would reveal distinct patient phenotypes with unique pathophysiological signatures and clinical outcomes. We created a longitudinal electronic health record dataset for 75,762 adult patient admissions to a tertiary care center in 2014–2016 lasting six hours or longer. Physiotypes were derived via unsupervised machine learning in a training cohort of 41,502 patients by applying consensus k-means clustering to six vital signs measured within six hours of admission. Reproducibility and correlation with clinical biomarkers and outcomes were assessed in a validation cohort of 17,415 patients and a testing cohort of 16,845 patients. Training, validation, and testing cohorts had similar age (54–55 years) and sex (55% female) distributions. There were four distinct clusters. Physiotype A had physiologic signals consistent with early vasoplegia, hypothermia, and low-grade inflammation, and favorable short- and long-term clinical outcomes despite early, severe illness. Physiotype B exhibited early tachycardia, tachypnea, and hypoxemia followed by the highest incidence of prolonged respiratory insufficiency, sepsis, acute kidney injury, and short- and long-term mortality. Physiotype C had minimal early physiological derangement and favorable clinical outcomes. Physiotype D had the greatest prevalence of chronic cardiovascular and kidney disease, presented with severely elevated blood pressure, and had good short-term outcomes but suffered increased 3-year mortality. Comparing sequential organ failure assessment (SOFA) scores across physiotypes demonstrated that clustering did not simply recapitulate previously established acuity assessments. In a heterogeneous cohort of hospitalized patients, unsupervised machine learning techniques applied to routine, early vital sign data identified physiotypes with unique disease categories and distinct clinical outcomes. This approach has the potential to augment understanding of pathophysiology by distilling thousands of disease states into a few physiological signatures.
  2. Appropriate therapeutic management at the early stages of sepsis has been shown to be crucial for preventing further deterioration and irreversible organ damage. Although previous studies considered cellular and physiological responses as components of sepsis-related predictive models, temporal connections among the responses have not been widely studied. The objective of this study is to investigate simultaneous changes in cellular and physiological responses, represented by 16 clinical variables contributing to seven organ system dysfunctions in patients with sepsis, to predict in-hospital mortality. Organ dysfunctions were represented by undirected weighted network models composed of: i) nodes (i.e., 16 clinical variables and three biomarkers including procalcitonin, C-reactive protein, and sedimentation rate), ii) edges (i.e., connections between pairs of nodes representing simultaneous dysfunctions), and iii) weights representing the persistence of the co-occurrence of two dysfunctions. Data were collected from 13,367 adult patients (corresponding to 17,953 visits) admitted to the study hospital from July 1, 2013, to December 31, 2015. The study population was categorized based on clinical criteria representing sepsis progression to identify different subpopulations. The findings quantify the optimal window for defining the simultaneity of two dysfunctions, the network properties corresponding to different subpopulations, the discriminatory patterns of simultaneous dysfunctions among subpopulations, and in-hospital mortality prediction. The results show that the level of persistence of simultaneous dysfunctions is subpopulation-specific. Insights from this study regarding optimal thresholds of the persistence and combination of simultaneous organ dysfunctions can inform policies to personalize in-hospital mortality prediction.
  3. Sepsis, a dysregulated immune-mediated host response to infection, is lethal, prevalent, and costly. Its early detection has the potential to drastically reduce morbidity and mortality. We have developed a real-time cloud-based application that predicts the onset time of sepsis based on live ICU data and provides clinicians with actionable visual alerts. Clinicians and nurses can examine these alerts and initiate appropriate interventions. The prediction engine (DeepAISE) is a Deep Learning-based algorithm trained to reliably predict sepsis 4-6 hours in advance of clinical recognition. A scalable, cloud-based system continuously streams bedside data, uses the prediction engine to generate hourly scores, and displays these to clinicians. Interoperability is achieved through the use of FHIR resources and APIs. This system is monitoring ~100 patients on a daily basis at the Emory Tele-ICU center, and has been shown to reliably predict onset of sepsis with an AUC of 0.9.
  4. Background: Sepsis is a heterogeneous syndrome, and the identification of clinical subphenotypes is essential. Although organ dysfunction is a defining element of sepsis, subphenotypes of differential trajectory are not well studied. We sought to identify distinct Sequential Organ Failure Assessment (SOFA) score trajectory-based subphenotypes in sepsis. Methods: We created 72-h SOFA score trajectories in patients with sepsis from four diverse intensive care unit (ICU) cohorts. We then used dynamic time warping (DTW) to compute heterogeneous SOFA trajectory similarities and hierarchical agglomerative clustering (HAC) to identify trajectory-based subphenotypes. Patient characteristics were compared between subphenotypes, and a random forest model was developed to predict subphenotype membership at 6 and 24 h after admission to the ICU. The model was tested on three validation cohorts. Sensitivity analyses were performed with alternative clustering methodologies. Results: A total of 4678, 3665, 12,282, and 4804 unique sepsis patients were included in the development cohort and three validation cohorts, respectively. Four subphenotypes were identified in the development cohort: Rapidly Worsening (n = 612, 13.1%), Delayed Worsening (n = 960, 20.5%), Rapidly Improving (n = 1932, 41.3%), and Delayed Improving (n = 1174, 25.1%). Baseline characteristics, including the pattern of organ dysfunction, varied between subphenotypes. Rapidly Worsening was defined by a higher comorbidity burden, acidosis, and visceral organ dysfunction. Rapidly Improving was defined by vasopressor use without acidosis. Outcomes differed across the subphenotypes: Rapidly Worsening had the highest in-hospital mortality (28.3%, P-value < 0.001), despite a lower SOFA (mean: 4.5) at ICU admission compared to Rapidly Improving (mortality: 5.5%, mean SOFA: 5.5). An overall prediction accuracy of 0.78 (95% CI, [0.77, 0.8]) was obtained at 6 h after ICU admission, which increased to 0.87 (95% CI, [0.86, 0.88]) at 24 h. Similar subphenotypes were replicated in three validation cohorts. The majority of patients with sepsis have an improving phenotype with a lower mortality risk; however, because of their larger numbers, they account for over 20% of all deaths. Conclusions: Four novel, clinically-defined, trajectory-based sepsis subphenotypes were identified and validated. Identifying trajectory-based subphenotypes has immediate implications for the powering and predictive enrichment of clinical trials. Understanding the pathophysiology of these differential trajectories may reveal unanticipated therapeutic targets and identify more precise populations and endpoints for clinical trials.
  5. In order to manage the public health crisis associated with COVID-19, it is critically important that healthcare workers can quickly identify high-risk patients in order to provide effective treatment with limited resources. Statistical learning tools have the potential to help predict serious infection early on in the progression of the disease. However, many of these techniques are unable to take full advantage of temporal data on a per-patient basis, as they handle the problem as a single-instance classification. Furthermore, these algorithms rely on complete data to make their predictions. In this work, we present a novel approach that handles the temporal and missing data problems simultaneously; our proposed Simultaneous Imputation-Multi Instance Support Vector Machine method illustrates how multiple instance learning techniques and low-rank data imputation can be utilized to accurately predict clinical outcomes of COVID-19 patients. We compare our approach against recent methods used to predict outcomes on a public dataset with a cohort of 361 COVID-19 positive patients. In addition to improved prediction performance early on in the progression of the disease, our method identifies a collection of biomarkers associated with the liver, immune system, and blood that deserve additional study and may provide additional insight into causes of patient mortality due to COVID-19. We publish the source code for our method online.
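Item 4 in the list above compares SOFA score trajectories with dynamic time warping (DTW) before clustering them. As a rough illustration of what that comparison computes, the following is a textbook DTW distance with an absolute-difference step cost. It is a sketch under assumptions, not the cited study's implementation: the function name, the cost function, the absence of any warping-window constraint, and the example trajectories are all hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning the first i
    points of a with the first j points of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch b, stretch a, or advance both.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical SOFA trajectories: same shape, one delayed by a time step.
early = [4, 5, 6]
late = [4, 4, 5, 6]
d_shifted = dtw_distance(early, late)
d_offset = dtw_distance([1, 2, 3], [2, 3, 4])
```

Pairwise distances of this kind form the matrix that a hierarchical agglomerative clustering routine would take as input. A trajectory that follows the same course with a delay scores as close to the original, which is the property that makes DTW suited to comparing trajectories of differing pace.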