
Title: Development and External Validation of a Machine Learning Tool to Rule Out COVID-19 Among Adults in the Emergency Department Using Routine Blood Tests: A Large, Multicenter, Real-World Study
Background Conventional diagnosis of COVID-19 with reverse transcription polymerase chain reaction (RT-PCR) testing (hereafter, PCR) is associated with prolonged time to diagnosis and significant costs to run the test. The SARS-CoV-2 virus might lead to characteristic patterns in the results of widely available, routine blood tests that could be identified with machine learning methodologies. Machine learning modalities integrating findings from these common laboratory test results might accelerate ruling out COVID-19 in emergency department patients. Objective We sought to develop (ie, train and internally validate with cross-validation techniques) and externally validate a machine learning model to rule out COVID-19 using only routine blood tests among adults in emergency departments. Methods Using clinical data from emergency departments (EDs) from 66 US hospitals before the pandemic (before the end of December 2019) or during the pandemic (March-July 2020), we included patients aged ≥20 years in the study time frame. We excluded those with missing laboratory results. Model training used 2183 PCR-confirmed cases from 43 hospitals during the pandemic; negative controls were 10,000 prepandemic patients from the same hospitals. External validation used 23 hospitals with 1020 PCR-confirmed cases and 171,734 prepandemic negative controls. The main outcome was COVID-19 status predicted using same-day routine laboratory results. Model performance was assessed with area under the receiver operating characteristic (AUROC) curve as well as sensitivity, specificity, and negative predictive value (NPV). Results Of 192,779 patients included in the training, external validation, and sensitivity data sets (median age decile 50 [IQR 30-60] years, 40.5% male [78,249/192,779]), AUROC for training and external validation was 0.91 (95% CI 0.90-0.92). Using a risk score cutoff of 1.0 (out of 100) in the external validation data set, the model achieved sensitivity of 95.9% and specificity of 41.7%; with a cutoff of 2.0, sensitivity was 92.6% and specificity was 59.9%. At the cutoff of 2.0, the NPVs at a prevalence of 1%, 10%, and 20% were 99.9%, 98.6%, and 97%, respectively. Conclusions A machine learning model developed with multicenter clinical data integrating commonly collected ED laboratory data demonstrated high rule-out accuracy for COVID-19 status, and might inform selective use of PCR-based testing.
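The NPVs reported at each prevalence can be reproduced from the stated sensitivity and specificity at the cutoff of 2.0. The sketch below assumes the standard Bayes-rule relationship between NPV, sensitivity, specificity, and prevalence; the abstract does not say how the authors computed these figures, and the function name `npv` is a hypothetical helper introduced only for illustration.

```python
def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value via Bayes' rule (illustrative helper, not from the study)."""
    # Fraction of all patients who are disease-free and test negative.
    true_negatives = specificity * (1.0 - prevalence)
    # Fraction of all patients who have the disease but test negative.
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Sensitivity and specificity reported at the cutoff of 2.0 in external validation.
SENSITIVITY = 0.926
SPECIFICITY = 0.599

for prevalence in (0.01, 0.10, 0.20):
    value = npv(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: NPV = {value:.1%}")
```

Under this assumption the computed values round to the 99.9%, 98.6%, and 97% quoted in the Results, which also shows how quickly the rule-out value of a negative result falls as prevalence rises.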
Journal Name:
Journal of Medical Internet Research
Sponsoring Org:
National Science Foundation
More Like this
  1. Background. New York City (NYC) experienced an initial surge and gradual decline in the number of SARS-CoV-2-confirmed cases in 2020. A change in the pattern of laboratory test results in COVID-19 patients over this time has not been reported or correlated with patient outcome. Methods. We performed a retrospective study of routine laboratory and SARS-CoV-2 RT-PCR test results from 5,785 patients evaluated in a NYC hospital emergency department from March to June employing machine learning analysis. Results. A COVID-19 high-risk laboratory test result profile (COVID19-HRP), consisting of 21 routine blood tests, was identified to characterize the SARS-CoV-2 patients. Approximately half of the SARS-CoV-2 positive patients had the distinct COVID19-HRP that separated them from SARS-CoV-2 negative patients. SARS-CoV-2 patients with the COVID19-HRP had higher SARS-CoV-2 viral loads, determined by cycle threshold values from the RT-PCR, and poorer clinical outcomes compared to other positive patients without the COVID19-HRP. Furthermore, the percentage of SARS-CoV-2 patients with the COVID19-HRP decreased significantly from March/April to May/June. Notably, viral load in the SARS-CoV-2 patients declined, and their laboratory profile became less distinguishable from that of SARS-CoV-2 negative patients in the later phase. Conclusions. Our longitudinal analysis illustrates the temporal change of the laboratory test result profile in SARS-CoV-2 patients and the evolution of COVID-19 in a US epicenter. This analysis could become an important tool in COVID-19 population disease severity tracking and prediction. In addition, this analysis may play an important role in prioritizing high-risk patients, assisting in patient triaging, and optimizing the usage of resources.
  2. Background The novel coronavirus SARS-CoV-2 and its associated disease, COVID-19, have caused worldwide disruption, leading countries to take drastic measures to address the progression of the disease. As SARS-CoV-2 continues to spread, hospitals are struggling to allocate resources to patients who are most at risk. In this context, it has become important to develop models that can accurately predict the severity of infection of hospitalized patients to help guide triage, planning, and resource allocation. Objective The aim of this study was to develop accurate models to predict the mortality of hospitalized patients with COVID-19 using basic demographics and easily obtainable laboratory data. Methods We performed a retrospective study of 375 hospitalized patients with COVID-19 in Wuhan, China. The patients were randomly split into derivation and validation cohorts. Regularized logistic regression and support vector machine classifiers were trained on the derivation cohort, and accuracy metrics (F1 scores) were computed on the validation cohort. Two types of models were developed: the first type used laboratory findings from the entire length of the patient’s hospital stay, and the second type used laboratory findings that were obtained no later than 12 hours after admission. The models were further validated on a multicenter external cohort of 542 patients. Results Of the 375 patients with COVID-19, 174 (46.4%) died of the infection. The study cohort was composed of 224/375 men (59.7%) and 151/375 women (40.3%), with a mean age of 58.83 years (SD 16.46). The models developed using data from throughout the patients’ length of stay demonstrated accuracies as high as 97%, whereas the models with admission laboratory variables possessed accuracies of up to 93%. The latter models predicted patient outcomes an average of 11.5 days in advance. Key variables such as lactate dehydrogenase, high-sensitivity C-reactive protein, and percentage of lymphocytes in the blood were indicated by the models. In line with previous studies, age was also found to be an important variable in predicting mortality. In particular, the mean age of patients who survived COVID-19 infection (50.23 years, SD 15.02) was significantly lower than the mean age of patients who died of the infection (68.75 years, SD 11.83; P<.001). Conclusions Machine learning models can be successfully employed to accurately predict outcomes of patients with COVID-19. Our models achieved high accuracies and could predict outcomes more than one week in advance; this promising result suggests that these models can be highly useful for resource allocation in hospitals.
  3. Abstract

    The strain on healthcare resources brought forth by the recent COVID-19 pandemic has highlighted the need for efficient resource planning and allocation through the prediction of future consumption. Machine learning can predict resource utilization, such as the need for hospitalization, based on past medical data stored in electronic medical records (EMR). We conducted this study on 3194 patients (46% male with mean age 56.7 (±16.8), 56% African American, 7% Hispanic) flagged as COVID-19 positive cases in 12 centers in the Emory Healthcare network from February 2020 to September 2020, to assess whether a COVID-19 positive patient’s need for hospitalization can be predicted at the time of the RT-PCR test using the EMR data prior to the test. Five main modalities of EMR, i.e., demographics, medication, past medical procedures, comorbidities, and laboratory results, were used as features for predictive modeling, both individually and fused together using late, middle, and early fusion. Models were evaluated in terms of precision, recall, and F1-score (with 95% confidence intervals). The early fusion model is the most effective predictor, with an 84% overall F1-score [CI 82.1–86.1]. The predictive performance of the model drops by 6% when using recent clinical data while omitting the long-term medical history. Feature importance analysis indicates that history of cardiovascular disease, emergency room visits in the year prior to testing, and demographic factors are predictive of the disease trajectory. We conclude that fusion modeling using medical history and current treatment data can forecast the need for hospitalization for patients infected with COVID-19 at the time of the RT-PCR test.
  4. Calderaro, Adriana (Ed.)
    The World Health Organization (WHO) declared coronavirus disease-2019 (COVID-19) a global pandemic on 11 March 2020. In Ecuador, the first case of COVID-19 was recorded on 29 February 2020. Despite efforts to control its spread, SARS-CoV-2 overran the Ecuadorian public health system, which became one of the most affected in Latin America on 24 April 2020. The Hospital General del Sur de Quito (HGSQ) had to transition from a general to a specific COVID-19 health center in a short period of time to fulfill the health demand from patients with respiratory afflictions. Here, we summarize the measures implemented in the HGSQ to become a COVID-19 exclusive hospital, including the rearrangement of hospital rooms and a triage strategy based on a severity score calculated through an artificial intelligence (AI)-assisted chest computed tomography (CT). Moreover, we present clinical, epidemiological, and laboratory data from 75 laboratory-tested COVID-19 patients, which represent the first outbreak of Quito city. The majority of patients were male with a median age of 50 years. We found differences in laboratory parameters between intensive care unit (ICU) and non-ICU cases considering C-reactive protein, lactate dehydrogenase, and lymphocytes. Sensitivity and specificity of the AI-assisted chest CT were 21.4% and 66.7%, respectively, when considering a score >70%; regardless, this system became a cornerstone of hospital triage due to the lack of RT-PCR testing and timely results. If health workers act as vectors of SARS-CoV-2 at their domiciles, they can seed outbreaks that might put 1,879,047 people at risk of infection within 15 km around the hospital. Despite our limited sample size, the information presented can be used as a local example that might aid future responses in low and middle-income countries facing respiratory transmitted epidemics.
  5. Abstract Objective: The aim of this study was to investigate the performance of key hospital units associated with emergency care of both routine emergency and pandemic (COVID-19) patients under capacity enhancing strategies. Methods: This investigation was conducted using whole-hospital, resource-constrained, patient-based, stochastic, discrete-event, simulation models of a generic 200-bed urban U.S. tertiary hospital serving routine emergency and COVID-19 patients. Systematically designed numerical experiments were conducted to provide generalizable insights into how hospital functionality may be affected by the care of COVID-19 pandemic patients along specially designated care paths, under changing pandemic situations, from getting ready to turning all of its resources to pandemic care. Results: Several insights are presented. For example, each day of reduction in average ICU length of stay increases intensive care unit patient throughput by up to 24% for high COVID-19 daily patient arrival levels. The potential of 5 specific interventions and 2 critical shifts in care strategies to significantly increase hospital capacity is also described. Conclusions: These estimates enable hospitals to repurpose space, modify operations, implement crisis standards of care, collaborate with other health care facilities, or request external support, thereby increasing the likelihood that arriving patients will find an open staffed bed when one is needed.