Title: Routine Laboratory Blood Tests Predict SARS-CoV-2 Infection Using Machine Learning
Abstract Background Accurate diagnostic strategies to identify SARS-CoV-2 positive individuals rapidly for management of patient care and protection of health care personnel are urgently needed. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available with a turn-around time (TAT) usually within 1-2 hours. Method We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual’s SARS-CoV-2 infection status. Laboratory testing results obtained within 2 days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosting decision tree (GBDT) model from 3,356 SARS-CoV-2 RT-PCR tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital. Results The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), validating the generalization of its use. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within 2 days. Conclusion This model employing routine laboratory test results offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.
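The AUC reported in the abstract can be read as the probability that a randomly chosen RT-PCR-positive patient receives a higher model score than a randomly chosen negative one. The sketch below computes that quantity directly for made-up labels and scores; the function name `auc_from_scores` and the toy data are illustrative assumptions, not the authors' code or data.

```python
def auc_from_scores(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = RT-PCR positive, 0 = RT-PCR negative.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc_from_scores(labels, scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; a rank-based formulation would be the usual choice for a cohort of several thousand.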
Award ID(s):
1716432 1750326
Publication Date:
Journal Name:
Clinical Chemistry
Page Range or eLocation-ID:
1396 to 1404
Sponsoring Org:
National Science Foundation
More Like this
  1. Background. New York City (NYC) experienced an initial surge and gradual decline in the number of SARS-CoV-2-confirmed cases in 2020. A change in the pattern of laboratory test results in COVID-19 patients over this time has not been reported or correlated with patient outcome. Methods. We performed a retrospective study of routine laboratory and SARS-CoV-2 RT-PCR test results from 5,785 patients evaluated in a NYC hospital emergency department from March to June employing machine learning analysis. Results. A COVID-19 high-risk laboratory test result profile (COVID19-HRP), consisting of 21 routine blood tests, was identified to characterize the SARS-CoV-2 patients. Approximately half of the SARS-CoV-2 positive patients had the distinct COVID19-HRP that separated them from SARS-CoV-2 negative patients. SARS-CoV-2 patients with the COVID19-HRP had higher SARS-CoV-2 viral loads, determined by cycle threshold values from the RT-PCR, and poorer clinical outcome compared to other positive patients without the COVID19-HRP. Furthermore, the percentage of SARS-CoV-2 patients with the COVID19-HRP decreased significantly from March/April to May/June. Notably, viral load in the SARS-CoV-2 patients declined, and their laboratory profile became less distinguishable from SARS-CoV-2 negative patients in the later phase. Conclusions. Our longitudinal analysis illustrates the temporal change of laboratory test result profile in SARS-CoV-2 patients and the COVID-19 evolvement in a US epicenter. This analysis could become an important tool in COVID-19 population disease severity tracking and prediction. In addition, this analysis may play an important role in prioritizing high-risk patients, assisting in patient triaging and optimizing the usage of resources.
  2. Background Conventional diagnosis of COVID-19 with reverse transcription polymerase chain reaction (RT-PCR) testing (hereafter, PCR) is associated with prolonged time to diagnosis and significant costs to run the test. The SARS-CoV-2 virus might lead to characteristic patterns in the results of widely available, routine blood tests that could be identified with machine learning methodologies. Machine learning modalities integrating findings from these common laboratory test results might accelerate ruling out COVID-19 in emergency department patients. Objective We sought to develop (ie, train and internally validate with cross-validation techniques) and externally validate a machine learning model to rule out COVID-19 using only routine blood tests among adults in emergency departments. Methods Using clinical data from emergency departments (EDs) from 66 US hospitals before the pandemic (before the end of December 2019) or during the pandemic (March-July 2020), we included patients aged ≥20 years in the study time frame. We excluded those with missing laboratory results. Model training used 2183 PCR-confirmed cases from 43 hospitals during the pandemic; negative controls were 10,000 prepandemic patients from the same hospitals. External validation used 23 hospitals with 1020 PCR-confirmed cases and 171,734 prepandemic negative controls. The main outcome was COVID-19 status predicted using same-day routine laboratory results. Model performance was assessed with area under the receiver operating characteristic (AUROC) curve as well as sensitivity, specificity, and negative predictive value (NPV). Results Of 192,779 patients included in the training, external validation, and sensitivity data sets (median age decile 50 [IQR 30-60] years, 40.5% male [78,249/192,779]), AUROC for training and external validation was 0.91 (95% CI 0.90-0.92). Using a risk score cutoff of 1.0 (out of 100) in the external validation data set, the model achieved sensitivity of 95.9% and specificity of 41.7%; with a cutoff of 2.0, sensitivity was 92.6% and specificity was 59.9%. At the cutoff of 2.0, the NPVs at a prevalence of 1%, 10%, and 20% were 99.9%, 98.6%, and 97%, respectively. Conclusions A machine learning model developed with multicenter clinical data integrating commonly collected ED laboratory data demonstrated high rule-out accuracy for COVID-19 status, and might inform selective use of PCR-based testing.
  3. The COVID-19 pandemic demonstrated the public health benefits of reliable and accessible point-of-care (POC) diagnostic tests for viral infections. Despite the rapid development of gold-standard reverse transcription polymerase chain reaction (RT-PCR) assays for SARS-CoV-2 only weeks into the pandemic, global demand created logistical challenges that delayed access to testing for months and helped fuel the spread of COVID-19. Additionally, the extreme sensitivity of RT-PCR had a costly downside as the tests could not differentiate between patients with active infection and those who were no longer infectious but still shedding viral genomes. To address these issues for the future, we propose a novel membrane-based sensor that only detects intact virions. The sensor combines affinity- and size-based detection and does not require external power to operate or read. Specifically, the presence of intact virions, but not viral debris, fouls the membrane and triggers a macroscopically visible hydraulic switch after injection of a 40 μL sample with a pipette. The device, which we call the μSiM-DX (microfluidic device featuring a silicon membrane for diagnostics), features a biotin-coated microslit membrane with pores ∼2–3× larger than the intact virus. Streptavidin-conjugated antibodies recognizing viral surface proteins are incubated with the sample for ∼1 hour prior to injection into the device, and positive/negative results are obtained within ten seconds of sample injection. Proof-of-principle tests have been performed using preparations of vaccinia virus. After optimizing slit pore sizes and porous membrane area, the fouling-based sensor exhibits 100% specificity and 97% sensitivity for vaccinia virus (n = 62). Moreover, the dynamic range of the sensor extends at least from 10^5.9 virions per mL to 10^10.4 virions per mL, covering the range of mean viral loads in symptomatic COVID-19 patients (10^5.6–10^7 RNA copies per mL). Forthcoming work will test the ability of our sensor to perform similarly in biological fluids and with SARS-CoV-2, to fully test the potential of a membrane fouling-based sensor to serve as a PCR-free alternative for POC containment efforts in the spread of infectious disease.
  4. Abstract Background

    SARS-CoV-2 is an RNA virus responsible for the coronavirus disease 2019 (COVID-19) pandemic. Viruses exist in complex microbial environments, and recent studies have revealed both synergistic and antagonistic effects of specific bacterial taxa on viral prevalence and infectivity. We set out to test whether specific bacterial communities predict SARS-CoV-2 occurrence in a hospital setting.


    We collected 972 samples from hospitalized patients with COVID-19, their health care providers, and hospital surfaces before, during, and after admission. We screened for SARS-CoV-2 using RT-qPCR, characterized microbial communities using 16S rRNA gene amplicon sequencing, and used these bacterial profiles to classify SARS-CoV-2 RNA detection with a random forest model.


    Sixteen percent of surfaces from COVID-19 patient rooms had detectable SARS-CoV-2 RNA, although infectivity was not assessed. The highest prevalence was in floor samples next to patient beds (39%) and directly outside their rooms (29%). Although bed rail samples more closely resembled the patient microbiome compared to floor samples, SARS-CoV-2 RNA was detected less often in bed rail samples (11%). SARS-CoV-2 positive samples had higher bacterial phylogenetic diversity in both human and surface samples and higher biomass in floor samples. 16S microbial community profiles enabled high classifier accuracy for SARS-CoV-2 status in not only nares, but also forehead, stool, and floor samples. Across these distinct microbial profiles, a single amplicon sequence variant from the genus Rothia strongly predicted SARS-CoV-2 presence across sample types, with greater prevalence in positive surface and human samples, even when compared to samples from patients in other intensive care units prior to the COVID-19 pandemic.


    These results contextualize the vast diversity of microbial niches where SARS-CoV-2 RNA is detected and identify specific bacterial taxa that associate with the viral RNA prevalence both in the host and hospital environment.

  5. Calderaro, Adriana (Ed.)
    The World Health Organization (WHO) declared coronavirus disease-2019 (COVID-19) a global pandemic on 11 March 2020. In Ecuador, the first case of COVID-19 was recorded on 29 February 2020. Despite efforts to control its spread, SARS-CoV-2 overran the Ecuadorian public health system, which became one of the most affected in Latin America on 24 April 2020. The Hospital General del Sur de Quito (HGSQ) had to transition from a general to a specific COVID-19 health center in a short period of time to fulfill the health demand from patients with respiratory afflictions. Here, we summarize the implementations applied in the HGSQ to become a COVID-19 exclusive hospital, including the rearrangement of hospital rooms and a triage strategy based on a severity score calculated through an artificial intelligence (AI)-assisted chest computed tomography (CT). Moreover, we present clinical, epidemiological, and laboratory data from 75 laboratory-tested COVID-19 patients, which represent the first outbreak of Quito city. The majority of patients were male with a median age of 50 years. We found differences in laboratory parameters between intensive care unit (ICU) and non-ICU cases considering C-reactive protein, lactate dehydrogenase, and lymphocytes. Sensitivity and specificity of the AI-assisted chest CT were 21.4% and 66.7%, respectively, when considering a score >70%; regardless, this system became a cornerstone of hospital triage due to the lack of RT-PCR testing and timely results. If health workers act as vectors of SARS-CoV-2 at their domiciles, they can seed outbreaks that might put 1,879,047 people at risk of infection within 15 km around the hospital. Despite our limited sample size, the information presented can be used as a local example that might aid future responses in low and middle-income countries facing respiratory transmitted epidemics.
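
The NPV figures quoted in item 2 above follow from the reported sensitivity and specificity at the 2.0 cutoff via Bayes' rule, NPV = spec × (1 − prev) / (spec × (1 − prev) + (1 − sens) × prev). The short sketch below reproduces that arithmetic; the function name `npv` is an illustrative assumption, and only the sensitivity, specificity, and prevalence values are taken from the abstract.

```python
def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from sensitivity, specificity, and prevalence."""
    true_neg = specificity * (1.0 - prevalence)    # P(test negative, no disease)
    false_neg = (1.0 - sensitivity) * prevalence   # P(test negative, disease)
    return true_neg / (true_neg + false_neg)


# Values reported in item 2 for a risk score cutoff of 2.0.
SENS, SPEC = 0.926, 0.599
npv_by_prevalence = {p: npv(SENS, SPEC, p) for p in (0.01, 0.10, 0.20)}

for p, value in npv_by_prevalence.items():
    print(f"prevalence {p:.0%}: NPV {value:.1%}")
# prevalence 1%: NPV 99.9%
# prevalence 10%: NPV 98.6%
# prevalence 20%: NPV 97.0%
```

The calculation shows why the rule-out claim weakens as prevalence rises: the false-negative term grows in proportion to prevalence while the true-negative term shrinks.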