Title: A Machine Learning Study of COVID-19 Serology and Molecular Tests and Predictions
Serology and molecular tests are the two most commonly used methods for rapid COVID-19 infection testing. The two types of tests detect infection through different mechanisms: a molecular test measures the presence of viral SARS-CoV-2 RNA, whereas a serology test detects antibodies triggered by the SARS-CoV-2 virus. A handful of studies have shown that symptoms, combined with demographic and/or diagnosis features, can help predict COVID-19 test outcomes. However, owing to the nature of each test, serology and molecular tests vary significantly. No existing study examines the correlation between serology and molecular tests, or which types of symptoms are the key factors indicating a positive COVID-19 test. In this study, we propose a machine learning based approach to study serology and molecular tests, and use features to predict test outcomes. A total of 2,467 donors, each tested using one or more types of COVID-19 tests, form our testbed. By cross-checking test types and results, we study the correlation between serology and molecular tests. For test outcome prediction, we label the 2,467 donors as positive or negative using their serology or molecular test results, and create symptom features to represent each donor for learning. Because COVID-19 produces a wide range of symptoms and the data collection process is inherently error prone, we group similar symptoms into bins. This reduces both the size and the sparsity of the feature space. Using binned symptoms, combined with demographic features, we train five classification algorithms to predict COVID-19 test results. Experiments show that XGBoost achieves the best performance, with 76.85% accuracy and an AUC of 81.4%, demonstrating that symptoms are indeed helpful for predicting COVID-19 test outcomes.
Our study investigates the relationship between serology and molecular tests, identifies meaningful symptom features associated with COVID-19 infection, and also provides a way for rapid screening and cost effective detection of COVID-19 infection.
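The symptom binning described in the abstract can be illustrated with a minimal sketch. The bin names, the symptom-to-bin mapping, and the two demographic fields (`age`, `is_female`) below are hypothetical placeholders chosen for illustration; the study's actual bins and features are not listed in this record.

```python
# Minimal sketch of symptom binning, assuming a hand-written mapping.
# SYMPTOM_BINS is hypothetical; the study's real bins are not given here.
SYMPTOM_BINS = {
    "fever": "fever_chills",
    "chills": "fever_chills",
    "cough": "respiratory",
    "shortness of breath": "respiratory",
    "loss of smell": "sensory_loss",
    "loss of taste": "sensory_loss",
}

# Fixed, sorted bin order so every donor gets the same feature layout.
BIN_NAMES = sorted(set(SYMPTOM_BINS.values()))


def bin_symptoms(reported):
    """Map a donor's reported symptoms to one 0/1 indicator per bin."""
    hit = set()
    for symptom in reported:
        key = symptom.strip().lower()  # tolerate case and stray whitespace
        if key in SYMPTOM_BINS:
            hit.add(SYMPTOM_BINS[key])
    return [1 if name in hit else 0 for name in BIN_NAMES]


def donor_features(reported, age, is_female):
    """Concatenate binned symptoms with (hypothetical) demographic features."""
    return bin_symptoms(reported) + [age, int(is_female)]


if __name__ == "__main__":
    print(BIN_NAMES)
    print(donor_features(["Fever", "chills ", "Loss of Taste"], 42, True))
```

Six raw symptoms collapse into three bin indicators here, which is the sense in which binning shrinks the feature space and makes each donor's vector less sparse. The resulting vectors could then be passed to any classifier.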
Award ID(s):
2027339 1763452
NSF-PAR ID:
10357379
Journal Name:
Smart health
ISSN:
2352-6483
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Rapid testing is essential to fighting pandemics such as coronavirus disease 2019 (COVID-19), the disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Exhaled human breath contains multiple volatile molecules providing powerful potential for non-invasive diagnosis of diverse medical conditions. We investigated breath detection of SARS-CoV-2 infection using cavity-enhanced direct frequency comb spectroscopy (CE-DFCS), a state-of-the-art laser spectroscopic technique capable of a real-time massive collection of broadband molecular absorption features at ro-vibrational quantum state resolution and at parts-per-trillion volume detection sensitivity. Using a total of 170 individual breath samples (83 positive and 87 negative with SARS-CoV-2 based on reverse transcription polymerase chain reaction tests), we report excellent discrimination capability for SARS-CoV-2 infection with an area under the receiver-operating-characteristics curve of 0.849(4). Our results support the development of CE-DFCS as an alternative, rapid, non-invasive test for COVID-19 and highlight its remarkable potential for optical diagnoses of diverse biological conditions and disease states.
  2. The COVID-19 pandemic demonstrated the public health benefits of reliable and accessible point-of-care (POC) diagnostic tests for viral infections. Despite the rapid development of gold-standard reverse transcription polymerase chain reaction (RT-PCR) assays for SARS-CoV-2 only weeks into the pandemic, global demand created logistical challenges that delayed access to testing for months and helped fuel the spread of COVID-19. Additionally, the extreme sensitivity of RT-PCR had a costly downside, as the tests could not differentiate between patients with active infection and those who were no longer infectious but still shedding viral genomes. To address these issues for the future, we propose a novel membrane-based sensor that only detects intact virions. The sensor combines affinity- and size-based detection and does not require external power to operate or read. Specifically, the presence of intact virions, but not viral debris, fouls the membrane and triggers a macroscopically visible hydraulic switch after injection of a 40 μL sample with a pipette. The device, which we call the μSiM-DX (microfluidic device featuring a silicon membrane for diagnostics), features a biotin-coated microslit membrane with pores ∼2–3× larger than the intact virus. Streptavidin-conjugated antibodies recognizing viral surface proteins are incubated with the sample for ∼1 hour prior to injection into the device, and positive/negative results are obtained within ten seconds of sample injection. Proof-of-principle tests have been performed using preparations of vaccinia virus. After optimizing slit pore sizes and porous membrane area, the fouling-based sensor exhibits 100% specificity and 97% sensitivity for vaccinia virus (n = 62). Moreover, the dynamic range of the sensor extends at least from 10^5.9 virions per mL to 10^10.4 virions per mL, covering the range of mean viral loads in symptomatic COVID-19 patients (10^5.6–10^7 RNA copies per mL).
Forthcoming work will test the ability of our sensor to perform similarly in biological fluids and with SARS-CoV-2, to fully test the potential of a membrane fouling-based sensor to serve as a PCR-free alternative for POC containment efforts in the spread of infectious disease. 
  3. COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is a communicable disease spread through close contact. It is known to disproportionately impact certain communities due to both biological susceptibility and inequitable exposure. In this study, we investigate the most important health, social, and environmental factors impacting the early phases (before July 2020) of per capita COVID-19 transmission and per capita all-cause mortality in US counties. We aggregate county-level physical and mental health, environmental pollution, access to health care, demographic characteristics, vulnerable population scores, and other epidemiological data to create a large feature set to analyze per capita COVID-19 outcomes. Because of the high dimensionality, multicollinearity, and unknown interactions of the data, we use ensemble machine learning and marginal prediction methods to identify the most salient factors associated with several COVID-19 outbreak measures. Our variable importance results show that measures of ethnicity, public transportation, and preventable diseases are the strongest predictors for both per capita COVID-19 incidence and mortality. Specifically, the CDC measures for minority populations, CDC measures for limited English, and proportion of Black- and/or African-American individuals in a county were the most important features for per capita COVID-19 cases within a month after the pandemic started in a county and also at the latest date examined. For per capita all-cause mortality at day 100 and total to date, we find that public transportation use and proportion of Black- and/or African-American individuals in a county are the strongest predictors.
The methods predict that, with all other factors held fixed at their observed values, a 10% increase in public transportation use is associated with an increase in mortality at day 100 of 2012 individuals (95% CI [1972, 2356]); likewise, a 10% increase in the proportion of Black- and/or African-American individuals in a county is associated with an increase in total deaths at the end of the study of 2067 (95% CI [1189, 2654]). Using data until the end of the study, the same metric suggests ethnicity has double the association of the next most important factors, which are location, disease prevalence, and transit factors. Our findings shed light on societal patterns that have been reported and experienced in the U.S. by using robust methods to understand the features most responsible for transmission and the sectors of society most vulnerable to infection and mortality. In particular, our results provide evidence of the disproportionate impact of the COVID-19 pandemic on minority populations. Our results suggest that mitigation measures, including how vaccines are distributed, could have the greatest impact if they are given with priority to the highest risk communities.
  4. Population-scale and rapid testing for SARS-CoV-2 continues to be a priority for several parts of the world. We revisit the in vitro technology platforms for COVID-19 testing and diagnostics—molecular tests and rapid antigen tests, serology or antibody tests, and tests for the management of COVID-19 patients. Within each category of tests, we review the commercialized testing platforms, their analyzing systems, specimen collection protocols, testing methodologies, supply chain logistics, and related attributes. Our discussion is essentially focused on test products that have been granted emergency use authorization by the FDA to detect and diagnose COVID-19 infections. Different strategies for scaled-up and faster screening are covered here, such as pooled testing, screening programs, and surveillance testing. The near-term challenges lie in detecting subtle infectivity profiles, mapping the transmission dynamics of new variants, lowering the cost for testing, training a large healthcare workforce, and providing test kits for the masses. Through this review, we try to understand the feasibility of universal access to COVID-19 testing and diagnostics in the near future while being cognizant of the implicit tradeoffs during the development and distribution cycles of new testing platforms. 
  5. In the context of the continued spread of coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 and the emergence of new variants, the demand for rapid, accurate, and frequent detection is increasing. Moreover, the new predominant strain, the Omicron variant, manifests clinical features more similar to those of other common respiratory infections. The concurrent detection of multiple potential pathogens helps distinguish SARS-CoV-2 infection from other diseases with overlapping symptoms, which is significant for providing tailored treatment to patients and containing the outbreak. Here, we report a lab-on-a-chip biosensing platform for SARS-CoV-2 detection based on the subwavelength grating micro-ring resonator. The sensing surface is functionalized with a specific antibody against the SARS-CoV-2 spike protein, which produces redshifts of the resonant peaks upon antigen–antibody binding, thus achieving quantitative detection. Additionally, the sensor chip is integrated with a microfluidic chip featuring an anti-backflow Y-shaped structure that enables the concurrent detection of two analytes. In this study, we realized the detection and differentiation of COVID-19 and influenza A H1N1. Experimental results indicate that the limit of detection of our device reaches 100 fg/ml (1.31 fM) within a 15 min detection time, and cross-reactivity tests demonstrate the specificity of the optical diagnostic assay. Furthermore, the integrated packaging and streamlined workflow facilitate its use for clinical applications. Thus, the biosensing platform presents a promising approach for attaining highly sensitive, selective, multiplexed, and quantitative point-of-care diagnosis and distinction between COVID-19 and influenza.

     