Title: Use of compressed sensing to expedite high-throughput diagnostic testing for COVID-19 and beyond
The rapid spread of SARS-CoV-2 has placed a significant burden on public health systems to provide swift and accurate diagnostic testing, highlighting the critical need for innovative testing approaches for future pandemics. In this study, we present a novel sample pooling procedure based on compressed sensing theory to accurately identify virally infected patients at high prevalence rates, utilizing an innovative viral RNA extraction process to minimize sample dilution. At prevalence rates ranging from 0 to 14.3%, the number of tests required to identify the infection status of all patients was reduced by 69.26% as compared to conventional testing in primary human SARS-CoV-2 nasopharyngeal swabs and a coronavirus model system. Our method provided quantification of individual sample viral load within a pool as well as a binary positive-negative result. Additionally, our modified pooling and RNA extraction process minimized sample dilution, which remained constant as pool sizes increased. Compressed sensing can be adapted to a wide variety of diagnostic testing applications to increase throughput for routine laboratory testing as well as a means to increase testing capacity to combat future pandemics.
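The idea of recovering both infection status and viral load from fewer pooled tests than samples can be illustrated with a toy decoder. The sketch below is not the pooling design or decoding algorithm of the study; it assumes noiseless measurements, pools whose readout is the plain sum of the viral loads they contain, and at most one positive among seven samples. The function name `decode_pools`, the 3 × 7 pooling matrix, and the load value are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def decode_pools(A, y, max_positives=1, tol=1e-8):
    """Find the sparsest nonnegative x with A @ x == y by brute-force support search.

    A: (pools x samples) 0/1 pooling matrix; y: measured load per pool.
    Returns the recovered per-sample loads, or None if no consistent x exists.
    """
    n = A.shape[1]
    for k in range(max_positives + 1):
        for support in itertools.combinations(range(n), k):
            idx = list(support)
            x = np.zeros(n)
            if idx:
                # Least-squares fit of the loads on the candidate positives.
                coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
                x[idx] = coef
            # Accept only nonnegative loads that reproduce every pool exactly.
            if np.all(x >= -tol) and np.allclose(A @ x, y, atol=tol):
                return x
    return None

# Hypothetical design: sample j (1..7) goes into pool b if bit b of j is set,
# so every sample has a distinct, nonzero pool-membership pattern.
A = np.array([[(j >> b) & 1 for j in range(1, 8)] for b in range(3)], dtype=float)

x_true = np.zeros(7)
x_true[4] = 1200.0          # one positive sample with an assumed viral load
y = A @ x_true              # idealized, noise-free pool measurements

decoded = decode_pools(A, y)
positive_samples = np.flatnonzero(decoded > 0)
```

In this toy setting, three pooled tests stand in for seven individual ones, and the decoder returns a load estimate for the positive sample rather than only a yes/no call. Real measurements are noisy and pools may contain several positives, which is where sparse-recovery methods from compressed sensing replace the brute-force search used here.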
Award ID(s): 2133205
PAR ID: 10464957
Editor(s): Faeder, James R.
Journal Name: PLOS Computational Biology
Volume: 18
Issue: 10
ISSN: 1553-7358
Page Range / eLocation ID: e1010629
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Dai, Tianhong; Wu, Mei X.; Popp, Jürgen (Ed.)
    The SARS-CoV-2 pandemic has revealed the need for rapid and inexpensive diagnostic testing to enable population-based screening for active infection. Neither standard diagnostic testing, the detection and measurement of viral RNA (via polymerase chain reaction), nor serological testing (via enzyme-linked immunosorbent assay) has the capability to definitively determine active infection: the former cannot distinguish between replicable and inert viral RNA, and the latter is confounded by varying immune responses (ranging from latent to a complete lack of immune response altogether). Despite many companies producing rapid point-of-care (POC) tests, none will address the global scale of testing needed and few help to combat the ever-growing issue of testing resource scarcity. Here we discuss our efforts towards the development of a highly manufacturable, microfluidic device that instantly indicates active viral infection status from ~20 μL of nasal mucus or phlegm and requires no external power. The device features a biotin-functionalized silicon nanomembrane within an acrylic body containing channels and ports for sample introduction and analysis. Virus capture and target confirmation are done using affinity-based capture and size-based occlusion, respectively. Modularity of the device is proven with bead and vaccinia virus capture as we work towards testing with both pure SARS-CoV-2 virus and human samples. With success on all fronts, we could achieve an inexpensive POC diagnostic which can determine an individual’s infection status, aiding containment efforts in the current and future pandemics. In addition to direct viral detection, our method can be used as a rapid POC sample preparation tool that limits the application of PCR reagents to those samples which already display viral size and antigen-based positivity through our device.
  2. Background: Accurate diagnostic strategies to identify SARS-CoV-2 positive individuals rapidly for management of patient care and protection of health care personnel are urgently needed. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available with a turn-around time (TAT) usually within 1-2 hours. Method: We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual’s SARS-CoV-2 infection status. Laboratory testing results obtained within 2 days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosting decision tree (GBDT) model from 3,356 SARS-CoV-2 RT-PCR tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital. Results: The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), validating the generalization of its use. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within 2 days. Conclusion: This model employing routine laboratory test results offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.
  3. Abstract The emergence of SARS-CoV-2 highlights a need for evidence-based strategies to monitor bat viruses. We performed a systematic review of coronavirus sampling (testing for RNA positivity) in bats globally. We identified 110 studies published between 2005 and 2020 that collectively reported positivity from 89,752 bat samples. We compiled 2,274 records of infection prevalence at the finest methodological, spatiotemporal and phylogenetic level of detail possible from public records into an open, static database named datacov, together with metadata on sampling and diagnostic methods. We found substantial heterogeneity in viral prevalence across studies, reflecting spatiotemporal variation in viral dynamics and methodological differences. Meta-analysis identified sample type and sampling design as the best predictors of prevalence, with virus detection maximized in rectal and faecal samples and by repeat sampling of the same site. Fewer than one in five studies collected and reported longitudinal data, and euthanasia did not improve virus detection. We show that bat sampling before the SARS-CoV-2 pandemic was concentrated in China, with research gaps in South Asia, the Americas and sub-Saharan Africa, and in subfamilies of phyllostomid bats. We propose that surveillance strategies should address these gaps to improve global health security and enable the origins of zoonotic coronaviruses to be identified. 
  4. Pe'er, I. (Ed.)
    Combinatorial group testing and compressed sensing both focus on recovering a sparse vector of dimensionality n from a much smaller number m < n of measurements. In the first approach, the problem is defined over the Boolean field: the goal is to recover a Boolean vector, and measurements are Boolean. In the second approach, the unknown vector and the measurements are over the reals. Here, we focus on the real-valued group testing setting, which more closely fits modern testing protocols relying on quantitative measurements, such as qPCR, where the goal is recovery of a sparse, Boolean vector and the pooling matrix needs to be Boolean and sparse, but the unknown input signal vector and the measurement outcomes are nonnegative reals, and the matrix algebra implied in the test protocol is over the reals. With the recent renewed interest in group testing, focus has been on quantitative measurements resulting from qPCR, but the methods proposed for sample pooling were based on matrices designed with Boolean measurements in mind. Here, we investigate constructing pooling matrices dedicated to real-valued group testing. We provide conditions for pooling matrices to guarantee unambiguous decoding of positives in this setting. We also show a deterministic algorithm for constructing matrices meeting the proposed condition, for small matrix sizes that can be implemented using a laboratory robot. Using simulated data, we show that the proposed approach leads to matrices that can be applied for higher positivity rates than combinatorial group testing matrices considered for viral testing previously. We also validate the approach through wet lab experiments involving SARS-CoV-2 nasopharyngeal swab samples.
  5. Background: recent applications of wastewater-based epidemiology (WBE) have demonstrated its ability to track the spread and dynamics of COVID-19 at the community level. Despite the growing body of research, quantitative synthesis of SARS-CoV-2 RNA levels in wastewater generated from studies across space and time using diverse methods has not been performed. Objective: the objective of this study is to examine the correlations between SARS-CoV-2 RNA levels in wastewater and epidemiological indicators across studies, stratified by key covariates in study methodologies. In addition, we examined the association of proportions of positive detections in wastewater samples and methodological covariates. Methods: we systematically searched the Web of Science for studies published by February 16th, 2021, performed a reproducible screening, and employed mixed-effects models to estimate the levels of SARS-CoV-2 viral RNA quantities in wastewater samples and their correlations to the case prevalence, the sampling mode (grab or composite sampling), and the wastewater fraction analyzed (i.e., solids, solid–supernatant mixtures, or supernatants/filtrates). Results: a hundred and one studies were found; twenty studies (671 biosamples and 1751 observations) were retained following a reproducible screening. The mean positivity across all studies was 0.68 (95%-CI, [0.52; 0.85]). The mean viral RNA abundance was 5244 marker copies per mL (95%-CI, [0; 16 432]). The Pearson correlation coefficients between the viral RNA levels and case prevalence were 0.28 (95%-CI, [0.01; 0.51]) for daily new cases or 0.29 (95%-CI, [−0.15; 0.73]) for cumulative cases. The fraction analyzed accounted for 12.4% of the variability in the percentage of positive detections, followed by the case prevalence (9.3% by daily new cases and 5.9% by cumulative cases) and sampling mode (0.6%). Among observations with positive detections, the fraction analyzed accounted for 56.0% of the variability in viral RNA levels, followed by the sampling mode (6.9%) and case prevalence (0.9% by daily new cases and 0.8% by cumulative cases). While the sampling mode and fraction analyzed both significantly correlated with the SARS-CoV-2 viral RNA levels, the magnitude of the increase in positive detection associated with the fraction analyzed was larger. The mixed-effects model treating studies as random effects and case prevalence as fixed effects accounted for over 90% of the variability in SARS-CoV-2 positive detections and viral RNA levels. Interpretations: positive pooled means and confidence intervals in the Pearson correlation coefficients between the SARS-CoV-2 viral RNA levels and case prevalence indicators provide quantitative evidence that reinforces the value of wastewater-based monitoring of COVID-19. Large heterogeneities among studies in proportions of positive detections, viral RNA levels, and Pearson correlation coefficients suggest a strong demand for methods to generate data accounting for cross-study heterogeneities and more detailed metadata reporting. Large variance was explained by the fraction analyzed, suggesting sample pre-processing and fractionation as a direction that needs to be prioritized in method standardization. Mixed-effects models accounting for study level variations provide a new perspective to synthesize data from multiple studies.