A Bayesian framework for group testing under dilution effects has been developed, using lattice-based models. This work has particular relevance given the pressing public health need to enhance testing capacity for coronavirus disease 2019 and future pandemics, and the need for wide-scale, repeated testing for surveillance under constantly varying conditions. The proposed Bayesian approach allows for dilution effects in group testing and for general test response distributions beyond binary outcomes. It is shown that even under strong dilution effects, an intuitive group testing selection rule that relies on the model order structure, referred to as the Bayesian halving algorithm, has attractive optimal convergence properties. Analogous look-ahead rules that can reduce the number of stages in classification by selecting several pooled tests at a time are proposed and evaluated as well. Group testing is demonstrated to provide great savings over individual testing in the number of tests needed, even for moderately high prevalence levels. However, there is a trade-off: group testing requires a larger number of testing stages and exhibits increased variability. A web-based calculator is introduced to assist in weighing these factors and to guide decisions on when and how to pool under various conditions. High-performance distributed computing methods have also been implemented to accommodate larger pool sizes, for which the savings from group testing can be even more dramatic.
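The flavor of a halving-style selection rule can be conveyed with a deliberately simplified sketch. The code below ignores dilution effects and the Bayesian posterior machinery entirely and assumes a perfect pooled assay, so it is classic recursive binary splitting rather than the paper's Bayesian halving algorithm; the `is_positive` oracle stands in for a lab test:

```python
def halving_classify(units, is_positive):
    """Classify every unit via recursive halving (binary splitting).

    Simplified illustration: assumes a perfect, dilution-free pooled test.
    `is_positive` maps a unit to its true status (stand-in for an assay).
    Returns (status_dict, number_of_tests_used).
    """
    n_tests = 1  # one pooled test on the whole group
    if not any(is_positive(u) for u in units):
        # A negative pool clears every member in a single test.
        return {u: False for u in units}, n_tests
    if len(units) == 1:
        return {units[0]: True}, n_tests
    # Positive pool: split in half and recurse on each part.
    mid = len(units) // 2
    left, t_left = halving_classify(units[:mid], is_positive)
    right, t_right = halving_classify(units[mid:], is_positive)
    return {**left, **right}, n_tests + t_left + t_right
```

With one positive among eight specimens, this resolves everyone in 7 pooled/individual tests rather than 8 individual ones; at lower prevalence and larger pools the savings grow, at the cost of more sequential stages, which is exactly the trade-off the abstract highlights.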
When screening for infectious diseases, group testing has proven to be a cost-efficient alternative to individual-level testing. Cost savings are realized by testing pools of individual specimens (e.g., blood, urine, saliva, and so on) rather than by testing the specimens separately. However, a common concern that arises in group testing is the so-called "dilution effect." This occurs if the signal from a positive individual's specimen is diluted past an assay's threshold of detection when it is pooled with multiple negative specimens. In this article, we propose a new statistical framework for group testing data that merges estimation and case identification, which are often treated separately in the literature. Our approach considers analyzing continuous biomarker levels (e.g., antibody levels, antigen concentrations, and so on) from pooled samples to estimate both a binary regression model for the probability of disease and the biomarker distributions for cases and controls. To increase case identification accuracy, we then show how estimates of the biomarker distributions can be used to select diagnostic thresholds on a pool-by-pool basis. Our proposals are evaluated through numerical studies and are illustrated using hepatitis B virus data collected on a prison population in Ireland.
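The pool-by-pool thresholding idea can be illustrated with a toy working model. Everything below is an assumption made for illustration, not the article's estimator: individual biomarkers are taken to be Gaussian (controls ~ N(mu0, sd0), cases ~ N(mu1, sd1)), the pooled level is the equal-weight average of the contributions, and the threshold is found by a grid search for the pooled level at which the posterior odds of the pool containing at least one case reach 1:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def pool_threshold(n, p, mu0, sd0, mu1, sd1):
    """Pick a diagnostic threshold for a pool of size n.

    Hypothetical Gaussian working model (not the article's method):
    the pooled biomarker is the average of n individual levels, and we
    compare the 'all controls' pool against an 'exactly one case' pool.
    p is the individual-level disease probability.
    """
    # Pooled-mean distributions under "all controls" vs "exactly one case".
    m_neg, s_neg = mu0, sd0 / math.sqrt(n)
    m_pos = (mu1 + (n - 1) * mu0) / n
    s_pos = math.sqrt(sd1 ** 2 + (n - 1) * sd0 ** 2) / n
    prior_pos = 1 - (1 - p) ** n  # P(pool holds at least one case)
    # Grid search between the two pooled means for the first crossing.
    for k in range(1, 1000):
        t = m_neg + k * 0.001 * (m_pos - m_neg)
        post_odds = (prior_pos * normal_pdf(t, m_pos, s_pos)) / \
                    ((1 - prior_pos) * normal_pdf(t, m_neg, s_neg))
        if post_odds >= 1:
            return t
    return m_pos
```

Note how the dilution effect appears directly: as the pool size `n` grows, the positive-pool mean `m_pos` slides toward the control mean, so the selected threshold drops, which is the pool-specific adjustment a fixed assay cutoff cannot make.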
- NSF-PAR ID: 10370007
- Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
- Journal Name: Statistics in Medicine
- Volume: 40
- Issue: 11
- ISSN: 0277-6715
- Page Range / eLocation ID: p. 2540-2555
- Sponsoring Org: National Science Foundation

More Like this
Due to reductions in both time and cost, group testing is a popular alternative to individual-level testing for disease screening. These reductions are obtained by testing pooled biospecimens (e.g., blood, urine, swabs, etc.) for the presence of an infectious agent. However, these reductions come at the expense of data complexity, making the task of conducting disease surveillance more challenging than with individual-level data. This is because an individual's disease status may be obscured by a group testing protocol and the effect of imperfect testing. Furthermore, unlike individual-level testing, a given participant could be involved in multiple testing outcomes and/or may never be tested individually. To circumvent these complexities and to incorporate all available information, we propose a Bayesian generalized linear mixed model that accommodates data arising from any group testing protocol, estimates unknown assay accuracy probabilities, and accounts for potential heterogeneity in the covariate effects across population subgroups (e.g., clinic sites); this latter feature is of key interest to practitioners tasked with conducting disease surveillance. To achieve model selection, our proposal uses spike and slab priors for both fixed and random effects. The methodology is illustrated through numerical studies and is applied to chlamydia surveillance data collected in Iowa.
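A key ingredient in models like this is the probability that a pool tests positive under an imperfect assay. The identity below is standard for independent specimens without dilution (the function name and parameterization are mine, chosen for illustration):

```python
def pool_positive_prob(n, p, se, sp):
    """Probability that a pool of n independent specimens tests positive.

    Standard identity for an imperfect assay without dilution:
    the pool is truly positive unless all n members are negative,
    and the assay reports the truth with sensitivity `se` on positive
    pools and specificity `sp` on negative pools.
    """
    p_truly_neg = (1 - p) ** n
    return se * (1 - p_truly_neg) + (1 - sp) * p_truly_neg
```

This is also why an individual's status can be "obscured," as the abstract puts it: a negative pool result carries only probabilistic information about each member once `se < 1` and `sp < 1`.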
-
Group testing involves pooling individual specimens (e.g., blood, urine, swabs, etc.) and testing the pools for the presence of disease. When the proportion of diseased individuals is small, group testing can greatly reduce the number of tests needed to screen a population. Statistical research in group testing has traditionally focused on applications for a single disease. However, blood service organizations and large-scale disease surveillance programs are increasingly moving towards the use of multiplex assays, which measure multiple disease biomarkers at once. Tebbs and others (2013, Two-stage hierarchical group testing for multiple infections with application to the Infertility Prevention Project. Biometrics 69, 1064–1073) and Hou and others (2017, Hierarchical group testing for multiple infections. Biometrics 73, 656–665) were the first to examine hierarchical group testing case identification procedures for multiple diseases. In this article, we propose new non-hierarchical procedures which utilize two-dimensional arrays. We derive closed-form expressions for the expected number of tests per individual and classification accuracy probabilities and show that array testing can be more efficient than hierarchical procedures when screening individuals for multiple diseases at once. We illustrate the potential of using array testing in the detection of chlamydia and gonorrhea for a statewide screening program in Iowa. Finally, we describe an R/Shiny application that will help practitioners identify the best multiple-disease case identification algorithm.
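The mechanics of two-dimensional array testing can be conveyed by a short Monte Carlo sketch. This is a simplified single-disease version with a perfect assay, not the article's multiplex closed-form analysis: specimens sit in an n-by-n grid, all row pools and column pools are tested first, and only specimens at the intersection of a positive row and a positive column are tested individually:

```python
import random

def array_tests_per_individual(n, p, reps=2000, seed=1):
    """Monte Carlo estimate of tests per individual for n x n array testing.

    Simplified single-disease sketch with a perfect assay: 2n row/column
    pools are always tested; specimens at positive-row x positive-column
    intersections are then tested individually.  p is the prevalence.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        grid = [[rng.random() < p for _ in range(n)] for _ in range(n)]
        row_pos = [any(row) for row in grid]
        col_pos = [any(grid[i][j] for i in range(n)) for j in range(n)]
        tests = 2 * n  # every row pool and every column pool
        tests += sum(row_pos[i] and col_pos[j]
                     for i in range(n) for j in range(n))
        total += tests
    return total / (reps * n * n)
```

At zero prevalence the cost floor is exactly 2n pooled tests for n^2 people (0.2 tests per individual for a 10 x 10 array), and at low prevalence the average stays well below one test per individual, which is the efficiency margin the article quantifies in closed form.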
-
Newborn screening (NBS) is a state-level initiative that detects life-threatening genetic disorders for which early treatment can substantially improve health outcomes. Cystic fibrosis (CF) is among the most prevalent disorders in NBS. CF can be caused by a large number of mutation variants to the CFTR gene. Most states use a multitest CF screening process that includes a genetic test (DNA). However, due to cost concerns, DNA is used only on a small subset of newborns (based on a low-cost biomarker test with low classification accuracy), and only for a small subset of CF-causing variants. To overcome the cost barriers of expanded genetic testing, we explore a novel approach: multipanel pooled DNA testing. This approach leads not only to a novel optimization problem (variant selection for screening, variant partition into multipanels, and pool size determination for each panel), but also to novel CF NBS processes. We establish key structural properties of optimal multipanel pooled DNA designs; develop a methodology that generates a family of optimal designs at different costs; and characterize the conditions under which a 1-panel versus a multipanel design is optimal. This methodology can assist decision-makers to design a screening process, considering the cost versus accuracy trade-off. Our case study, based on published CF NBS data from the state of New York, indicates that the multipanel and pooling aspects of genetic testing work synergistically, and the proposed NBS processes have the potential to substantially improve both the efficiency and accuracy of current practices. This paper was accepted by Stefan Scholtes, healthcare management.
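The cost-versus-accuracy trade-off at the heart of the pool-size decision can be tabulated with a toy single-panel model. This sketch is not the paper's multipanel formulation: it assumes two-stage Dorfman-style retesting, uses the textbook expected-tests formula 1/n + (1 - (1-p)^n) with test errors affecting detection only, and scores detection as se^2 because a true case must survive both the pooled test and the follow-up individual test:

```python
def dorfman_tradeoff(p, se, max_pool=32):
    """Tabulate (pool size, expected tests per subject, detection prob).

    Illustrative toy model, not the paper's multipanel design problem:
    two-stage Dorfman retesting, prevalence p, test sensitivity se.
    Expected tests use the perfect-test formula 1/n + (1 - (1-p)^n);
    detection of a true case requires passing both stages (se**2).
    """
    rows = []
    for n in range(1, max_pool + 1):
        if n == 1:  # individual testing: one test, one chance to detect
            rows.append((1, 1.0, se))
        else:
            exp_tests = 1 / n + (1 - (1 - p) ** n)
            rows.append((n, exp_tests, se ** 2))
    return rows
```

For a rare disorder (p = 0.01) the expected-tests column bottoms out around pool sizes of 10-11 at roughly 0.2 tests per subject, while the detection column shows the accuracy price of pooling (se^2 < se), so the table makes the cost/accuracy frontier explicit.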
-
Pantea, Casian (Ed.)
Limited testing capacity for COVID-19 has hampered the pandemic response. Pooling is a testing method wherein samples from specimens (e.g., swabs) from multiple subjects are combined into a pool and screened with a single test. If the pool tests positive, then new samples from the collected specimens are individually tested, while if the pool tests negative, the subjects are classified as negative for the disease. Pooling can substantially expand COVID-19 testing capacity and throughput, without requiring additional resources. We develop a mathematical model to determine the best pool size for different risk groups, based on each group's estimated COVID-19 prevalence. Our approach takes into consideration the sensitivity and specificity of the test, and a dynamic and uncertain prevalence, and provides a robust pool size for each group. For practical relevance, we also develop a companion COVID-19 pooling design tool (through a spreadsheet). To demonstrate the potential value of pooling, we study COVID-19 screening using testing data from Iceland for the period February 28, 2020 to June 14, 2020, for subjects stratified into high- and low-risk groups. We implement the robust pooling strategy within a sequential framework, which updates pool sizes each week, for each risk group, based on the prior week's testing data. Robust pooling reduces the number of tests, over individual testing, by 88.5% to 90.2%, and 54.2% to 61.9%, respectively, for the low-risk and high-risk groups (based on test sensitivity values in the range [0.71, 0.98] as reported in the literature). This results in much shorter times, on average, to get the test results compared to individual testing (due to the higher testing throughput), and also allows for expanded screening to cover more individuals. Thus, robust pooling can potentially be a valuable strategy for COVID-19 screening.
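The idea of a pool size that is robust to an uncertain prevalence can be sketched as a minimax choice over a prevalence interval. This is a simplified stand-in for the paper's model: it assumes a perfect test and two-stage Dorfman retesting, where the expected tests per subject at prevalence p and pool size n is 1/n + 1 - (1-p)^n; since that quantity is increasing in p, the worst case over the interval occurs at its upper endpoint:

```python
def robust_pool_size(p_low, p_high, max_pool=32):
    """Pool size minimizing worst-case expected tests per subject
    when prevalence is only known to lie in [p_low, p_high].

    Illustrative minimax sketch (perfect test, two-stage Dorfman
    retesting), not the paper's model with imperfect sensitivity
    and specificity.
    """
    def per_subject(n, p):
        # n == 1 is plain individual testing: exactly one test each.
        return 1.0 if n == 1 else 1 / n + 1 - (1 - p) ** n

    return min(range(1, max_pool + 1),
               key=lambda n: max(per_subject(n, p_low),
                                 per_subject(n, p_high)))
```

At low prevalence the robust choice stays large (around 11 for an interval topping out at 1%), while at high prevalence it shrinks sharply (3 for prevalences of 20-30%), mirroring the paper's finding that low-risk groups admit far deeper pooling, and hence far larger savings, than high-risk groups.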