Title: From mixed effects modeling to spike and slab variable selection: A Bayesian regression model for group testing data
Abstract

Due to reductions in both time and cost, group testing is a popular alternative to individual‐level testing for disease screening. These reductions are obtained by testing pooled biospecimens (eg, blood, urine, swabs, etc.) for the presence of an infectious agent. However, these reductions come at the expense of data complexity, making the task of conducting disease surveillance more tenuous when compared to using individual‐level data. This is because an individual's disease status may be obscured by a group testing protocol and the effect of imperfect testing. Furthermore, unlike individual‐level testing, a given participant could be involved in multiple testing outcomes and/or may never be tested individually. To circumvent these complexities and to incorporate all available information, we propose a Bayesian generalized linear mixed model that accommodates data arising from any group testing protocol, estimates unknown assay accuracy probabilities and accounts for potential heterogeneity in the covariate effects across population subgroups (eg, clinic sites, etc.); this latter feature is of key interest to practitioners tasked with conducting disease surveillance. To achieve model selection, our proposal uses spike and slab priors for both fixed and random effects. The methodology is illustrated through numerical studies and is applied to chlamydia surveillance data collected in Iowa.
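To make the data structure concrete, here is a minimal sketch that simulates master-pool group testing outcomes under a logistic model with a site-level random intercept and an imperfect assay. Everything in it is an assumption for illustration: the pooling design, the parameter names (pool_size, sensitivity, specificity, sigma_site), and their values are not taken from the paper, and the paper's model further places spike and slab priors on the fixed and random effects rather than fixing them as done here.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_group_testing(n=5000, pool_size=5, n_sites=10,
                           beta=(-3.0, 1.0), sigma_site=0.5,
                           sensitivity=0.95, specificity=0.98):
    """Simulate master-pool test results under a logistic model with a
    site-level random intercept and an imperfect assay (illustrative only)."""
    x = rng.normal(size=n)                          # one covariate
    site = rng.integers(n_sites, size=n)            # site membership
    u = rng.normal(0.0, sigma_site, size=n_sites)   # site random intercepts
    eta = beta[0] + beta[1] * x + u[site]
    y = rng.random(n) < 1.0 / (1.0 + np.exp(-eta))  # latent true statuses

    pools = np.arange(n) // pool_size               # assign individuals to pools
    z = np.zeros(pools.max() + 1, dtype=bool)
    for p in range(z.size):
        truly_positive = y[pools == p].any()        # is the pool truly positive?
        prob = sensitivity if truly_positive else 1.0 - specificity
        z[p] = rng.random() < prob                  # observed, error-prone result
    return x, site, pools, z

x, site, pools, z = simulate_group_testing()
print(f"{z.mean():.2%} of pools tested positive")
```

Note that the individual statuses y are never returned: as in the abstract, the analyst only observes covariates, pool memberships, and error-prone pool-level results.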

 
Award ID(s):
1826715 1633608
NSF-PAR ID:
10457066
Author(s) / Creator(s):
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrics
Volume:
76
Issue:
3
ISSN:
0006-341X
Format(s):
Medium: X
Size(s):
p. 913-923
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary In screening applications involving low-prevalence diseases, pooling specimens (e.g., urine, blood, swabs, etc.) through group testing can be far more cost effective than testing specimens individually. Estimation is a common goal in such applications and typically involves modeling the probability of disease as a function of available covariates. In recent years, several authors have developed regression methods to accommodate the complex structure of group testing data but often under the assumption that covariate effects are linear. Although linearity is a reasonable assumption in some applications, it can lead to model misspecification and biased inference in others. To offer a more flexible framework, we propose a Bayesian generalized additive regression approach to model the individual-level probability of disease with potentially misclassified group testing data. Our approach can be used to analyze data arising from any group testing protocol with the goal of estimating multiple unknown smooth functions of covariates, standard linear effects for other covariates, and assay classification accuracy probabilities. We illustrate the methods in this article using group testing data on chlamydia infection in Iowa. 
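The central modeling idea in record 1 above, replacing a linear covariate effect with an unknown smooth function inside a logistic risk model, can be sketched with a simple basis expansion. The truncated power basis, knot placement, and coefficient values below are hypothetical and serve only to illustrate the generalized additive idea, not the authors' Bayesian implementation or their handling of group testing misclassification.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Cubic truncated power basis: x, x^2, x^3, and (x - k)^3_+ per knot."""
    cols = [x ** d for d in range(1, degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

# Hypothetical smooth effect f(age) inside a logistic disease-risk model.
age = np.linspace(15, 45, 200)
B = truncated_power_basis(age, knots=np.quantile(age, [0.25, 0.5, 0.75]))
coef = np.array([0.08, -0.002, 1e-5, -3e-5, 2e-5, 1e-5])  # made-up coefficients
eta = -2.5 + B @ coef                                      # linear predictor
risk = 1.0 / (1.0 + np.exp(-eta))                          # P(disease | age)
```

In the paper, the basis coefficients would be estimated from the misclassified pool-level responses within a Bayesian hierarchy, with smoothness controlled by a prior rather than chosen by hand.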
  2. When screening for infectious diseases, group testing has proven to be a cost-efficient alternative to individual-level testing. Cost savings are realized by testing pools of individual specimens (eg, blood, urine, saliva, and so on) rather than by testing the specimens separately. However, a common concern that arises in group testing is the so‐called “dilution effect.” This occurs if the signal from a positive individual's specimen is diluted past an assay's threshold of detection when it is pooled with multiple negative specimens. In this article, we propose a new statistical framework for group testing data that merges estimation and case identification, which are often treated separately in the literature. Our approach analyzes continuous biomarker levels (eg, antibody levels, antigen concentrations, and so on) from pooled samples to estimate both a binary regression model for the probability of disease and the biomarker distributions for cases and controls. To increase case identification accuracy, we then show how estimates of the biomarker distributions can be used to select diagnostic thresholds on a pool‐by‐pool basis. Our proposals are evaluated through numerical studies and are illustrated using hepatitis B virus data collected on a prison population in Ireland.

     
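As a rough illustration of the pool-by-pool threshold idea in record 2 above, suppose the biomarker distributions for cases and controls were known and the pooled measurement is roughly the average of the contributing specimens. A threshold can then be tuned to the pool size, because a single positive diluted by many negatives is the hardest configuration to detect. All distributions and parameter values below are made up, and this Monte Carlo sketch is not the authors' joint estimation and classification procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def pool_threshold(pool_size, mu_neg=1.0, sd_neg=0.3, mu_pos=4.0, sd_pos=0.8,
                   n_sim=50_000):
    """Pick a pool-specific threshold for the averaged biomarker by balancing
    false positives (all-negative pools) against false negatives in the
    hardest case: one positive diluted by (pool_size - 1) negatives."""
    neg = rng.normal(mu_neg, sd_neg, (n_sim, pool_size)).mean(axis=1)
    one_pos = np.column_stack([
        rng.normal(mu_pos, sd_pos, n_sim),
        rng.normal(mu_neg, sd_neg, (n_sim, pool_size - 1)),
    ]).mean(axis=1)
    grid = np.linspace(neg.min(), one_pos.max(), 500)
    errors = np.array([(neg > t).mean() + (one_pos <= t).mean() for t in grid])
    return grid[errors.argmin()]

for c in (2, 5, 10):
    print(f"pool size {c}: threshold ~ {pool_threshold(c):.2f}")
```

The selected threshold shrinks toward the negative-specimen mean as the pool size grows, which is exactly the dilution effect the abstract describes.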
  3.
    Summary Group testing involves pooling individual specimens (e.g., blood, urine, swabs, etc.) and testing the pools for the presence of disease. When the proportion of diseased individuals is small, group testing can greatly reduce the number of tests needed to screen a population. Statistical research in group testing has traditionally focused on applications for a single disease. However, blood service organizations and large-scale disease surveillance programs are increasingly moving towards the use of multiplex assays, which measure multiple disease biomarkers at once. Tebbs and others (2013, Two-stage hierarchical group testing for multiple infections with application to the Infertility Prevention Project. Biometrics 69, 1064–1073) and Hou and others (2017, Hierarchical group testing for multiple infections. Biometrics 73, 656–665) were the first to examine hierarchical group testing case identification procedures for multiple diseases. In this article, we propose new non-hierarchical procedures which utilize two-dimensional arrays. We derive closed-form expressions for the expected number of tests per individual and classification accuracy probabilities and show that array testing can be more efficient than hierarchical procedures when screening individuals for multiple diseases at once. We illustrate the potential of using array testing in the detection of chlamydia and gonorrhea for a statewide screening program in Iowa. Finally, we describe an R/Shiny application that will help practitioners identify the best multiple-disease case identification algorithm.
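To see why array testing can beat hierarchical testing (record 3 above), the following sketch compares expected tests per individual for classical two-stage Dorfman testing and an n-by-n array. It assumes a single disease and a perfect assay, so it is only a simplified stand-in for the paper's closed-form expressions, which cover multiple diseases and misclassification.

```python
def dorfman_tests_per_individual(p, n):
    """Two-stage Dorfman: one master pool of size n, then individual retests
    of every member of a positive pool (single disease, perfect assay)."""
    return 1.0 / n + 1.0 - (1.0 - p) ** n

def array_tests_per_individual(p, n):
    """n x n array without a master pool: test every row and every column,
    then retest individuals whose row and column both test positive."""
    q = 1.0 - p
    p_retest = 1.0 - 2.0 * q ** n + q ** (2 * n - 1)  # P(row+ and column+)
    return 2.0 / n + p_retest

p = 0.05  # assumed prevalence
for n in (5, 8, 10, 12):
    print(f"n={n:2d}  Dorfman: {dorfman_tests_per_individual(p, n):.3f}  "
          f"Array: {array_tests_per_individual(p, n):.3f}")
```

At this prevalence, the array design needs noticeably fewer tests per individual for the larger dimensions, mirroring the efficiency comparison described in the abstract.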
  4. Summary

    A Bayesian framework for group testing under dilution effects has been developed, using lattice-based models. This work has particular relevance given the pressing public health need to enhance testing capacity for coronavirus disease 2019 and future pandemics, and the need for wide-scale and repeated testing for surveillance under constantly varying conditions. The proposed Bayesian approach allows for dilution effects in group testing and for general test response distributions beyond just binary outcomes. It is shown that even under strong dilution effects, an intuitive group testing selection rule that relies on the model order structure, referred to as the Bayesian halving algorithm, has attractive optimal convergence properties. Analogous look-ahead rules that can reduce the number of stages in classification by selecting several pooled tests at a time are proposed and evaluated as well. Group testing is demonstrated to provide great savings over individual testing in the number of tests needed, even for moderately high prevalence levels. However, there is a trade-off: classification requires more testing stages and exhibits increased variability. A web-based calculator is introduced to assist in weighing these factors and to guide decisions on when and how to pool under various conditions. High-performance distributed computing methods have also been implemented for considering larger pool sizes, for which the savings from group testing can be even more dramatic.

     
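The flavor of the halving idea in record 4 above can be conveyed with the classical binary-splitting scheme under a perfect assay: test a pool, and if it is positive and contains more than one specimen, split it in half and recurse on each half. The simulation below is only this simplified version; the paper's Bayesian halving algorithm additionally accommodates dilution effects, general test response distributions, and look-ahead selection of several pools at a time.

```python
import numpy as np

rng = np.random.default_rng(2)

def halving_tests(statuses):
    """Tests used by simple binary splitting: test the pool; if positive and
    larger than one specimen, split in half and recurse on both halves."""
    tests = 1
    if statuses.any() and statuses.size > 1:
        mid = statuses.size // 2
        tests += halving_tests(statuses[:mid]) + halving_tests(statuses[mid:])
    return tests

def expected_tests_per_individual(prevalence, pool_size, n_sim=20_000):
    sims = rng.random((n_sim, pool_size)) < prevalence  # simulated true statuses
    return np.mean([halving_tests(row) for row in sims]) / pool_size

for p in (0.01, 0.05, 0.10):
    print(f"prevalence {p:.0%}: "
          f"{expected_tests_per_individual(p, pool_size=16):.3f} tests per individual")
```

Even this crude version shows large savings over one test per individual at low prevalence, alongside the extra testing stages and variability that the abstract flags as the trade-off.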
  5. Future tactical communications involve high-data-rate, best-effort traffic working alongside real-time traffic for time-critical applications with hard deadlines. Unavailable bandwidth and/or untimely responses may lead to undesired or even catastrophic outcomes. Ethernet-based communication systems are one of the major tactical network standards due to their higher bandwidth, better utilization, and ability to handle heterogeneous traffic. However, Ethernet suffers from inconsistent performance in jitter, latency, and bandwidth under heavy loads. The emerging Time-Triggered Ethernet (TTE) solutions promise deterministic Ethernet performance, fault-tolerant topologies, and real-time guarantees for critical traffic. In this paper, we study the TTE protocol and build a TTTech TTE test bed to evaluate its performance. Through experimental study, the TTE protocol was observed to provide consistently high data rates for best-effort messages, determinism with very low jitter for time-triggered messages, and fault tolerance with minimal packet loss using redundant network topologies. In addition, we observed challenges presenting a trade-off between the integration cycle and the synchronization overhead. It is concluded that TTE is a capable solution to support heterogeneous traffic in time-critical applications, such as aerospace systems (e.g., airplanes, spacecraft, etc.), ground-based vehicles (e.g., trains, buses, cars, etc.), and cyber-physical systems (e.g., smart grids, IoT, etc.).
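The integration-cycle versus synchronization-overhead trade-off noted in record 5 above can be illustrated with a back-of-the-envelope calculation: shortening the integration cycle reduces the worst-case wait for a time-triggered slot, but a fixed amount of synchronization traffic then consumes a larger share of each cycle. All parameter names and values below (sync_frames_per_cycle, sync_frame_us, the cycle lengths) are placeholders, not measurements from the TTTech test bed.

```python
def tradeoff(integration_cycle_us, sync_frames_per_cycle=2, sync_frame_us=10.0):
    """Illustrative trade-off for Time-Triggered Ethernet: shorter integration
    cycles lower the worst-case wait for a time-triggered slot but raise the
    fraction of each cycle spent on synchronization frames (placeholder values)."""
    overhead = sync_frames_per_cycle * sync_frame_us / integration_cycle_us
    worst_case_wait_us = integration_cycle_us  # a TT message may wait up to one cycle
    return overhead, worst_case_wait_us

for cycle_us in (1_000, 5_000, 10_000):
    overhead, wait = tradeoff(cycle_us)
    print(f"cycle {cycle_us:>6} us: sync overhead {overhead:.2%}, "
          f"worst-case wait {wait} us")
```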