Title: Correcting prevalence estimation for biased sampling with testing errors
Sampling for prevalence estimation of infection is subject to bias from both oversampling of symptomatic individuals and error-prone tests. This results in naïve estimators of prevalence (i.e., the proportion of observed infected individuals in the sample) that can be very far from the true proportion of infected individuals. In this work, we present a method of prevalence estimation that reduces the bias due to both testing errors and oversampling of symptomatic individuals, eliminating it altogether in some scenarios. Moreover, this procedure considers stratified errors, in which tests have different error rate profiles for symptomatic and asymptomatic individuals. This results in easily implementable algorithms, for which code is provided, that produce better prevalence estimates than other methods (in terms of reducing and/or removing bias), as demonstrated by formal results, simulations, and COVID-19 data from the Israeli Ministry of Health.
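The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes the classical Rogan-Gladen correction applied separately to the symptomatic and asymptomatic strata (each with its own sensitivity and specificity), followed by reweighting with population shares of the two strata that are assumed to be known. The function names, dictionary keys, and numbers are hypothetical.

```python
def rogan_gladen(observed_rate, sensitivity, specificity):
    """Correct an observed positive rate for test errors (Rogan-Gladen)."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0.0:
        raise ValueError("need sensitivity + specificity > 1")
    corrected = (observed_rate + specificity - 1.0) / denom
    # Clip to [0, 1]: sampling noise can push the raw correction outside it.
    return min(1.0, max(0.0, corrected))


def stratified_prevalence(strata, population_shares):
    """Weight per-stratum corrected prevalences by population shares.

    strata: name -> dict with 'tested', 'positive', 'sensitivity',
            'specificity' (hypothetical layout).
    population_shares: name -> share of that stratum in the population
            (assumed known; must sum to 1).
    """
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    total = 0.0
    for name, s in strata.items():
        observed = s["positive"] / s["tested"]
        corrected = rogan_gladen(observed, s["sensitivity"], s["specificity"])
        total += population_shares[name] * corrected
    return total


if __name__ == "__main__":
    # Illustrative numbers only: symptomatic people are half the sample
    # but assumed to be 10% of the population.
    strata = {
        "symptomatic": {"tested": 100, "positive": 50,
                        "sensitivity": 0.90, "specificity": 0.95},
        "asymptomatic": {"tested": 100, "positive": 10,
                         "sensitivity": 0.80, "specificity": 0.98},
    }
    shares = {"symptomatic": 0.10, "asymptomatic": 0.90}
    naive = sum(s["positive"] for s in strata.values()) / sum(
        s["tested"] for s in strata.values())
    print("naive estimate:", naive)
    print("corrected estimate:", stratified_prevalence(strata, shares))
```

In this sketch the naive estimate pools all tested individuals, so oversampling the symptomatic stratum inflates it; the reweighted estimate relies entirely on the assumed population shares and error rates being correct.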
Award ID(s):
2210208
PAR ID:
10529739
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Wiley
Date Published:
Journal Name:
Statistics in medicine
ISSN:
0277-6715
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Many data sources, from tracking social behavior to election polling to testing studies for understanding disease spread, are subject to sampling bias whose implications are not yet fully understood. In this paper we study estimation of a given feature (such as disease, or behavior on social media platforms) from biased samples, treating non-respondent individuals as missing data. Prevalence of the feature among sampled individuals has an upward bias under the assumption of individuals’ willingness to be sampled. This can be viewed as a regression model with symptoms as covariates and the feature as outcome. It is assumed that the outcome is unknown at the time of sampling, and therefore the missingness mechanism depends only on the covariates. We show that, in spite of this, data is missing at random only when the sizes of symptom classes in the population are known; otherwise data is missing not at random. From an information-theoretic viewpoint, we show that sampling bias corresponds to external information due to individuals in the population knowing their covariates, and we quantify this external information by active information. The reduction in prevalence, when sampling bias is adjusted for, similarly translates into active information due to bias correction, with opposite sign to active information due to testing bias. We develop unified results that show that prevalence and active information estimates are asymptotically normal under all missing data mechanisms, when testing errors are absent and present respectively. The asymptotic behavior of the estimators is illustrated through simulations.
  2. Elshall, Ahmed; Ye, Ming (Ed.)
    Bayesian model evidence (BME) is a measure of the average fit of a model to observation data given all the parameter values that the model can assume. By accounting for the trade-off between goodness-of-fit and model complexity, BME is used for model selection and model averaging purposes. For strict Bayesian computation, the theoretically unbiased Monte Carlo based numerical estimators are preferred over semi-analytical solutions. This study examines five BME numerical estimators and asks how important accurate estimation of the BME is for penalizing model complexity. The limiting cases for numerical BME estimators are the prior sampling arithmetic mean estimator (AM) and the posterior sampling harmonic mean (HM) estimator, which are straightforward to implement, yet they result in underestimation and overestimation, respectively. We also consider the path sampling methods of thermodynamic integration (TI) and steppingstone sampling (SS) that sample multiple intermediate distributions that link the prior and the posterior. Although TI and SS are theoretically unbiased estimators, they could have a bias in practice arising from numerical implementation. For example, sampling errors of some intermediate distributions can introduce bias. We propose a variant of SS, namely the multiple one-steppingstone sampling (MOSS) that is less sensitive to sampling errors. We evaluate these five estimators using a groundwater transport model selection problem. SS and MOSS give the least biased BME estimation at an efficient computational cost. If the estimated BME has a bias that covaries with the true BME, this would not be a problem because we are interested in BME ratios and not their absolute values. On the contrary, the results show that BME estimation bias can be a function of model complexity. Thus, biased BME estimation results in inaccurate penalization of more complex models, which changes the model ranking. This was observed less with SS and MOSS than with the three other methods.
  3. Predation can alter diverse ecological processes, including host–parasite interactions. Selective predation, whereby predators preferentially feed on certain prey types, can affect prey density and selective pressures. Studies on selective predation in infected populations have primarily focused on predators preferentially feeding on infected prey. However, there is substantial evidence that some predators preferentially consume uninfected individuals. Such different strategies of prey selectivity likely modulate host–parasite interactions, changing the fitness payoffs both for hosts and their parasites. Here we investigated the effects of different types of selective predation on infection dynamics and host evolution. We used a host–parasite system in the laboratory (Daphnia dentifera infected with the horizontally transmitted fungus, Metschnikowia bicuspidata) to artificially manipulate selective predation by removing infected, uninfected, or randomly selected prey over approximately 8–9 overlapping generations. We collected weekly data on population demographics and host infection and measured susceptibility from a subset of the remaining hosts in each population at the end of the experiment. After 6 weeks of selective predation pressure, we found no differences in host abundance or infection prevalence across predation treatments. Counterintuitively, populations with selective predation on infected individuals had a higher abundance of infected individuals than populations where either uninfected or randomly selected individuals were removed. Additionally, populations with selective predation for uninfected individuals had a higher proportion of individuals infected after a standardized exposure to the parasite than individuals from the two other predation treatments. These results suggest that selective predation can alter the abundance of infected hosts and host evolution.
  4. Insect–pathogen dynamics can show seasonal and inter‐annual variations that covary with fluctuations in insect abundance and climate. Long‐term analyses are especially needed to track parasite dynamics in migratory insects, in part because their vast habitat ranges and high mobility might dampen local effects of density and climate on infection prevalence. Monarch butterflies Danaus plexippus are commonly infected with the protozoan Ophryocystis elektroscirrha (OE). Because this parasite lowers monarch survival and flight performance, and because migratory monarchs have experienced declines in recent decades, it is important to understand the patterns and drivers of infection. Here we compiled data on OE infection spanning 50 years, from wild monarchs sampled in the United States, Canada and Mexico during summer breeding, fall migrating and overwintering periods. We examined eastern versus western North American monarchs separately, to ask how abundance estimates, resource availability, climate and breeding season length impact infection trends. We further assessed the intensity of migratory culling, which occurs when infected individuals are removed from the population during migration. Average infection prevalence was four times higher in western compared to eastern subpopulations. In eastern North America, the proportion of infected monarchs increased threefold since the mid‐2000s. In the western region, the proportion of infected monarchs declined sharply from 2000 to 2015, and increased thereafter. For both eastern and western subpopulations, years with greater summer adult abundance predicted greater infection prevalence, indicating that transmission increases with host breeding density. Environmental variables (temperature and NDVI) were not associated with changes in the proportion of infected adults. We found evidence for migratory culling of infected butterflies, based on declines in parasitism during fall migration. We estimated that tens of millions fewer monarchs reach overwintering sites in Mexico as a result of OE, highlighting the need to consider the parasite as a potential threat to the monarch population. Increases in infection among eastern North American monarchs post‐2002 suggest that changes to the host’s ecology or environment have intensified parasite transmission. Further work is needed to examine the degree to which human practices, such as mass caterpillar rearing and the widespread planting of exotic milkweed, have contributed to this trend.
  5. In this paper, we use modified versions of the SIAR model for epidemics to propose two ways of understanding and quantifying the effect of non-compliance with non-pharmaceutical intervention measures on the spread of an infectious disease. The SIAR model distinguishes between symptomatic infected (I) and asymptomatic infected (A) populations. One modification, which is simpler, assumes that a known proportion of the population does not comply with government mandates such as quarantining and social distancing. In a more sophisticated approach, the modified model treats non-compliant behavior as a social contagion. We theoretically explore different scenarios such as the occurrence of multiple waves of infections. Local and asymptotic analyses for both models are also provided.