This content will become publicly available on August 12, 2026

Title: Advancing infection profiling under data uncertainty through contagion potential
During the COVID-19 pandemic, the prevalence of asymptomatic cases challenged the reliability of epidemiological statistics in policymaking. To address this, we introduced contagion potential (CP) as a continuous metric derived from sociodemographic and epidemiological data to quantify the infection risk posed by the asymptomatic within a region. However, CP estimation is hindered by incomplete or biased incidence data, where underreporting and testing constraints make direct estimation infeasible. To overcome this limitation, we employ a hypothesis-testing approach to infer CP from sampled data, allowing for robust estimation despite missing information. Even within the sample collected from spatial contact data, individuals possess partial knowledge of their neighborhoods, as their awareness is restricted to interactions captured by available tracking data. We introduce an adjustment factor that calibrates the sample CPs so that the sample is a reasonable estimate of the population CP. Further complicating estimation, biases in epidemiological and mobility data arise from heterogeneous reporting rates and sampling inconsistencies, which we address through inverse probability weighting to enhance reliability. Using a spatial model for infection spread through social mixing and an optimization framework based on the SIRS epidemic model, we analyze real infection datasets from Italy, Germany, and Austria. Our findings demonstrate that statistical methods can achieve high-confidence CP estimates while accounting for variations in sample size, confidence level, mobility models, and viral strains. By assessing the effects of bias, social mixing, and sampling frequency, we propose statistical corrections to improve CP prediction accuracy. Finally, we discuss how reliable CP estimates can inform outbreak mitigation strategies despite the inherent uncertainties in epidemiological data.
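As a rough illustration of the inverse probability weighting idea mentioned in the abstract, the sketch below reweights sampled observations by the inverse of their reporting probability, so that under-reported groups count for more. It is a generic self-normalized (Hajek-style) weighted mean, not the paper's actual estimator; the function name `ipw_mean`, the example values, and the assumption that reporting probabilities are known are all hypothetical.

```python
def ipw_mean(values, reporting_probs):
    """Self-normalized inverse-probability-weighted mean.

    values          -- observed quantity per sampled unit (hypothetical data)
    reporting_probs -- assumed-known probability that each unit was observed
    """
    if len(values) != len(reporting_probs):
        raise ValueError("values and reporting_probs must have equal length")
    if not values:
        raise ValueError("at least one observation is required")
    if any(p <= 0.0 or p > 1.0 for p in reporting_probs):
        raise ValueError("reporting probabilities must lie in (0, 1]")

    # Each unit is weighted by 1/p: rarely reported units stand in for
    # the many similar units that were never observed.
    weights = [1.0 / p for p in reporting_probs]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


# Hypothetical sample: high-value units are over-reported (p = 0.8),
# low-value units are under-reported (p = 0.2).
sample_values = [1.0, 1.0, 0.0, 0.0]
sample_probs = [0.8, 0.8, 0.2, 0.2]

naive = sum(sample_values) / len(sample_values)
corrected = ipw_mean(sample_values, sample_probs)
print(naive, corrected)
```

In this toy sample the unweighted mean overstates the quantity because the over-reported units dominate; the weighted mean pulls the estimate back toward the under-reported group. With equal reporting probabilities the two means coincide.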
Award ID(s):
2316003
PAR ID:
10645372
Author(s) / Creator(s):
; ;
Editor(s):
Arunachalam, Viswanathan
Publisher / Repository:
PLOS
Date Published:
Journal Name:
PLOS One
Volume:
20
Issue:
8
ISSN:
1932-6203
Page Range / eLocation ID:
e0329828
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider epidemiological modeling for the design of COVID-19 interventions in university populations, which have seen significant outbreaks during the pandemic. A central challenge is sensitivity of predictions to input parameters coupled with uncertainty about these parameters. Nearly 2 y into the pandemic, parameter uncertainty remains because of changes in vaccination efficacy, viral variants, and mask mandates, and because universities’ unique characteristics hinder translation from the general population: a high fraction of young people, who have higher rates of asymptomatic infection and social contact, as well as an enhanced ability to implement behavioral and testing interventions. We describe an epidemiological model that formed the basis for Cornell University’s decision to reopen for in-person instruction in fall 2020 and supported the design of an asymptomatic screening program instituted concurrently to prevent viral spread. We demonstrate how the structure of these decisions allowed risk to be minimized despite parameter uncertainty leading to an inability to make accurate point estimates and how this generalizes to other university settings. We find that once-per-week asymptomatic screening of vaccinated undergraduate students provides substantial value against the Delta variant, even if all students are vaccinated, and that more targeted testing of the most social vaccinated students provides further value. 
  2. Sampling for prevalence estimation of infection is subject to bias by both oversampling of symptomatic individuals and error-prone tests. This results in naïve estimators of prevalence (i.e., proportion of observed infected individuals in the sample) that can be very far from the true proportion of infected. In this work, we present a method of prevalence estimation that reduces both the effect of bias due to testing errors and oversampling of symptomatic individuals, eliminating it altogether in some scenarios. Moreover, this procedure considers stratified errors in which tests have different error rate profiles for symptomatic and asymptomatic individuals. This results in easily implementable algorithms, for which code is provided, that produce better prevalence estimates than other methods (in terms of reducing and/or removing bias), as demonstrated by formal results, simulations, and on COVID-19 data from the Israeli Ministry of Health. 
  3. Background: Estimating the infection fatality rate (IFR) for emerging diseases is elusive due to the presence of asymptomatic or mildly symptomatic infections and variable testing capacity. IFR estimates are also affected by region-specific differences in sampling regimes, demographics, and healthcare resources. Methods: Here we present a novel regression approach using population testing and readily available case fatality rates (CFR) to estimate the IFR during an outbreak. The approach is based on few assumptions and can be used for a wide range of emerging diseases. We validate the use of the method using commonly reported COVID-19 testing data. Results: Our new statistical approach reveals a conservative global IFR of 0.90 % (CI: 0.70 %, 1.16 %) for COVID-19 across the 139 countries affected before May 2020. Deviation of countries’ reported CFR from the estimator did not correlate with demography, per capita GDP, or healthcare access and quality, suggesting variation is due to differing testing regimes or reporting guidelines by country. Conclusions: This method can be used retrospectively or for future disease outbreaks when other data are limited. 
  4. The contributions of asymptomatic infections to herd immunity and community transmission are key to the resurgence and control of COVID-19, but are difficult to estimate using current models that ignore changes in testing capacity. Using a model that incorporates daily testing information fit to the case and serology data from New York City, we show that the proportion of symptomatic cases is low, ranging from 13 to 18%, and that the reproductive number may be larger than often assumed. Asymptomatic infections contribute substantially to herd immunity, and to community transmission together with presymptomatic ones. If asymptomatic infections transmit at similar rates as symptomatic ones, the overall reproductive number across all classes is larger than often assumed, with estimates ranging from 3.2 to 4.4. If they transmit poorly, then symptomatic cases have a larger reproductive number ranging from 3.9 to 8.1. Even in this regime, presymptomatic and asymptomatic cases together comprise at least 50% of the force of infection at the outbreak peak. We find no regimes in which all infection subpopulations have reproductive numbers lower than three. These findings elucidate the uncertainty that current case and serology data cannot resolve, despite consideration of different model structures. They also emphasize how temporal data on testing can reduce and better define this uncertainty, as we move forward through longer surveillance and second epidemic waves. Complementary information is required to determine the transmissibility of asymptomatic cases, which we discuss. Regardless, current assumptions about the basic reproductive number of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) should be reconsidered. 
  5. The evolution of the COVID-19 pandemic is described through a time-dependent stochastic dynamic model in discrete time. The proposed multi-compartment model is expressed through a system of difference equations. Information on the social distancing measures and diagnostic testing rates are incorporated to characterize the dynamics of the various compartments of the model. In contrast with conventional epidemiological models, the proposed model involves interpretable temporally static and dynamic epidemiological rate parameters. A model fitting strategy built upon nonparametric smoothing is employed for estimating the time-varying parameters, while profiling over the time-independent parameters. Confidence bands of the parameters are obtained through a residual bootstrap procedure. A key feature of the methodology is its ability to estimate latent unobservable compartments such as the number of asymptomatic but infected individuals who are known to be the key vectors of COVID-19 spread. The nature of the disease dynamics is further quantified by relevant epidemiological markers that make use of the estimates of latent compartments. The methodology is applied to understand the true extent and dynamics of the pandemic in various states within the United States (US). 