

Title: A linear noise approximation for stochastic epidemic models fit to partially observed incidence counts
Abstract

Stochastic epidemic models (SEMs) fit to incidence data are critical to elucidating outbreak dynamics, shaping response strategies, and preparing for future epidemics. SEMs typically represent counts of individuals in discrete infection states using Markov jump processes (MJPs), but are computationally challenging as imperfect surveillance, lack of subject‐level information, and temporal coarseness of the data obscure the true epidemic. Analytic integration over the latent epidemic process is impossible, and integration via Markov chain Monte Carlo (MCMC) is cumbersome due to the dimensionality and discreteness of the latent state space. Simulation‐based computational approaches can address the intractability of the MJP likelihood, but are numerically fragile and prohibitively expensive for complex models. A linear noise approximation (LNA) that approximates the MJP transition density with a Gaussian density has been explored for analyzing prevalence data in large‐population settings, but requires modification for analyzing incidence counts without assuming that the data are normally distributed. We demonstrate how to reparameterize SEMs to appropriately analyze incidence data, and fold the LNA into a data augmentation MCMC framework that outperforms deterministic methods, statistically, and simulation‐based methods, computationally. Our framework is computationally robust when the model dynamics are complex and applies to a broad class of SEMs. We evaluate our method in simulations that reflect Ebola, influenza, and SARS‐CoV‐2 dynamics, and apply our method to national surveillance counts from the 2013–2015 West Africa Ebola outbreak.
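To make the data-generating setting in the abstract concrete, the following is a minimal sketch of a Markov jump process for a simple SIR model, simulated with the Gillespie algorithm, whose new infections are accumulated into weekly incidence counts and then thinned to mimic imperfect surveillance. It is not the authors' LNA method. The function name `simulate_sir_incidence`, the binomial under-reporting model, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_sir_incidence(beta, mu, s0, i0, n_weeks, rho, rng):
    """Simulate an SIR Markov jump process (Gillespie algorithm).

    Returns the true weekly incidence (new infections per week) and a
    partially observed version in which each case is detected
    independently with probability rho (an assumed surveillance model).
    """
    s, i = s0, i0
    t = 0.0
    true_incidence = np.zeros(n_weeks, dtype=int)
    while i > 0:
        infection_rate = beta * s * i   # S -> I transitions
        recovery_rate = mu * i          # I -> R transitions
        total_rate = infection_rate + recovery_rate
        t += rng.exponential(1.0 / total_rate)  # waiting time to next event
        if t >= n_weeks:
            break
        if rng.random() < infection_rate / total_rate:
            s -= 1
            i += 1
            true_incidence[int(t)] += 1  # count the new case in its week
        else:
            i -= 1
    # Imperfect surveillance: only a fraction of the true cases is reported.
    observed_incidence = rng.binomial(true_incidence, rho)
    return true_incidence, observed_incidence

# Hypothetical parameter values, time measured in weeks.
rng = np.random.default_rng(2024)
true_cases, observed_cases = simulate_sir_incidence(
    beta=0.002, mu=1.0, s0=1000, i0=5, n_weeks=20, rho=0.3, rng=rng
)
print(true_cases)
print(observed_cases)
```

Only `observed_cases` would be available to an analyst; the latent path that produced `true_cases` is what must be integrated over, which is the step the abstract describes as intractable analytically and cumbersome by MCMC.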

 
Award ID(s):
1936833
NSF-PAR ID:
10364323
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrics
Volume:
78
Issue:
4
ISSN:
0006-341X
Format(s):
Medium: X Size: p. 1530-1541
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    No versatile web app exists that allows epidemiologists and managers around the world to comprehensively analyze the impacts of COVID-19 mitigation. The http://covid-webapp.numerusinc.com/ web app presented here fills this gap.

    Methods

    Our web app uses a model that explicitly identifies susceptible, contact, latent, asymptomatic, symptomatic and recovered classes of individuals, and a parallel set of response classes, subject to lower pathogen-contact rates. The user inputs a CSV file of incidence and, if of interest, mortality rate data. A default set of parameters is available that can be overwritten through input or online entry, and a user-selected subset of these can be fitted to the model using maximum-likelihood estimation (MLE). Model fitting and forecasting intervals are specifiable and changes to parameters allow counterfactual and forecasting scenarios. Confidence or credible intervals can be generated using stochastic simulations, based on MLE values, or on an inputted CSV file containing Markov chain Monte Carlo (MCMC) estimates of one or more parameters.

    Results

    We illustrate the use of our web app in extracting social distancing, social relaxation, surveillance or virulence switching functions (i.e., time varying drivers) from the incidence and mortality rates of COVID-19 epidemics in Israel, South Africa, and England. The Israeli outbreak exhibits four distinct phases: initial outbreak, social distancing, social relaxation, and a second wave mitigation phase. An MCMC projection of this latter phase suggests the Israeli epidemic will continue to produce an average of around 1500 new cases per day into late November, unless the population practices social-relaxation measures at least 5-fold below the level in August, which itself is 4-fold below the level at the start of July. Our analysis of the relatively late South African outbreak, which became the world’s fifth largest COVID-19 epidemic in July, revealed that the decline through late July and early August was characterised by a social distancing driver operating at more than twice the per-capita applicable-disease-class (pc-adc) rate of the social relaxation driver. Our analysis of the relatively early English outbreak identified a more than 2-fold improvement in surveillance over the course of the epidemic. It also identified a pc-adc social distancing rate in early August that, though nearly four times the pc-adc social relaxation rate, appeared to barely contain a second wave that would break out if social distancing were further relaxed.

    Conclusion

    Our web app provides policy makers and health officers who have no epidemiological modelling or computer coding expertise with an invaluable tool for assessing the impacts of different outbreak mitigation policies and measures. This includes an ability to generate an epidemic-suppression or curve-flattening index that measures the intensity with which behavioural responses suppress or flatten the epidemic curve in the region under consideration.

     
  2. Throughout the course of an epidemic, the rate at which disease spreads varies with behavioral changes, the emergence of new disease variants, and the introduction of mitigation policies. Estimating such changes in transmission rates can help us better model and predict the dynamics of an epidemic, and provide insight into the efficacy of control and intervention strategies. We present a method for likelihood‐based estimation of parameters in the stochastic susceptible‐infected‐removed model under a time‐inhomogeneous transmission rate composed of piecewise constant components. In doing so, our method simultaneously learns change points in the transmission rate via a Markov chain Monte Carlo algorithm. The method targets the exact model posterior in a difficult missing data setting given only partially observed case counts over time. We validate performance on simulated data before applying our approach to data from an Ebola outbreak in Western Africa and a COVID‐19 outbreak on a university campus.

     
  3. We propose a novel Markov chain Monte‐Carlo (MCMC) method for reverse engineering the topological structure of stochastic reaction networks, a notoriously challenging problem that is relevant in many modern areas of research, such as discovering gene regulatory networks or analyzing epidemic spread. The method relies on projecting the original time series trajectories from the stochastic data generating process onto information‐rich summary statistics and constructing the appropriate synthetic likelihood function to estimate reaction rates. The resulting estimates are consistent in the large volume limit and are obtained without the complicated tuning strategies and expensive resampling typically used by likelihood‐free MCMC and approximate Bayesian methods. To illustrate the method, we apply it in two real data examples: a molecular pathway analysis with RNA‐seq and the famous incidence data from the 1665 plague outbreak at Eyam, England.

     
  4. Simple mathematical tools are needed to quantify the threat posed by emerging and re-emerging infectious disease outbreaks using minimal data capturing the outbreak trajectory. Here we use mathematical analysis, simulation and COVID-19 epidemic data to demonstrate a novel approach to numerically and mathematically characterize the rate at which the doubling time of an epidemic is changing over time. For this purpose, we analyze the dynamics of epidemic doubling times during the initial epidemic stage, defined as the sequence of times at which the cumulative incidence doubles. We introduce new methodology to characterize epidemic threats by analyzing the evolution of epidemics as a function of (1) the number of times the epidemic doubles until the epidemic peak is reached and (2) the rate at which the doubling times increase. In our doubling-time approach, the most dangerous epidemic threats double in size many times and the doubling times change at a relatively low rate (e.g., doubling times remain nearly invariant), whereas the least transmissible threats double in size only a few times and the doubling times rapidly increase in the period of emergence. We derive analytical formulas and test and illustrate our methodology using synthetic and COVID-19 epidemic data. Our mathematical analysis demonstrates that the series of epidemic doubling times increases approximately according to an exponential function with a rate that quantifies the rate of change of the doubling times. Our analytic results are in excellent agreement with numerical results. Our methodology offers a simple and intuitive approach that relies on minimal outbreak trajectory data to characterize the threat posed by emerging and re-emerging infectious diseases.
  5. Abstract Background

    Ensemble modeling aims to boost forecasting performance by systematically integrating the predictive accuracy across individual models. Here we introduce a simple-yet-powerful ensemble methodology for forecasting the trajectory of dynamic growth processes that are defined by a system of non-linear differential equations with applications to infectious disease spread.

    Methods

    We propose and assess the performance of two ensemble modeling schemes with different parametric bootstrapping procedures for trajectory forecasting and uncertainty quantification. Specifically, we conduct sequential probabilistic forecasts to evaluate their forecasting performance using simple dynamical growth models with good track records including the Richards model, the generalized-logistic growth model, and the Gompertz model. We first test and verify the functionality of the method using simulated data from phenomenological models and a mechanistic transmission model. Next, the performance of the method is demonstrated using a diversity of epidemic datasets including scenario outbreak data of the Ebola Forecasting Challenge and real-world epidemic data from outbreaks including influenza, plague, Zika, and COVID-19.

    Results

    We found that the ensemble method that randomly selects a model from the set of individual models for each time point of the trajectory of the epidemic frequently outcompeted the individual models as well as an alternative ensemble method based on the weighted combination of the individual models. It also yielded broader and more realistic uncertainty bounds for the trajectory envelope, achieving not only a better coverage rate of the 95% prediction interval but also improved mean interval scores across a diversity of epidemic datasets.

    Conclusion

    Our new methodology for ensemble forecasting outcompetes component models and an alternative ensemble model that differs in how the variance is evaluated for the generation of the prediction intervals of the forecasts.
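As an illustration of the doubling-time sequence described in item 4, the sketch below computes the times at which cumulative incidence successively doubles relative to its first value, along with the gaps between those times. It is only a rough illustration under stated assumptions, not the cited authors' procedure: the helper name `doubling_times` is hypothetical, the cumulative series is assumed to be positive and non-decreasing, and crossing times between observations are found by linear interpolation.

```python
import numpy as np

def doubling_times(times, cumulative):
    """Times at which cumulative incidence doubles, and the gaps between them.

    Assumes cumulative[0] > 0 and a non-decreasing series. Crossing times
    between observation points are linearly interpolated (an assumption).
    """
    times = np.asarray(times, dtype=float)
    cumulative = np.asarray(cumulative, dtype=float)
    threshold = 2.0 * cumulative[0]  # first doubling target
    crossings = []
    for k in range(1, len(cumulative)):
        # A single step may cross several doubling thresholds.
        while cumulative[k] >= threshold:
            c0, c1 = cumulative[k - 1], cumulative[k]
            frac = (threshold - c0) / (c1 - c0)
            crossings.append(times[k - 1] + frac * (times[k] - times[k - 1]))
            threshold *= 2.0
    crossings = np.array(crossings)
    # Gap between successive doublings, measured from the first observation.
    gaps = np.diff(np.concatenate(([times[0]], crossings)))
    return crossings, gaps

# Synthetic example: growth that slows over time, so the gaps lengthen.
days = np.arange(0, 30)
cases = 10.0 * np.exp(0.5 * np.sqrt(days) * np.log(2.0) * 2.0)
crossing_days, gaps = doubling_times(days, cases)
print(crossing_days)
print(gaps)
```

Under pure exponential growth the gaps are constant, which corresponds to the "nearly invariant" doubling times the abstract associates with the most dangerous threats; lengthening gaps indicate slowing growth.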