

This content will become publicly available on December 1, 2025

Title: Investigating and forecasting infectious disease dynamics using epidemiological and molecular surveillance data
The integration of viral genomic data into public health surveillance has revolutionized our ability to track and forecast infectious disease dynamics. This review addresses two critical aspects of infectious disease forecasting and monitoring: the methodological workflow for epidemic forecasting and the transformative role of molecular surveillance. We first present a detailed approach for validating epidemic models, emphasizing an iterative workflow that utilizes ordinary differential equation (ODE)-based models to investigate and forecast disease dynamics. We recommend a more structured approach to model validation, systematically addressing key stages such as model calibration, assessment of structural and practical parameter identifiability, and effective uncertainty propagation in forecasts. Furthermore, we underscore the importance of incorporating multiple data streams by applying both simulated and real epidemiological data from the COVID-19 pandemic to produce more reliable forecasts with quantified uncertainty. Additionally, we emphasize the pivotal role of viral genomic data in tracking transmission dynamics and pathogen evolution. By leveraging advanced computational tools such as Bayesian phylogenetics and phylodynamics, researchers can more accurately estimate transmission clusters and reconstruct outbreak histories, thereby improving data-driven modeling and forecasting and informing targeted public health interventions. Finally, we discuss the transformative potential of integrating molecular epidemiology with mathematical modeling to complement and enhance epidemic forecasting and optimize public health strategies.
Award ID(s):
2415564 2412914
PAR ID:
10574720
Author(s) / Creator(s):
;
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
Physics of Life Reviews
Volume:
51
Issue:
C
ISSN:
1571-0645
Page Range / eLocation ID:
294 to 327
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Prompt surveillance and forecasting of COVID-19 spread are of critical importance for slowing down the pandemic and for the success of any public mitigation efforts. However, as with any infectious disease with rapid transmission and high virulence, the lack of COVID-19 observations for near-real-time forecasting is still the key challenge obstructing operational disease prediction and control. In this context, there are two approaches to forecasting COVID-19 dynamics: one based on mechanistic models and one based on machine learning. Mechanistic models are better at capturing an epidemiological curve from a small amount of data and at describing the overall trajectory of the disease dynamics, hence providing long-term insights into where the disease might go. Machine learning, in turn, can provide more precise data-driven forecasts, especially over short-term horizons, while suffering from limited interpretability and usually requiring a backlog of history on the infectious disease. We propose a unified reinforcement learning framework that combines the two approaches. That is, long-term trajectory forecasts are used in machine learning techniques to forecast local variability which is not captured by the mechanistic model.
  2. Adrish, Muhammad (Ed.)
    Mexico has experienced one of the highest COVID-19 mortality rates in the world. A delayed implementation of social distancing interventions in late March 2020 and a phased reopening of the country in June 2020 have facilitated sustained disease transmission in the region. In this study we systematically generate and compare 30-day ahead forecasts using previously validated growth models based on mortality trends from the Institute for Health Metrics and Evaluation for Mexico and Mexico City in near real-time. Moreover, we estimate reproduction numbers for SARS-CoV-2 using methods that rely on genomic data as well as case incidence data. Subsequently, functional data analysis techniques are utilized to analyze the shapes of COVID-19 growth rate curves at the state level to characterize the spatiotemporal transmission patterns of SARS-CoV-2. Early estimates of the reproduction number for Mexico from the genomic and case incidence data were R_t ~1.1–1.3. Moreover, the mean estimate of R_t fluctuated around ~1.0 from late July until the end of September 2020. The spatial analysis characterizes the state-level dynamics of COVID-19 into four groups with distinct epidemic trajectories based on epidemic growth rates. Our results show that the sequential mortality forecasts from the GLM and Richards model predict a downward trend in the number of deaths for all thirteen forecast periods for Mexico and Mexico City. However, the sub-epidemic and IHME models perform better, predicting a more realistic stable trajectory of COVID-19 mortality trends for the last three forecast periods (09/21-10/21, 09/28-10/27, 09/28-10/27) for Mexico and Mexico City. Our findings indicate that phenomenological models are useful tools for short-term epidemic forecasting, although forecasts need to be interpreted with caution given the dynamic implementation and lifting of social distancing measures.
  3. The coronavirus disease 2019 (COVID-19) pandemic caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has placed epidemic modeling at the center of attention of public policymaking. Predicting the severity and speed of transmission of COVID-19 is crucial to resource management and developing strategies to deal with this epidemic. Based on the available data from current and previous outbreaks, many efforts have been made to develop epidemiological models, including statistical models, computer simulations, mathematical representations of the virus and its impacts, and many more. Despite the usefulness of these models, modeling and forecasting the spread of COVID-19 remains a challenge. In this article, we give an overview of the unique features and issues of COVID-19 data and how they impact epidemic modeling and projection. In addition, we illustrate how various models could be connected to each other. Moreover, we provide new data science perspectives on the challenges of COVID-19 forecasting, from data collection, curation, and validation to the limitations of models, as well as the uncertainty of the forecast. Finally, we discuss some data science practices that are crucial to more robust and accurate epidemic forecasting.
  4. Traditional health surveillance methods play a critical role in public health safety but are limited by the data collection speed, coverage, and resource requirements. Wastewater‐based epidemiology (WBE) has emerged as a cost‐effective and rapid tool for detecting infectious diseases through sewage analysis of disease biomarkers. Recent advances in big data analytics have enhanced public health monitoring by enabling predictive modeling and early risk detection. This paper explores the application of machine learning (ML) in WBE data analytics, with a focus on infectious disease surveillance and forecasting. We highlight the advantages of ML‐driven WBE prediction models, including their ability to process multimodal data, predict disease trends, and evaluate policy impacts through scenario simulations. We also examine challenges such as data quality, model interpretability, and integration with existing public health infrastructure. The integration of ML WBE data analytics enables rapid health data collection, analysis, and interpretation that are not feasible in current surveillance approaches. By leveraging ML and WBE, decision makers can reduce cognitive biases and enhance data‐driven responses to public health threats. As global health risks evolve, the synergy between WBE, ML, and data‐driven decision‐making holds significant potential for improving public health outcomes.
  5. Background: Ensemble modeling aims to boost the forecasting performance by systematically integrating the predictive accuracy across individual models. Here we introduce a simple-yet-powerful ensemble methodology for forecasting the trajectory of dynamic growth processes that are defined by a system of non-linear differential equations with applications to infectious disease spread. Methods: We propose and assess the performance of two ensemble modeling schemes with different parametric bootstrapping procedures for trajectory forecasting and uncertainty quantification. Specifically, we conduct sequential probabilistic forecasts to evaluate their forecasting performance using simple dynamical growth models with good track records including the Richards model, the generalized-logistic growth model, and the Gompertz model. We first test and verify the functionality of the method using simulated data from phenomenological models and a mechanistic transmission model. Next, the performance of the method is demonstrated using a diversity of epidemic datasets including scenario outbreak data of the Ebola Forecasting Challenge and real-world epidemic outbreak data including influenza, plague, Zika, and COVID-19. Results: We found that the ensemble method that randomly selects a model from the set of individual models for each time point of the trajectory of the epidemic frequently outcompeted the individual models as well as an alternative ensemble method based on the weighted combination of the individual models, and yielded broader and more realistic uncertainty bounds for the trajectory envelope, achieving not only a better coverage rate of the 95% prediction interval but also improved mean interval scores across a diversity of epidemic datasets. Conclusion: Our new methodology for ensemble forecasting outcompetes the component models and an alternative ensemble model that differs in how the variance is evaluated for the generation of the prediction intervals of the forecasts.
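
The random-selection ensemble described in item 5 can be illustrated with a minimal sketch. This is not the authors' implementation: the parameter values are hand-picked rather than fitted, the refitting step of a parametric bootstrap is omitted, and the noise model (Gaussian with Poisson-like variance) is an assumption made only to keep the example in the standard library. The function names `rk4`, `noisy_copy`, `ensemble_by_random_selection`, and `prediction_interval` are hypothetical. The model equations use common textbook forms of the Richards, generalized-logistic, and Gompertz growth models, which may differ from the exact parameterizations in the paper.

```python
import math
import random


def rk4(f, c0, n_steps, dt=1.0, substeps=10):
    """Integrate dC/dt = f(C, t) with classical RK4; return C at t = 0, dt, ..., n_steps*dt."""
    h = dt / substeps
    c, t = c0, 0.0
    out = [c0]
    for _ in range(n_steps):
        for _ in range(substeps):
            k1 = f(c, t)
            k2 = f(c + 0.5 * h * k1, t + 0.5 * h)
            k3 = f(c + 0.5 * h * k2, t + 0.5 * h)
            k4 = f(c + h * k3, t + h)
            c += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
            t += h
        out.append(c)
    return out


# Hypothetical, hand-picked parameters (NOT fitted to any data).
MODELS = {
    # Richards: dC/dt = r C (1 - (C/K)^a)
    "richards": lambda c, t: 0.30 * c * (1.0 - (c / 1000.0) ** 0.8),
    # Generalized logistic: dC/dt = r C^p (1 - C/K)
    "generalized_logistic": lambda c, t: 0.45 * c ** 0.9 * (1.0 - c / 1000.0),
    # Gompertz: dC/dt = r C exp(-b t)
    "gompertz": lambda c, t: 0.265 * c * math.exp(-0.05 * t),
}


def noisy_copy(mean_curve, rng):
    """One simulated trajectory: Gaussian noise with Poisson-like variance (an assumption)."""
    return [max(0.0, rng.gauss(m, math.sqrt(max(m, 1e-9)))) for m in mean_curve]


def ensemble_by_random_selection(samples_by_model, n_draws, rng):
    """At each time point, pick a model at random and take one of its samples there."""
    names = list(samples_by_model)
    n_times = len(samples_by_model[names[0]][0])
    draws = []
    for _ in range(n_draws):
        traj = []
        for j in range(n_times):
            name = rng.choice(names)
            traj.append(rng.choice(samples_by_model[name])[j])
        draws.append(traj)
    return draws


def prediction_interval(draws, lower=0.025, upper=0.975):
    """Empirical (lower, upper) quantile band at each time point."""
    bands = []
    for j in range(len(draws[0])):
        col = sorted(d[j] for d in draws)
        lo = col[int(lower * (len(col) - 1))]
        hi = col[int(upper * (len(col) - 1))]
        bands.append((lo, hi))
    return bands


rng = random.Random(0)
horizon = 30  # days, starting from 5 cumulative cases
samples = {
    name: [noisy_copy(rk4(f, 5.0, horizon), rng) for _ in range(200)]
    for name, f in MODELS.items()
}
ensemble = ensemble_by_random_selection(samples, 500, rng)
bands = prediction_interval(ensemble)
```

Because each time point may draw from a different model, the resulting band reflects disagreement between models as well as within-model noise, which is one plausible reading of why the abstract reports broader uncertainty bounds for this scheme.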