Simple dynamic modeling tools can help generate real-time short-term forecasts with quantified uncertainty of the trajectory of diverse growth processes unfolding in nature and society, including disease outbreaks. An easy-to-use and flexible toolbox for this purpose is lacking. This tutorial-based primer introduces and illustrates
Dynamical mathematical models defined by a system of differential equations are typically not easily accessible to non-experts. However, forecasts based on these types of models can help gain insights into the mechanisms driving the process and may outcompete simpler phenomenological growth models. Here we introduce a friendly toolbox,
This tutorial-based primer introduces and illustrates a user-friendly MATLAB toolbox for fitting and forecasting time-series trajectories using an ensemble spatial wave sub-epidemic model based on ordinary differential equations. Scientists, policymakers, and students can use the toolbox to conduct real-time short-term forecasts. The five-parameter epidemic wave model in the toolbox aggregates linked overlapping sub-epidemics and captures a rich spectrum of epidemic wave dynamics, including oscillatory wave behavior and plateaus. An ensemble strategy aims to improve forecasting performance by combining the resulting top-ranked models. The toolbox provides a tutorial for forecasting time-series trajectories, including the full uncertainty distribution derived through parametric bootstrapping, which is needed to construct prediction intervals and evaluate their accuracy. Functions are available to assess forecasting performance, estimation methods, error structures in the data, and forecasting horizons. The toolbox also includes functions to quantify forecasting performance using metrics that evaluate point and distributional forecasts, including the weighted interval score.
We have developed the first comprehensive toolbox to characterize and forecast time-series data using an ensemble spatial wave sub-epidemic wave model. As an epidemic situation or contagion occurs, the tools presented in this tutorial can facilitate policymakers to guide the implementation of containment strategies and assess the impact of control interventions. We demonstrate the functionality of the toolbox with examples, including a tutorial video, and is illustrated using daily data on the COVID-19 pandemic in the USA.
- PAR ID:
- 10513072
- Publisher / Repository:
- Springer Science + Business Media
- Date Published:
- Journal Name:
- BMC Medical Research Methodology
- Volume:
- 24
- Issue:
- 1
- ISSN:
- 1471-2288
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract GrowthPredict , a user-friendly MATLAB toolbox for fitting and forecasting time-series trajectories using phenomenological dynamic growth models based on ordinary differential equations. This toolbox is accessible to a broad audience, including students training in mathematical biology, applied statistics, and infectious disease modeling, as well as researchers and policymakers who need to conduct short-term forecasts in real-time. The models included in the toolbox capture exponential and sub-exponential growth patterns that typically follow a rising pattern followed by a decline phase, a common feature of contagion processes. Models include the 1-parameter exponential growth model and the 2-parameter generalized-growth model, which have proven useful in characterizing and forecasting the ascending phase of epidemic outbreaks. It also includes the 2-parameter Gompertz model, the 3-parameter generalized logistic-growth model, and the 3-parameter Richards model, which have demonstrated competitive performance in forecasting single peak outbreaks. We provide detailed guidance on forecasting time-series trajectories and available software (https://github.com/gchowell/forecasting_growthmodels ), including the full uncertainty distribution derived through parametric bootstrapping, which is needed to construct prediction intervals and evaluate their accuracy. Functions are available to assess forecasting performance across different models, estimation methods, error structures in the data, and forecasting horizons. The toolbox also includes functions to quantify forecasting performance using metrics that evaluate point and distributional forecasts, including the weighted interval score. This tutorial and toolbox can be broadly applied to characterizing and forecasting time-series data using simple phenomenological growth models. As a contagion process takes off, the tools presented in this tutorial can help create forecasts to guide policy regarding implementing control strategies and assess the impact of interventions. The toolbox functionality is demonstrated through various examples, including a tutorial video, and the examples use publicly available data on the monkeypox (mpox) epidemic in the USA. -
During the 2022–2023 unprecedented mpox epidemic, near real-time short-term forecasts of the epidemic’s trajectory were essential in intervention implementation and guiding policy. However, as case levels have significantly decreased, evaluating model performance is vital to advancing the field of epidemic forecasting. Using laboratory-confirmed mpox case data from the Centers for Disease Control and Prevention and Our World in Data teams, we generated retrospective sequential weekly forecasts for Brazil, Canada, France, Germany, Spain, the United Kingdom, the United States and at the global scale using an auto-regressive integrated moving average (ARIMA) model, generalized additive model, simple linear regression, Facebook’s Prophet model, as well as the sub-epidemic wave and
n -sub-epidemic modelling frameworks. We assessed forecast performance using average mean squared error, mean absolute error, weighted interval scores, 95% prediction interval coverage, skill scores and Winkler scores. Overall, then -sub-epidemic modelling framework outcompeted other models across most locations and forecasting horizons, with the unweighted ensemble model performing best most frequently. Then -sub-epidemic and spatial-wave frameworks considerably improved in average forecasting performance relative to the ARIMA model (greater than 10%) for all performance metrics. Findings further support sub-epidemic frameworks for short-term forecasting epidemics of emerging and re-emerging infectious diseases. -
null (Ed.)Abstract Background Ensemble modeling aims to boost the forecasting performance by systematically integrating the predictive accuracy across individual models. Here we introduce a simple-yet-powerful ensemble methodology for forecasting the trajectory of dynamic growth processes that are defined by a system of non-linear differential equations with applications to infectious disease spread. Methods We propose and assess the performance of two ensemble modeling schemes with different parametric bootstrapping procedures for trajectory forecasting and uncertainty quantification. Specifically, we conduct sequential probabilistic forecasts to evaluate their forecasting performance using simple dynamical growth models with good track records including the Richards model, the generalized-logistic growth model, and the Gompertz model. We first test and verify the functionality of the method using simulated data from phenomenological models and a mechanistic transmission model. Next, the performance of the method is demonstrated using a diversity of epidemic datasets including scenario outbreak data of the Ebola Forecasting Challenge and real-world epidemic data outbreaks of including influenza, plague, Zika, and COVID-19. Results We found that the ensemble method that randomly selects a model from the set of individual models for each time point of the trajectory of the epidemic frequently outcompeted the individual models as well as an alternative ensemble method based on the weighted combination of the individual models and yields broader and more realistic uncertainty bounds for the trajectory envelope, achieving not only better coverage rate of the 95% prediction interval but also improved mean interval scores across a diversity of epidemic datasets. Conclusion Our new methodology for ensemble forecasting outcompete component models and an alternative ensemble model that differ in how the variance is evaluated for the generation of the prediction intervals of the forecasts.more » « less
-
Abstract Background Beginning May 7, 2022, multiple nations reported an unprecedented surge in monkeypox cases. Unlike past outbreaks, differences in affected populations, transmission mode, and clinical characteristics have been noted. With the existing uncertainties of the outbreak, real-time short-term forecasting can guide and evaluate the effectiveness of public health measures.
Methods We obtained publicly available data on confirmed weekly cases of monkeypox at the global level and for seven countries (with the highest burden of disease at the time this study was initiated) from the Our World in Data (OWID) GitHub repository and CDC website. We generated short-term forecasts of new cases of monkeypox across the study areas using an ensemble n-sub-epidemic modeling framework based on weekly cases using 10-week calibration periods. We report and assess the weekly forecasts with quantified uncertainty from the top-ranked, second-ranked, and ensemble sub-epidemic models. Overall, we conducted 324 weekly sequential 4-week ahead forecasts across the models from the week of July 28th, 2022, to the week of October 13th, 2022.
Results The last 10 of 12 forecasting periods (starting the week of August 11th, 2022) show either a plateauing or declining trend of monkeypox cases for all models and areas of study. According to our latest 4-week ahead forecast from the top-ranked model, a total of 6232 (95% PI 487.8, 12,468.0) cases could be added globally from the week of 10/20/2022 to the week of 11/10/2022. At the country level, the top-ranked model predicts that the USA will report the highest cumulative number of new cases for the 4-week forecasts (median based on OWID data: 1806 (95% PI 0.0, 5544.5)). The top-ranked and weighted ensemble models outperformed all other models in short-term forecasts.
Conclusions Our top-ranked model consistently predicted a decreasing trend in monkeypox cases on the global and country-specific scale during the last ten sequential forecasting periods. Our findings reflect the potential impact of increased immunity, and behavioral modification among high-risk populations.
-
Abstract Producing high-quality forecasts of key climate variables, such as temperature and precipitation, on subseasonal time scales has long been a gap in operational forecasting. This study explores an application of machine learning (ML) models as postprocessing tools for subseasonal forecasting. Lagged numerical ensemble forecasts (i.e., an ensemble where the members have different initialization dates) and observational data, including relative humidity, pressure at sea level, and geopotential height, are incorporated into various ML methods to predict monthly average precipitation and 2-m temperature 2 weeks in advance for the continental United States. For regression, quantile regression, and tercile classification tasks, we consider using linear models, random forests, convolutional neural networks, and stacked models (a multimodel approach based on the prediction of the individual ML models). Unlike previous ML approaches that often use ensemble mean alone, we leverage information embedded in the ensemble forecasts to enhance prediction accuracy. Additionally, we investigate extreme event predictions that are crucial for planning and mitigation efforts. Considering ensemble members as a collection of spatial forecasts, we explore different approaches to using spatial information. Trade-offs between different approaches may be mitigated with model stacking. Our proposed models outperform standard baselines such as climatological forecasts and ensemble means. In addition, we investigate feature importance, trade-offs between using the full ensemble or only the ensemble mean, and different modes of accounting for spatial variability.
Significance Statement Accurately forecasting temperature and precipitation on subseasonal time scales—2 weeks–2 months in advance—is extremely challenging. These forecasts would have immense value in agriculture, insurance, and economics. Our paper describes an application of machine learning techniques to improve forecasts of monthly average precipitation and 2-m temperature using lagged physics-based predictions and observational data 2 weeks in advance for the entire continental United States. For lagged ensembles, the proposed models outperform standard benchmarks such as historical averages and averages of physics-based predictions. Our findings suggest that utilizing the full set of physics-based predictions instead of the average enhances the accuracy of the final forecast.