This content will become publicly available on December 1, 2025
Title: Evaluation of FluSight influenza forecasting in the 2021–22 and 2022–23 seasons with a new target: laboratory-confirmed influenza hospitalizations
Abstract Accurate forecasts can enable more effective public health responses during seasonal influenza epidemics. For the 2021–22 and 2022–23 influenza seasons, 26 forecasting teams provided national and jurisdiction-specific probabilistic predictions of weekly confirmed influenza hospital admissions for one to four weeks ahead. Forecast skill is evaluated using the Weighted Interval Score (WIS), relative WIS, and coverage. Six out of 23 models outperform the baseline model across forecast weeks and locations in 2021–22, and 12 out of 18 models in 2022–23. Averaging across all forecast targets, the FluSight ensemble is the 2nd most accurate model measured by WIS in 2021–22 and the 5th most accurate in the 2022–23 season. Forecast skill and 95% coverage for the FluSight ensemble and most component models degrade over longer forecast horizons. In this work we demonstrate that while the FluSight ensemble was a robust predictor, even ensembles face challenges during periods of rapid change.
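The Weighted Interval Score reported here averages penalized interval widths over a set of central prediction intervals, plus an absolute-error term for the median. A minimal Python sketch of that scoring rule (function names and the interval layout are illustrative, not taken from the FluSight codebase):

```python
def interval_score(lower, upper, y, alpha):
    """Interval score for a central (1 - alpha) prediction interval:
    width plus a penalty of 2/alpha per unit the observation falls outside."""
    penalty = 0.0
    if y < lower:
        penalty = (2 / alpha) * (lower - y)
    elif y > upper:
        penalty = (2 / alpha) * (y - upper)
    return (upper - lower) + penalty

def weighted_interval_score(median, intervals, y):
    """WIS from a predictive median and central intervals.

    intervals: list of (alpha, lower, upper) tuples, one per central
    (1 - alpha) prediction interval.
    """
    total = 0.5 * abs(y - median)  # weight 1/2 on the median term
    for alpha, lower, upper in intervals:
        total += (alpha / 2) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)
```

Lower WIS is better; the relative WIS used in the evaluation then rescales a model's average WIS against the baseline model's average on the same targets.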
Lopez, Velma K; Cramer, Estee Y; Pagano, Robert; Drake, John M; O’Dea, Eamon B; Adee, Madeline; Ayer, Turgay; Chhatwal, Jagpreet; Dalgic, Ozden O; Ladd, Mary A; et al
(PLOS Computational Biology)
Larremore, Daniel B (Ed.)
During the COVID-19 pandemic, forecasting COVID-19 trends to support planning and response was a priority for scientists and decision makers alike. In the United States, COVID-19 forecasting was coordinated by a large group of universities, companies, and government entities led by the Centers for Disease Control and Prevention and the US COVID-19 Forecast Hub (https://covid19forecasthub.org). We evaluated approximately 9.7 million forecasts of weekly state-level COVID-19 cases for predictions 1–4 weeks into the future submitted by 24 teams from August 2020 to December 2021. We assessed coverage of central prediction intervals and weighted interval scores (WIS), adjusting for missing forecasts relative to a baseline forecast, and used a Gaussian generalized estimating equation (GEE) model to evaluate differences in skill across epidemic phases that were defined by the effective reproduction number. Overall, we found high variation in skill across individual models, with ensemble-based forecasts outperforming other approaches. Forecast skill relative to the baseline was generally higher for larger jurisdictions (e.g., states compared to counties). Over time, forecasts generally performed worst in periods of rapid changes in reported cases (either in increasing or decreasing epidemic phases) with 95% prediction interval coverage dropping below 50% during the growth phases of the winter 2020, Delta, and Omicron waves. Ideally, case forecasts could serve as a leading indicator of changes in transmission dynamics. However, while most COVID-19 case forecasts outperformed a naïve baseline model, even the most accurate case forecasts were unreliable in key phases. Further research could improve forecasts of leading indicators, like COVID-19 cases, by leveraging additional real-time data, addressing performance across phases, improving the characterization of forecast confidence, and ensuring that forecasts were coherent across spatial scales. 
In the meantime, it is critical for forecast users to appreciate current limitations and use a broad set of indicators to inform pandemic-related decision making.
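The 95% prediction-interval coverage referenced above is simply the share of observations that land inside the nominal interval; a well-calibrated 95% interval should cover about 95% of outcomes. A minimal sketch (names are illustrative):

```python
def pi_coverage(lowers, uppers, observations):
    """Fraction of observations falling inside [lower, upper]."""
    hits = sum(1 for lo, hi, y in zip(lowers, uppers, observations)
               if lo <= y <= hi)
    return hits / len(observations)
```

Coverage dropping below 50% for a nominal 95% interval, as during the growth phases noted above, indicates severely overconfident forecasts.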
Background: Short-term forecasts of infectious disease burden can contribute to situational awareness and aid capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise the predictive performance of such forecasts if multiple models are combined into an ensemble. Here, we report on the performance of ensembles in predicting COVID-19 cases and deaths across Europe between 08 March 2021 and 07 March 2022. Methods: We used open-source tools to develop a public European COVID-19 Forecast Hub. We invited groups globally to contribute weekly forecasts for COVID-19 cases and deaths reported by a standardised source for 32 countries over the next 1–4 weeks. Teams submitted forecasts from March 2021 using standardised quantiles of the predictive distribution. Each week we created an ensemble forecast, where each predictive quantile was calculated as the equally-weighted average (initially the mean and then from 26th July the median) of all individual models’ predictive quantiles. We measured the performance of each model using the relative Weighted Interval Score (WIS), comparing models’ forecast accuracy relative to all other models. We retrospectively explored alternative methods for ensemble forecasts, including weighted averages based on models’ past predictive performance. Results: Over 52 weeks, we collected forecasts from 48 unique models. We evaluated 29 models’ forecast scores in comparison to the ensemble model. We found a weekly ensemble had a consistently strong performance across countries over time. Across all horizons and locations, the ensemble performed better on relative WIS than 83% of participating models’ forecasts of incident cases (with a total N=886 predictions from 23 unique models), and 91% of participating models’ forecasts of deaths (N=763 predictions from 20 models).
Across a 1–4 week time horizon, ensemble performance declined with longer forecast periods when forecasting cases, but remained stable over 4 weeks for incident death forecasts. In every forecast across 32 countries, the ensemble outperformed most contributing models when forecasting either cases or deaths, frequently outperforming all of its individual component models. Among several choices of ensemble methods, we found that the most influential choice was to use a median average of models instead of the mean, regardless of how component forecast models were weighted. Conclusions: Our results support combining forecasts from individual models into an ensemble to improve predictive performance across epidemiological targets and populations during infectious disease epidemics. Our findings further suggest that median ensemble methods yield better predictive performance than those based on means. Our findings also highlight that forecast consumers should place more weight on incident death forecasts than incident case forecasts at forecast horizons greater than 2 weeks. Funding: AA, BH, BL, LWa, MMa, PP, SV funded by National Institutes of Health (NIH) Grant 1R01GM109718, NSF BIG DATA Grant IIS-1633028, NSF Grant No.: OAC-1916805, NSF Expeditions in Computing Grant CCF-1918656, CCF-1917819, NSF RAPID CNS-2028004, NSF RAPID OAC-2027541, US Centers for Disease Control and Prevention 75D30119C05935, a grant from Google, University of Virginia Strategic Investment Fund award number SIF160, Defense Threat Reduction Agency (DTRA) under Contract No. HDTRA1-19-D-0007, and respectively Virginia Dept of Health Grant VDH-21-501-0141, VDH-21-501-0143, VDH-21-501-0147, VDH-21-501-0145, VDH-21-501-0146, VDH-21-501-0142, VDH-21-501-0148. AF, AMa, GL funded by SMIGE - Modelli statistici inferenziali per governare l'epidemia, FISR 2020-Covid-19 I Fase, FISR2020IP-00156, Codice Progetto: PRJ-0695.
AM, BK, FD, FR, JK, JN, JZ, KN, MG, MR, MS, RB funded by Ministry of Science and Higher Education of Poland with grant 28/WFSN/2021 to the University of Warsaw. BRe, CPe, JLAz funded by Ministerio de Sanidad/ISCIII. BT, PG funded by PERISCOPE European H2020 project, contract number 101016233. CP, DL, EA, MC, SA funded by European Commission - Directorate-General for Communications Networks, Content and Technology through the contract LC-01485746, and Ministerio de Ciencia, Innovacion y Universidades and FEDER, with the project PGC2018-095456-B-I00. DE., MGu funded by Spanish Ministry of Health / REACT-UE (FEDER). DO, GF, IMi, LC funded by Laboratory Directed Research and Development program of Los Alamos National Laboratory (LANL) under project number 20200700ER. DS, ELR, GG, NGR, NW, YW funded by National Institutes of General Medical Sciences (R35GM119582; the content is solely the responsibility of the authors and does not necessarily represent the official views of NIGMS or the National Institutes of Health). FB, FP funded by InPresa, Lombardy Region, Italy. HG, KS funded by European Centre for Disease Prevention and Control. IV funded by Agencia de Qualitat i Avaluacio Sanitaries de Catalunya (AQuAS) through contract 2021-021OE. JDe, SMo, VP funded by Netzwerk Universitatsmedizin (NUM) project egePan (01KX2021). JPB, SH, TH funded by Federal Ministry of Education and Research (BMBF; grant 05M18SIA). KH, MSc, YKh funded by Project SaxoCOV, funded by the German Free State of Saxony. Presentation of data, model results and simulations also funded by the NFDI4Health Task Force COVID-19 (https://www.nfdi4health.de/task-force-covid-19-2) within the framework of a DFG-project (LO-342/17-1). 
LP, VE funded by Mathematical and Statistical modelling project (MUNI/A/1615/2020), Online platform for real-time monitoring, analysis and management of epidemic situations (MUNI/11/02202001/2020); VE also supported by RECETOX research infrastructure (Ministry of Education, Youth and Sports of the Czech Republic: LM2018121), the CETOCOEN EXCELLENCE (CZ.02.1.01/0.0/0.0/17-043/0009632), RECETOX RI project (CZ.02.1.01/0.0/0.0/16-013/0001761). NIB funded by Health Protection Research Unit (grant code NIHR200908). SAb, SF funded by Wellcome Trust (210758/Z/18/Z).
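The hub's quantile-averaging step described in the Methods (an equally weighted mean initially, a median from late July) can be sketched as below; the data layout and function name are assumptions for illustration, not the hub's actual implementation:

```python
from statistics import mean, median

def ensemble_quantiles(model_quantiles, method="median"):
    """Combine per-model predictive quantiles level by level.

    model_quantiles: dict mapping model name -> {quantile_level: value}.
    Returns one combined value per quantile level.
    """
    levels = next(iter(model_quantiles.values())).keys()
    combine = mean if method == "mean" else median
    return {q: combine([m[q] for m in model_quantiles.values()])
            for q in levels}
```

The median's robustness to a single outlying model is what the stated preference for median ensembles reflects: one extreme forecast shifts the mean but leaves the median untouched.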
Venkatramanan, Srinivasan; Sadilek, Adam; Fadikar, Arindam; Barrett, Christopher L.; Biggerstaff, Matthew; Chen, Jiangzhuo; Dotiwalla, Xerxes; Eastham, Paul; Gipson, Bryant; Higdon, Dave; et al
(Nature Communications)
Abstract Human mobility is a primary driver of infectious disease spread. However, existing data is limited in availability, coverage, granularity, and timeliness. Data-driven forecasts of disease dynamics are crucial for decision-making by health officials and private citizens alike. In this work, we focus on a machine-learned anonymized mobility map (hereon referred to as AMM) aggregated over hundreds of millions of smartphones and evaluate its utility in forecasting epidemics. We factor AMM into a metapopulation model to retrospectively forecast influenza in the USA and Australia. We show that the AMM model performs on-par with those based on commuter surveys, which are sparsely available and expensive. We also compare it with gravity and radiation based models of mobility, and find that the radiation model’s performance is quite similar to AMM and commuter flows. Additionally, we demonstrate our model’s ability to predict disease spread even across state boundaries. Our work contributes towards developing timely infectious disease forecasting at a global scale using human mobility datasets, expanding their applications in the area of infectious disease epidemiology.
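The radiation model compared against AMM above predicts flows from populations alone, with no fitted parameters: the expected flux from i to j depends on the origin and destination populations and the population living between them. A minimal sketch of that closed form (variable names are illustrative):

```python
def radiation_flux(outflow_i, pop_i, pop_j, pop_between):
    """Expected commuter flux i -> j under the radiation model:
    T_ij = O_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij)),
    where s_ij is the population within distance r_ij of i,
    excluding the populations of i and j themselves."""
    m, n, s = pop_i, pop_j, pop_between
    return outflow_i * (m * n) / ((m + s) * (m + n + s))
```

With no intervening population (s_ij = 0) and equal origin and destination populations, half of i's outflow goes to j; as s_ij grows, intervening opportunities absorb an increasing share of the flux.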
Kim, H.; Ham, Y. G.; Joo, Y. S.; Son, S. W.
(Nature Communications)
Abstract Producing accurate weather prediction beyond two weeks is an urgent challenge due to its ever-increasing socioeconomic value. The Madden-Julian Oscillation (MJO), a planetary-scale tropical convective system, serves as a primary source of global subseasonal (i.e., targeting three to four weeks) predictability. During the past decades, operational forecasting systems have improved substantially, while the MJO prediction skill has not yet reached its potential predictability, partly due to the systematic errors caused by imperfect numerical models. Here, to improve the MJO prediction skill, we blend the state-of-the-art dynamical forecasts and observations with a Deep Learning bias correction method. With Deep Learning bias correction, multi-model forecast errors in MJO amplitude and phase averaged over four weeks are significantly reduced by about 90% and 77%, respectively. Most models show the greatest improvement for MJO events starting from the Indian Ocean and crossing the Maritime Continent.
Abatzoglou, John T.; McEvoy, Daniel J.; Nauslar, Nicholas J.; Hegewisch, Katherine C.; Huntington, Justin L.
(Atmospheric Science Letters)
Abstract The increasing complexity and impacts of fire seasons in the United States have prompted efforts to improve early warning systems for wildland fire management. Outlooks of potential fire activity at lead‐times of several weeks can help in wildland fire resource allocation as well as complement short‐term meteorological forecasts for ongoing fire events. Here, we describe an experimental system for developing downscaled ensemble‐based subseasonal forecasts for the contiguous US using NCEP's operational Climate Forecast System version 2 model. These forecasts are used to calculate forecasted fire danger indices from the United States (US) National Fire Danger Rating System in addition to forecasts of evaporative demand. We further illustrate the skill of subseasonal forecasts on weekly timescales using hindcasts from 2011 to 2021. Results show that while forecast skill degrades with time, statistically significant week 3 correlative skill was found for 76% and 30% of the contiguous US for Energy Release Component and evaporative demand, respectively. These results highlight the potential value of experimental subseasonal forecasts in complementing existing information streams in weekly‐to‐monthly fire business decision making for suppression‐based decisions and geographic reallocation of resources during the fire season, as well as for proactive fire management actions outside of the core fire season.
@article{osti_10581288,
place = {Country unknown/Code not available},
title = {Evaluation of FluSight influenza forecasting in the 2021–22 and 2022–23 seasons with a new target: laboratory-confirmed influenza hospitalizations},
url = {https://par.nsf.gov/biblio/10581288},
DOI = {10.1038/s41467-024-50601-9},
abstractNote = {Abstract Accurate forecasts can enable more effective public health responses during seasonal influenza epidemics. For the 2021–22 and 2022–23 influenza seasons, 26 forecasting teams provided national and jurisdiction-specific probabilistic predictions of weekly confirmed influenza hospital admissions for one to four weeks ahead. Forecast skill is evaluated using the Weighted Interval Score (WIS), relative WIS, and coverage. Six out of 23 models outperform the baseline model across forecast weeks and locations in 2021–22, and 12 out of 18 models in 2022–23. Averaging across all forecast targets, the FluSight ensemble is the 2nd most accurate model measured by WIS in 2021–22 and the 5th most accurate in the 2022–23 season. Forecast skill and 95% coverage for the FluSight ensemble and most component models degrade over longer forecast horizons. In this work we demonstrate that while the FluSight ensemble was a robust predictor, even ensembles face challenges during periods of rapid change.},
journal = {Nature Communications},
volume = {15},
number = {1},
publisher = {Nature Communications},
author = {Mathis, Sarabeth M and Webber, Alexander E and León, Tomás M and Murray, Erin L and Sun, Monica and White, Lauren A and Brooks, Logan C and Green, Alden and Hu, Addison J and Rosenfeld, Roni and Shemetov, Dmitry and Tibshirani, Ryan J and McDonald, Daniel J and Kandula, Sasikiran and Pei, Sen and Yaari, Rami and Yamana, Teresa K and Shaman, Jeffrey and Agarwal, Pulak and Balusu, Srikar and Gururajan, Gautham and Kamarthi, Harshavardhan and Prakash, B Aditya and Raman, Rishi and Zhao, Zhiyuan and Rodríguez, Alexander and Meiyappan, Akilan and Omar, Shalina and Baccam, Prasith and Gurung, Heidi L and Suchoski, Brad T and Stage, Steve A and Ajelli, Marco and Kummer, Allisandra G and Litvinova, Maria and Ventura, Paulo C and Wadsworth, Spencer and Niemi, Jarad and Carcelen, Erica and Hill, Alison L and Loo, Sara L and McKee, Clifton D and Sato, Koji and Smith, Claire and Truelove, Shaun and Jung, Sung-mok and Lemaitre, Joseph C and Lessler, Justin and McAndrew, Thomas and Ye, Wenxuan and Bosse, Nikos and Hlavacek, William S and Lin, Yen Ting and Mallela, Abhishek and Gibson, Graham C and Chen, Ye and Lamm, Shelby M and Lee, Jaechoul and Posner, Richard G and Perofsky, Amanda C and Viboud, Cécile and Clemente, Leonardo and Lu, Fred and Meyer, Austin G and Santillana, Mauricio and Chinazzi, Matteo and Davis, Jessica T and Mu, Kunpeng and Pastore_y_Piontti, Ana and Vespignani, Alessandro and Xiong, Xinyue and Ben-Nun, Michal and Riley, Pete and Turtle, James and Hulme-Lowe, Chis and Jessa, Shakeel and Nagraj, V P and Turner, Stephen D and Williams, Desiree and Basu, Avranil and Drake, John M and Fox, Spencer J and Suez, Ehsan and Cojocaru, Monica G and Thommes, Edward W and Cramer, Estee Y and Gerding, Aaron and Stark, Ariane and Ray, Evan L and Reich, Nicholas G and Shandross, Li and Wattanachit, Nutcha and Wang, Yijin and Zorn, Martha W and Aawar, Majd Al and Srivastava, Ajitesh and Meyers, Lauren A and Adiga, Aniruddha and Hurt, Benjamin and Kaur, Gursharn 
and Lewis, Bryan L and Marathe, Madhav and Venkatramanan, Srinivasan and Butler, Patrick and Farabow, Andrew and Ramakrishnan, Naren and Muralidhar, Nikhil and Reed, Carrie and Biggerstaff, Matthew and Borchering, Rebecca K},
}