skip to main content

This content will become publicly available on December 1, 2023

Title: The United States COVID-19 Forecast Hub dataset
Abstract Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national, levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
Authors:
; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; more » ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; « less
Award ID(s):
2035361 1835939 1749854 2035360
Publication Date:
NSF-PAR ID:
10354905
Journal Name:
Scientific Data
Volume:
9
Issue:
1
ISSN:
2052-4463
Sponsoring Org:
National Science Foundation
More Like this
  1. Short-term probabilistic forecasts of the trajectory of the COVID-19 pandemic in the United States have served as a visible and important communication channel between the scientific modeling community and both the general public and decision-makers. Forecasting models provide specific, quantitative, and evaluable predictions that inform short-term decisions such as healthcare staffing needs, school closures, and allocation of medical supplies. Starting in April 2020, the US COVID-19 Forecast Hub ( https://covid19forecasthub.org/ ) collected, disseminated, and synthesized tens of millions of specific predictions from more than 90 different academic, industry, and independent research groups. A multimodel ensemble forecast that combined predictions from dozens of groups every week provided the most consistently accurate probabilistic forecasts of incident deaths due to COVID-19 at the state and national level from April 2020 through October 2021. The performance of 27 individual models that submitted complete forecasts of COVID-19 deaths consistently throughout this year showed high variability in forecast skill across time, geospatial units, and forecast horizons. Two-thirds of the models evaluated showed better accuracy than a naïve baseline model. Forecast accuracy degraded as models made predictions further into the future, with probabilistic error at a 20-wk horizon three to five times larger than when predicting atmore »a 1-wk horizon. This project underscores the role that collaboration and active coordination between governmental public-health agencies, academic modeling teams, and industry partners can play in developing modern modeling capabilities to support local, state, and federal response to outbreaks.« less
  2. In Spring 2021, the highly transmissible SARS-CoV-2 Delta variant began to cause increases in cases, hospitalizations, and deaths in parts of the United States. At the time, with slowed vaccination uptake, this novel variant was expected to increase the risk of pandemic resurgence in the US in summer and fall 2021. As part of the COVID-19 Scenario Modeling Hub, an ensemble of nine mechanistic models produced 6-month scenario projections for July–December 2021 for the United States. These projections estimated substantial resurgences of COVID-19 across the US resulting from the more transmissible Delta variant, projected to occur across most of the US, coinciding with school and business reopening. The scenarios revealed that reaching higher vaccine coverage in July–December 2021 reduced the size and duration of the projected resurgence substantially, with the expected impacts was largely concentrated in a subset of states with lower vaccination coverage. Despite accurate projection of COVID-19 surges occurring and timing, the magnitude was substantially underestimated 2021 by the models compared with the of the reported cases, hospitalizations, and deaths occurring during July–December, highlighting the continued challenges to predict the evolving COVID-19 pandemic. Vaccination uptake remains critical to limiting transmission and disease, particularly in states with lower vaccinationmore »coverage. Higher vaccination goals at the onset of the surge of the new variant were estimated to avert over 1.5 million cases and 21,000 deaths, although may have had even greater impacts, considering the underestimated resurgence magnitude from the model.« less
  3. Tracking the COVID-19 pandemic has been a major challenge for policy makers. Although, several efforts are ongoing for accurate forecasting of cases, deaths, and hospitalization at various resolutions, few have been attempted for college campuses despite their potential to become COVID-19 hot-spots. In this paper, we present a real-time effort towards weekly forecasting of campus-level cases during the fall semester for four universities in Virginia, United States. We discuss the challenges related to data curation. A causal model is employed for forecasting with one free time-varying parameter, calibrated against case data. The model is then run forward in time to obtain multiple forecasts. We retrospectively evaluate the performance and, while forecast quality suffers during the campus reopening phase, the model makes reasonable forecasts as the fall semester progresses. We provide sensitivity analysis for the several model parameters. In addition, the forecasts are provided weekly to various state and local agencies.
  4. Abstract

    Travel patterns and mobility affect the spread of infectious diseases like COVID-19. However, we do not know to what extent local vs. visitor mobility affects the growth in the number of cases. This study evaluates the impact of state-level local vs. visitor mobility in understanding the growth with respect to the number of cases for COVID spread in the United States between March 1, 2020, and December 31, 2020. Two metrics, namely local and visitor transmission risk, were extracted from mobility data to capture the transmission potential of COVID-19 through mobility. A combination of the three factors: the current number of cases, local transmission risk, and the visitor transmission risk, are used to model the future number of cases using various machine learning models. The factors that contribute to better forecast performance are the ones that impact the number of cases. The statistical significance of the forecasts is also evaluated using the Diebold–Mariano test. Finally, the performance of models is compared for three waves across all 50 states. The results show that visitor mobility significantly impacts the case growth by improving the prediction accuracy by 33.78%. We also observe that the impact of visitor mobility is more pronounced duringmore »the first peak, i.e., March–June 2020.

    « less
  5. Influenza infects an estimated 9–35 million individuals each year in the United States and is a contributing cause for between 12,000 and 56,000 deaths annually. Seasonal outbreaks of influenza are common in temperate regions of the world, with highest incidence typically occurring in colder and drier months of the year. Real-time forecasts of influenza transmission can inform public health response to outbreaks. We present the results of a multiinstitution collaborative effort to standardize the collection and evaluation of forecasting models for influenza in the United States for the 2010/2011 through 2016/2017 influenza seasons. For these seven seasons, we assembled weekly real-time forecasts of seven targets of public health interest from 22 different models. We compared forecast accuracy of each model relative to a historical baseline seasonal average. Across all regions of the United States, over half of the models showed consistently better performance than the historical baseline when forecasting incidence of influenza-like illness 1 wk, 2 wk, and 3 wk ahead of available data and when forecasting the timing and magnitude of the seasonal peak. In some regions, delays in data reporting were strongly and negatively associated with forecast accuracy. More timely reporting and an improved overall accessibility to novelmore »and traditional data sources are needed to improve forecasting accuracy and its integration with real-time public health decision making.

    « less