
Title: A Multi‐Model Ensemble of Baseline and Process‐Based Models Improves the Predictive Skill of Near‐Term Lake Forecasts

Water temperature forecasting in lakes and reservoirs is a valuable tool to manage crucial freshwater resources in a changing and more variable climate, but previous efforts have yet to identify an optimal modeling approach. Here, we demonstrate the first multi‐model ensemble (MME) reservoir water temperature forecast, a forecasting method that combines individual model strengths in a single forecasting framework. We developed two MMEs: a three‐model process‐based MME and a five‐model MME that includes process‐based and empirical models to forecast water temperature profiles at a temperate drinking water reservoir. We found that the five‐model MME improved forecast performance by 8%–30% relative to individual models and the process‐based MME, as quantified using an aggregated probabilistic skill score. This increase in performance was due to large improvements in forecast bias in the five‐model MME, despite increases in forecast uncertainty. High correlation among the process‐based models resulted in little improvement in forecast performance in the process‐based MME relative to the individual process‐based models. The utility of MMEs is highlighted by two results: (a) no individual model performed best at every depth and horizon (days in the future), and (b) MMEs avoided poor performances by rarely producing the worst forecast for any single forecasted period (<6% of the worst ranked forecasts over time). This work presents an example of how existing models can be combined to improve water temperature forecasting in lakes and reservoirs and discusses the value of utilizing MMEs, rather than individual models, in operational forecasts.
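The trade-off reported above (lower forecast bias despite wider forecast uncertainty) can be illustrated with a minimal sketch. The abstract does not say how the individual model forecasts were combined, so the equal-weight pooling of ensemble members used here is an assumption. The model names, temperatures, and helper functions are also hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def pool_members(model_members):
    """Pool equally sized sets of ensemble members into one multi-model ensemble.

    Assumption: every model contributes the same number of members, so each
    model carries equal weight in the pooled ensemble.
    """
    sizes = {len(members) for members in model_members.values()}
    if len(sizes) != 1:
        raise ValueError("each model must contribute the same number of members")
    return [value for members in model_members.values() for value in members]


def bias(members, observed):
    """Difference between the ensemble mean and the observation."""
    return mean(members) - observed


# Hypothetical forecasts of water temperature (deg C) for one depth and horizon.
forecasts = {
    "process_model": [21.0, 21.5, 22.0],    # warm-biased in this toy example
    "empirical_model": [18.0, 18.5, 19.0],  # cold-biased in this toy example
}
observed = 20.0

mme = pool_members(forecasts)

for name, members in forecasts.items():
    print(name, "bias:", bias(members, observed), "spread:", round(pstdev(members), 2))
print("MME bias:", bias(mme, observed), "spread:", round(pstdev(mme), 2))
```

In this toy case the opposite biases of the two models cancel in the pooled ensemble, while its spread is larger than that of either model alone. Real forecasts would be scored with a probabilistic skill score over many dates, depths, and horizons rather than a single observation.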

Publisher / Repository:
DOI PREFIX: 10.1029
Journal Name:
Water Resources Research
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Freshwater ecosystems are experiencing greater variability due to human activities, necessitating new tools to anticipate future water quality. In response, we developed and deployed a real‐time iterative water temperature forecasting system (FLARE—Forecasting Lake And Reservoir Ecosystems). FLARE is composed of water temperature and meteorology sensors that wirelessly stream data, a data assimilation algorithm that uses sensor observations to update predictions from a hydrodynamic model and calibrate model parameters, and an ensemble‐based forecasting algorithm to generate forecasts that include uncertainty. Importantly, FLARE quantifies the contribution of different sources of uncertainty (driver data, initial conditions, model process, and parameters) to each daily forecast of water temperature at multiple depths. We applied FLARE to Falling Creek Reservoir (Vinton, Virginia, USA), a drinking water supply, during a 475‐day period encompassing stratified and mixed thermal conditions. Aggregated across this period, root mean square error (RMSE) of daily forecasted water temperatures was 1.13°C at the reservoir's near‐surface (1.0 m) for 7‐day ahead forecasts and 1.62°C for 16‐day ahead forecasts. The RMSE of forecasted water temperatures at the near‐sediments (8.0 m) was 0.87°C for 7‐day forecasts and 1.20°C for 16‐day forecasts. FLARE successfully predicted the onset of fall turnover 4–14 days in advance in two sequential years. Uncertainty partitioning identified meteorology driver data as the dominant source of uncertainty in forecasts for most depths and thermal conditions, except for the near‐sediments in summer, when model process uncertainty dominated. Overall, FLARE provides an open‐source system for lake and reservoir water quality forecasting to improve real‐time management.

  2. Abstract

    Near‐term ecological forecasts provide resource managers advance notice of changes in ecosystem services, such as fisheries stocks, timber yields, or water quality. Importantly, ecological forecasts can identify where there is uncertainty in the forecasting system, which is necessary to improve forecast skill and guide interpretation of forecast results. Uncertainty partitioning identifies the relative contributions to total forecast variance introduced by different sources, including specification of the model structure, errors in driver data, and estimation of current states (initial conditions). Uncertainty partitioning could be particularly useful in improving forecasts of highly variable cyanobacterial densities, which are difficult to predict and present a persistent challenge for lake managers. As cyanobacteria can produce toxic and unsightly surface scums, advance warning when cyanobacterial densities are increasing could help managers mitigate water quality issues. Here, we fit 13 Bayesian state‐space models to evaluate different hypotheses about cyanobacterial densities in a low nutrient lake that experiences sporadic surface scums of the toxin‐producing cyanobacterium, Gloeotrichia echinulata. We used data from several summers of weekly cyanobacteria samples to identify dominant sources of uncertainty for near‐term (1‐ to 4‐week) forecasts of G. echinulata densities. Water temperature was an important predictor of cyanobacterial densities during model fitting and at the 4‐week forecast horizon. However, no physical covariates improved model performance over a simple model including the previous week's densities in 1‐week‐ahead forecasts. Even the best fit models exhibited large variance in forecasted cyanobacterial densities and did not capture rare peak occurrences, indicating that significant explanatory variables when fitting models to historical data are not always effective for forecasting. Uncertainty partitioning revealed that model process specification and initial conditions dominated forecast uncertainty. These findings indicate that long‐term studies of different cyanobacterial life stages and movement in the water column as well as measurements of drivers relevant to different life stages could improve model process representation of cyanobacteria abundance. In addition, improved observation protocols could better define initial conditions and reduce spatial misalignment of environmental data and cyanobacteria observations. Our results emphasize the importance of ecological forecasting principles and uncertainty partitioning to refine and understand predictive capacity across ecosystems.

  3. Abstract

    Ecosystems around the globe are experiencing changes in both the magnitude and fluctuations of environmental conditions due to land use and climate change. In response, ecologists are increasingly using near‐term, iterative ecological forecasts to predict how ecosystems will change in the future. To date, many near‐term, iterative forecasting systems have been developed using high temporal frequency (minute to hourly resolution) data streams for assimilation. However, this approach may be cost‐prohibitive or impossible for forecasting ecological variables that lack high‐frequency sensors or have high data latency (i.e., a delay before data are available for modeling after collection). To explore the effects of data assimilation frequency on forecast skill, we developed water temperature forecasts for a eutrophic drinking water reservoir and conducted data assimilation experiments by selectively withholding observations to examine the effect of data availability on forecast accuracy. We used in situ sensors, manually collected data, and a calibrated water quality ecosystem model driven by forecasted weather data to generate future water temperature forecasts using Forecasting Lake and Reservoir Ecosystems (FLARE), an open source water quality forecasting system. We tested the effect of daily, weekly, fortnightly, and monthly data assimilation on the skill of 1‐ to 35‐day‐ahead water temperature forecasts. We found that forecast skill varied depending on the season, forecast horizon, depth, and data assimilation frequency, but overall forecast performance was high, with a mean 1‐day‐ahead forecast root mean square error (RMSE) of 0.81°C, mean 7‐day RMSE of 1.15°C, and mean 35‐day RMSE of 1.94°C. Aggregated across the year, daily data assimilation yielded the most skillful forecasts at 1‐ to 7‐day‐ahead horizons, but weekly data assimilation resulted in the most skillful forecasts at 8‐ to 35‐day‐ahead horizons. Within a year, forecasts with weekly data assimilation consistently outperformed forecasts with daily data assimilation after the 8‐day forecast horizon during mixed spring/autumn periods and 5‐ to 14‐day‐ahead horizons during the summer‐stratified period, depending on depth. Our results suggest that lower frequency data (i.e., weekly) may be adequate for developing accurate forecasts in some applications, further enabling the development of forecasts broadly across ecosystems and ecological variables without high‐frequency sensor data.

  4. Background:

    Short-term forecasts of infectious disease burden can contribute to situational awareness and aid capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise the predictive performance of such forecasts if multiple models are combined into an ensemble. Here, we report on the performance of ensembles in predicting COVID-19 cases and deaths across Europe between 08 March 2021 and 07 March 2022.


    We used open-source tools to develop a public European COVID-19 Forecast Hub. We invited groups globally to contribute weekly forecasts for COVID-19 cases and deaths reported by a standardised source for 32 countries over the next 1–4 weeks. Teams submitted forecasts from March 2021 using standardised quantiles of the predictive distribution. Each week we created an ensemble forecast, where each predictive quantile was calculated as the equally-weighted average (initially the mean and then from 26th July the median) of all individual models’ predictive quantiles. We measured the performance of each model using the relative Weighted Interval Score (WIS), comparing models’ forecast accuracy relative to all other models. We retrospectively explored alternative methods for ensemble forecasts, including weighted averages based on models’ past predictive performance.
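The quantile-wise combination described above can be sketched in a few lines. This is an illustrative sketch only and not the Forecast Hub's code: the function name, the model names, and the forecast values are hypothetical, and it assumes each model reports the same set of quantile levels for a given target.

```python
from statistics import mean, median


def quantile_ensemble(model_quantiles, method="median"):
    """Combine per-model predictive quantiles level by level with equal weights.

    model_quantiles maps a model name to a dict of {quantile_level: value}.
    Only quantile levels reported by every model are combined.
    """
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    combine = mean if method == "mean" else median
    shared_levels = set.intersection(*(set(q) for q in model_quantiles.values()))
    return {
        level: combine([q[level] for q in model_quantiles.values()])
        for level in sorted(shared_levels)
    }


# Hypothetical one-week-ahead case forecasts from three models.
forecasts = {
    "model_a": {0.25: 90, 0.5: 100, 0.75: 110},
    "model_b": {0.25: 95, 0.5: 120, 0.75: 150},
    "model_c": {0.25: 80, 0.5: 105, 0.75: 400},  # one extreme upper quantile
}

print(quantile_ensemble(forecasts, method="median"))
print(quantile_ensemble(forecasts, method="mean"))
```

In this toy input the single extreme upper quantile from one model pulls the mean ensemble's 0.75 quantile to 220, while the median ensemble's stays at 150. That robustness to outlying component forecasts is one plausible reason a median combination can be preferred.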


    Over 52 weeks, we collected forecasts from 48 unique models. We evaluated 29 models’ forecast scores in comparison to the ensemble model. We found a weekly ensemble had a consistently strong performance across countries over time. Across all horizons and locations, the ensemble performed better on relative WIS than 83% of participating models’ forecasts of incident cases (with a total N=886 predictions from 23 unique models), and 91% of participating models’ forecasts of deaths (N=763 predictions from 20 models). Across a 1–4 week time horizon, ensemble performance declined with longer forecast periods when forecasting cases, but remained stable over 4 weeks for incident death forecasts. In every forecast across 32 countries, the ensemble outperformed most contributing models when forecasting either cases or deaths, frequently outperforming all of its individual component models. Among several choices of ensemble methods we found that the most influential and best choice was to use a median average of models instead of using the mean, regardless of methods of weighting component forecast models.


    Our results support combining forecasts from individual models into an ensemble to improve predictive performance across epidemiological targets and populations during infectious disease epidemics. Our findings further suggest that median ensemble methods yield better predictive performance than ones based on means. Our findings also highlight that forecast consumers should place more weight on incident death forecasts than incident case forecasts at forecast horizons greater than 2 weeks.


    AA, BH, BL, LWa, MMa, PP, SV funded by National Institutes of Health (NIH) Grant 1R01GM109718, NSF BIG DATA Grant IIS-1633028, NSF Grant No.: OAC-1916805, NSF Expeditions in Computing Grant CCF-1918656, CCF-1917819, NSF RAPID CNS-2028004, NSF RAPID OAC-2027541, US Centers for Disease Control and Prevention 75D30119C05935, a grant from Google, University of Virginia Strategic Investment Fund award number SIF160, Defense Threat Reduction Agency (DTRA) under Contract No. HDTRA1-19-D-0007, and respectively Virginia Dept of Health Grant VDH-21-501-0141, VDH-21-501-0143, VDH-21-501-0147, VDH-21-501-0145, VDH-21-501-0146, VDH-21-501-0142, VDH-21-501-0148. AF, AMa, GL funded by SMIGE - Modelli statistici inferenziali per governare l'epidemia, FISR 2020-Covid-19 I Fase, FISR2020IP-00156, Codice Progetto: PRJ-0695. AM, BK, FD, FR, JK, JN, JZ, KN, MG, MR, MS, RB funded by Ministry of Science and Higher Education of Poland with grant 28/WFSN/2021 to the University of Warsaw. BRe, CPe, JLAz funded by Ministerio de Sanidad/ISCIII. BT, PG funded by PERISCOPE European H2020 project, contract number 101016233. CP, DL, EA, MC, SA funded by European Commission - Directorate-General for Communications Networks, Content and Technology through the contract LC-01485746, and Ministerio de Ciencia, Innovacion y Universidades and FEDER, with the project PGC2018-095456-B-I00. DE., MGu funded by Spanish Ministry of Health / REACT-UE (FEDER). DO, GF, IMi, LC funded by Laboratory Directed Research and Development program of Los Alamos National Laboratory (LANL) under project number 20200700ER. DS, ELR, GG, NGR, NW, YW funded by National Institutes of General Medical Sciences (R35GM119582; the content is solely the responsibility of the authors and does not necessarily represent the official views of NIGMS or the National Institutes of Health). FB, FP funded by InPresa, Lombardy Region, Italy. HG, KS funded by European Centre for Disease Prevention and Control. 
IV funded by Agencia de Qualitat i Avaluacio Sanitaries de Catalunya (AQuAS) through contract 2021-021OE. JDe, SMo, VP funded by Netzwerk Universitatsmedizin (NUM) project egePan (01KX2021). JPB, SH, TH funded by Federal Ministry of Education and Research (BMBF; grant 05M18SIA). KH, MSc, YKh funded by Project SaxoCOV, funded by the German Free State of Saxony. Presentation of data, model results and simulations also funded by the NFDI4Health Task Force COVID-19 ( within the framework of a DFG-project (LO-342/17-1). LP, VE funded by Mathematical and Statistical modelling project (MUNI/A/1615/2020), Online platform for real-time monitoring, analysis and management of epidemic situations (MUNI/11/02202001/2020); VE also supported by RECETOX research infrastructure (Ministry of Education, Youth and Sports of the Czech Republic: LM2018121), the CETOCOEN EXCELLENCE (CZ.02.1.01/0.0/0.0/17-043/0009632), RECETOX RI project (CZ.02.1.01/0.0/0.0/16-013/0001761). NIB funded by Health Protection Research Unit (grant code NIHR200908). SAb, SF funded by Wellcome Trust (210758/Z/18/Z).

  5. Abstract

    The Bureau of Reclamation (Reclamation) plays a central management role in the Colorado River Basin (CRB), with an increasing focus on meeting the needs of stakeholders during the current drought. One aspect of this role involves generating five‐year projections of reservoir operating conditions in the federal multi‐reservoir system. These projections are the basis for estimating the probability of shortage conditions, which are relied on by stakeholders, and are particularly important during drought. Currently, Ensemble Streamflow Prediction (ESP) forecasts drive Reclamation's Colorado River Mid‐term Modeling System to produce probabilistic reservoir projections to be used in risk‐based analysis and decision support for the first two years of the outlook period. The lack of significant forecast skill beyond the first year motivates interest in alternative forecasting approaches. The CRB Operational Prediction Testbed was created to provide a quantitative and consistent framework for assessing the skill of streamflow forecasts and their impact on associated reservoir system projections. Reservoir system projections are evaluated by analyzing Lakes Powell and Mead operations, including projected pool elevation and operating tiers. In an initial application of this testbed, ESP forecasts were compared to experimental streamflow forecasts to assess their skill impact on two‐year reservoir projections, which are critical information for managing drought.
