Title: Enhancing streamflow forecast and extracting insights using long‐short term memory networks with data integration at continental scales
Recent observations with varied schedules and types (moving average, snapshot, or regularly spaced) can help to improve streamflow forecasts, but it is challenging to integrate them effectively. Based on a long short‐term memory (LSTM) streamflow model, we tested multiple versions of a flexible procedure we call data integration (DI) to leverage recent discharge measurements to improve forecasts. DI accepts lagged inputs either directly or through a convolutional neural network (CNN) unit. DI ubiquitously elevated streamflow forecast performance to unseen levels, reaching a record continental‐scale median Nash‐Sutcliffe Efficiency coefficient value of 0.86. Integrating moving‐average discharge, discharge from the last few days, or even average discharge from the previous calendar month could all improve daily forecasts. Directly using lagged observations as inputs was comparable in performance to using the CNN unit. Importantly, we obtained valuable insights regarding hydrologic processes impacting LSTM and DI performance. Before applying DI, the base LSTM model worked well in mountainous or snow‐dominated regions, but less well in regions with low discharge volumes (due to either low precipitation or high precipitation‐energy synchronicity) and large inter‐annual storage variability. DI was most beneficial in regions with high flow autocorrelation: it greatly reduced baseflow bias in groundwater‐dominated western basins and also improved peak prediction for basins with dynamical surface water storage, such as the Prairie Potholes or Great Lakes regions. However, even DI cannot elevate high‐aridity basins with one‐day flash peaks. Despite this limitation, there is much promise for a deep‐learning‐based forecast paradigm due to its performance, automation, efficiency, and flexibility.
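As a rough illustration of two ideas in the abstract, the sketch below builds lagged and moving-average discharge inputs of the kind DI could accept directly, and computes the Nash-Sutcliffe Efficiency used as the headline metric. It is a minimal sketch, not the authors' implementation: the function names (`nse`, `lagged_features`, `lagged_moving_average`), the choice of lags, and the window length are all assumptions made for illustration, and the step that feeds these features to an LSTM or CNN unit is not shown.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over variance of obs."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def lagged_features(q, lags=(1, 2, 3)):
    """Hypothetical helper: column j holds discharge observed lags[j] days earlier.

    Rows without enough history are filled with NaN. Lags must be >= 1.
    """
    q = np.asarray(q, dtype=float)
    out = np.full((q.size, len(lags)), np.nan)
    for j, lag in enumerate(lags):
        out[lag:, j] = q[:-lag]
    return out


def lagged_moving_average(q, window, lag=1):
    """Hypothetical helper: mean discharge over `window` days ending `lag` days ago."""
    q = np.asarray(q, dtype=float)
    out = np.full(q.size, np.nan)
    for t in range(window + lag - 1, q.size):
        end = t - lag + 1  # exclusive end index of the averaging window
        out[t] = q[end - window:end].mean()
    return out


# Toy daily discharge series (made-up numbers, for illustration only).
q = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
snapshot_inputs = lagged_features(q, lags=(1, 2))
smoothed_input = lagged_moving_average(q, window=2, lag=1)
```

In a setup like this, the lagged columns would be concatenated with the meteorological forcings at each time step before being passed to the forecast model. An NSE of 1 indicates a perfect match, and 0 means the simulation is no better than predicting the observed mean.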
Award ID(s):
1832294
PAR ID:
10166420
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Water Resources Research
ISSN:
0043-1397
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The Ensemble Streamflow Prediction (ESP) framework combines a probabilistic forecast structure with process‐based models for water supply predictions. However, process‐based models require computationally intensive parameter estimation, increasing uncertainties and limiting usability. Motivated by the strong performance of deep learning models, we seek to assess whether the Long Short‐Term Memory (LSTM) model can provide skillful forecasts and replace process‐based models within the ESP framework. Given challenges in implicitly capturing snowpack dynamics within LSTMs for streamflow prediction, we also evaluated the added skill of explicitly incorporating snowpack information to improve hydrologic memory representation. LSTM‐ESPs were evaluated under four different scenarios: one excluding snow and three including snow with varied snowpack representations. The LSTM models were trained using information from 664 GAGES‐II basins during WY1983–2000. During a testing period, WY2001–2010, 80% of basins exhibited Nash‐Sutcliffe Efficiency (NSE) above 0.5 with a median NSE of around 0.70, indicating satisfactory utility in simulating seasonal water supply. LSTM‐ESP forecasts were then tested during WY2011–2020 over 76 western US basins with operational Natural Resources Conservation Service (NRCS) forecasts. A key finding is that in high snow regions, LSTM‐ESP forecasts using simplified ablation assumptions performed worse than those excluding snow, highlighting that snow data do not consistently improve LSTM‐ESP performance. However, LSTM‐ESP forecasts that explicitly incorporated past years' snow accumulation and ablation performed comparably to NRCS forecasts and better than forecasts excluding snow entirely. Overall, integrating deep learning within an ESP framework shows promise and highlights important considerations for including snowpack information in forecasting.
  2. Abstract Snowpack provides the majority of predictive information for water supply forecasts (WSFs) in snow-dominated basins across the western United States. Drought conditions typically accompany decreased snowpack and lowered runoff efficiency, negatively impacting WSFs. Here, we investigate the relationship between snow water equivalent (SWE) and April–July streamflow volume (AMJJ-V) during drought in small headwater catchments, using observations from 31 USGS streamflow gauges and 54 SNOTEL stations. A linear regression approach is used to evaluate forecast skill under different historical climatologies used for model fitting, as well as with different forecast dates. Experiments are constructed in which extreme hydrological drought years are withheld from model training, that is, years with AMJJ-V below the 15th percentile. Subsets of the remaining years are used for model fitting to understand how the climatology of different training subsets impacts forecasts of extreme drought years. We generally report overprediction in drought years. However, training the forecast model on drier years, that is, below-median years (P15, P57.5], reduces residuals by an average of 10% in drought year forecasts, relative to a baseline case, with the highest median skill obtained in mid- to late April for colder regions. We report similar findings using a modified Natural Resources Conservation Service (NRCS) procedure in nine large Upper Colorado River basin (UCRB) basins, highlighting the importance of the snowpack–streamflow relationship in streamflow predictability. We propose an “adaptive sampling” approach of dynamically selecting training years based on antecedent SWE conditions, showing error reductions of up to 20% in historical drought years relative to the period of record. These alternate training protocols provide opportunities for addressing the challenges of future drought risk to water supply planning.
Significance Statement Seasonal water supply forecasts based on the relationship between peak snowpack and water supply exhibit unique errors in drought years due to low snow and streamflow variability, presenting a major challenge for water supply prediction. Here, we assess the reliability of snow-based streamflow predictability in drought years using a fixed forecast date or fixed model training period. We critically evaluate different training protocols that evaluate predictive performance and identify sources of error during historical drought years. We also propose and test an “adaptive sampling” application that dynamically selects training years based on antecedent SWE conditions to overcome persistent errors and provide new insights and strategies for snow-guided forecasts.
  3. Abstract Streamflow forecasting at a subseasonal time scale (10–30 days into the future) is important for various human activities. Ensemble streamflow prediction (ESP) is a widely applied technique for subseasonal streamflow forecasting. However, ESP’s reliance on randomly resampled historical precipitation limits its predictive capability. Available dynamical subseasonal precipitation forecasts provide an alternative to the randomly resampled precipitation in ESP. Prior studies found that the predictive performance of raw subseasonal precipitation forecasts is limited in many regions, such as the central south of the United States, which raises questions about their effectiveness in assisting streamflow forecasting. To further assess the hydrologic applicability of dynamical subseasonal precipitation forecasts, we test the subseasonal precipitation forecast from the North American Multi-Model Ensemble Phase II (NMME-2) at four watersheds in the central south region of the United States. The subseasonal precipitation forecasts are postprocessed with bias correction and spatial disaggregation (BCSD) to correct bias and improve spatial resolution before replacing the randomly resampled precipitation in ESP for streamflow predictions. The performance of the resulting streamflow predictions is benchmarked against ESP. Evaluation is conducted using Kling–Gupta Efficiency (KGE), continuous ranked probability score (CRPS), probability of detection (POD), false alarm ratio (FAR), as well as reliability diagrams. Our results suggest that BCSD-corrected subseasonal precipitation forecasts lead to overall improved streamflow predictions due to added skill in winter and spring. Our results also suggest that BCSD-corrected subseasonal precipitation forecasts lead to improved predictions of the occurrence of high streamflow values above the 75th percentile.
Overall, BCSD-corrected subseasonal precipitation has shown promising performance, highlighting its potential broader applications for river and flood forecasting.
  4. Several studies have demonstrated the ability of long short-term memory (LSTM) machine-learning-based modeling to outperform traditional spatially lumped process-based modeling approaches for streamflow prediction. However, due mainly to the structural complexity of the LSTM network (which includes gating operations and sequential processing of the data), difficulties can arise when interpreting the internal processes and weights in the model. Here, we propose and test a modification of LSTM architecture that is calibrated in a manner that is analogous to a hydrological system. Our architecture, called “HydroLSTM”, simulates the sequential updating of the Markovian storage while the gating operation has access to historical information. Specifically, we modify how data are fed to the new representation to facilitate simultaneous access to past lagged inputs and consolidated information, which explicitly acknowledges the importance of trends and patterns in the data. We compare the performance of the HydroLSTM and LSTM architectures using data from 10 hydro-climatically varied catchments. We further examine how the new architecture exploits the information in lagged inputs, for 588 catchments across the USA. The HydroLSTM-based models require fewer cell states to obtain similar performance to their LSTM-based counterparts. Further, the weight patterns associated with lagged input variables are interpretable and consistent with regional hydroclimatic characteristics (snowmelt-dominated, recent rainfall-dominated, and historical rainfall-dominated). These findings illustrate how the hydrological interpretability of LSTM-based models can be enhanced by appropriate architectural modifications that are physically and conceptually consistent with our understanding of the system. 
  5. Abstract. As a genre of physics-informed machine learning, differentiable process-based hydrologic models (abbreviated as δ or delta models) with regionalized deep-network-based parameterization pipelines were recently shown to provide daily streamflow prediction performance closely approaching that of state-of-the-art long short-term memory (LSTM) deep networks. Meanwhile, δ models provide a full suite of diagnostic physical variables and guaranteed mass conservation. Here, we ran experiments to test (1) their ability to extrapolate to regions far from streamflow gauges and (2) their ability to make credible predictions of long-term (decadal-scale) change trends. We evaluated the models based on daily hydrograph metrics (Nash–Sutcliffe model efficiency coefficient, etc.) and predicted decadal streamflow trends. For prediction in ungauged basins (PUB; randomly sampled ungauged basins representing spatial interpolation), δ models either approached or surpassed the performance of LSTM in daily hydrograph metrics, depending on the meteorological forcing data used. They presented a comparable trend performance to LSTM for annual mean flow and high flow but worse trends for low flow. For prediction in ungauged regions (PUR; regional holdout test representing spatial extrapolation in a highly data-sparse scenario), δ models surpassed LSTM in daily hydrograph metrics, and their advantages in mean and high flow trends became prominent. In addition, an untrained variable, evapotranspiration, retained good seasonality even for extrapolated cases. The δ models' deep-network-based parameterization pipeline produced parameter fields that maintain remarkably stable spatial patterns even in highly data-scarce scenarios, which explains their robustness. Combined with their interpretability and ability to assimilate multi-source observations, the δ models are strong candidates for regional and global-scale hydrologic simulations and climate change impact assessment. 