Title: The suitability of differentiable, physics-informed machine learning hydrologic models for ungauged regions and climate change impact assessment
Abstract. As a genre of physics-informed machine learning, differentiable process-based hydrologic models (abbreviated as δ or delta models) with regionalized deep-network-based parameterization pipelines were recently shown to provide daily streamflow prediction performance closely approaching that of state-of-the-art long short-term memory (LSTM) deep networks. Meanwhile, δ models provide a full suite of diagnostic physical variables and guaranteed mass conservation. Here, we ran experiments to test (1) their ability to extrapolate to regions far from streamflow gauges and (2) their ability to make credible predictions of long-term (decadal-scale) change trends. We evaluated the models based on daily hydrograph metrics (Nash–Sutcliffe model efficiency coefficient, etc.) and predicted decadal streamflow trends. For prediction in ungauged basins (PUB; randomly sampled ungauged basins representing spatial interpolation), δ models either approached or surpassed the performance of LSTM in daily hydrograph metrics, depending on the meteorological forcing data used. They presented a comparable trend performance to LSTM for annual mean flow and high flow but worse trends for low flow. For prediction in ungauged regions (PUR; regional holdout test representing spatial extrapolation in a highly data-sparse scenario), δ models surpassed LSTM in daily hydrograph metrics, and their advantages in mean and high flow trends became prominent. In addition, an untrained variable, evapotranspiration, retained good seasonality even for extrapolated cases. The δ models' deep-network-based parameterization pipeline produced parameter fields that maintain remarkably stable spatial patterns even in highly data-scarce scenarios, which explains their robustness. Combined with their interpretability and ability to assimilate multi-source observations, the δ models are strong candidates for regional and global-scale hydrologic simulations and climate change impact assessment.
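The difference between the two holdout designs named in the abstract (PUB as random sampling of basins, PUR as withholding whole regions) can be illustrated with a minimal sketch. The function names, the fold count, and the `region_of` mapping below are hypothetical; the abstract does not state how many folds or which regional boundaries the study used.

```python
import numpy as np


def pub_folds(basin_ids, n_folds=5, seed=0):
    """Random holdout (PUB-style): test basins are scattered among training basins."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(basin_ids))
    return [fold.tolist() for fold in np.array_split(shuffled, n_folds)]


def pur_folds(basin_ids, region_of):
    """Regional holdout (PUR-style): each fold withholds one whole region.

    `region_of` is an assumed mapping from basin id to a region label.
    """
    by_region = {}
    for basin in basin_ids:
        by_region.setdefault(region_of[basin], []).append(basin)
    return [by_region[region] for region in sorted(by_region)]


def train_test_pairs(basin_ids, folds):
    """Yield (train, test) lists; train is every basin outside the held-out fold."""
    for test in folds:
        held_out = set(test)
        train = [b for b in basin_ids if b not in held_out]
        yield train, test
```

With random folds, every test basin is likely to have trained neighbors, so the model interpolates in space. With regional folds, no basin from the held-out region is seen in training, which is the extrapolation case the abstract describes as highly data-sparse.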
Award ID(s):
2221880 1832294 1940190
NSF-PAR ID:
10437869
Journal Name:
Hydrology and Earth System Sciences
Volume:
27
Issue:
12
ISSN:
1607-7938
Page Range / eLocation ID:
2357 to 2373
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Predicting discharge in contiguously data‐scarce or ungauged regions is needed for quantifying the global hydrologic cycle. We show that prediction in ungauged regions (PUR) has major, underrecognized uncertainty and is drastically more difficult than previous problems where basins can be represented by neighboring or similar basins (known as prediction in ungauged basins). While deep neural networks demonstrated stellar performance for streamflow predictions, performance nonetheless declined for PUR, benchmarked here with a new stringent region‐based holdout test on a US data set with 671 basins. We tested approaches to reduce such errors, leveraging deep networks' flexibility to integrate “soft” data, such as a satellite‐based soil moisture product or daily flow distributions, which improved low‐flow simulations. A novel input‐selection ensemble improved average performance and greatly reduced catastrophic failures. Despite challenges, deep networks showed stronger performance metrics for PUR than traditional hydrologic models. They appear competitive for geoscientific modeling even in data‐scarce settings.

     
  2. Abstract

    Predictions of hydrologic variables across the entire water cycle have significant value for water resources management as well as downstream applications such as ecosystem and water quality modeling. Recently, purely data‐driven deep learning models like long short‐term memory (LSTM) showed seemingly insurmountable performance in modeling rainfall runoff and other geoscientific variables, yet they cannot predict untrained physical variables and remain challenging to interpret. Here, we show that differentiable, learnable, process‐based models (called δ models here) can approach the performance level of LSTM for the intensively observed variable (streamflow) with regionalized parameterization. We use a simple hydrologic model HBV as the backbone and use embedded neural networks, which can only be trained in a differentiable programming framework, to parameterize, enhance, or replace the process‐based model's modules. Without using an ensemble or post‐processor, δ models can obtain a median Nash‐Sutcliffe efficiency of 0.732 for 671 basins across the USA for the Daymet forcing data set, compared to 0.748 from a state‐of‐the‐art LSTM model with the same setup. For another forcing data set, the difference is even smaller: 0.715 versus 0.722. Meanwhile, the resulting learnable process‐based models can output a full set of untrained variables, for example, soil and groundwater storage, snowpack, evapotranspiration, and baseflow, and can later be constrained by their observations. Both simulated evapotranspiration and fraction of discharge from baseflow agreed decently with alternative estimates. The general framework can work with models with various process complexity and opens up the path for learning physics from big data.

     
  3.
    Basin-centric long short-term memory (LSTM) network models have recently been shown to be an exceptionally powerful tool for stream temperature (Ts) temporal prediction (training in one period and making predictions for another period at the same sites). However, spatial extrapolation is a well-known challenge to modeling Ts and it is uncertain how an LSTM-based daily Ts model will perform in unmonitored or dammed basins. Here we compiled a new benchmark dataset consisting of >400 basins across the contiguous United States in different data availability groups (DAG, meaning the daily sampling frequency) with or without major dams and studied how to assemble suitable training datasets for predictions in basins with or without temperature monitoring. For prediction in unmonitored basins (PUB), LSTM produced an RMSE of 1.129 °C and R² of 0.983. While these metrics declined from LSTM's temporal prediction performance, they far surpassed traditional models' PUB values, and were competitive with traditional models' temporal prediction on calibrated sites. Even for unmonitored basins with major reservoirs, we obtained a median RMSE of 1.202 °C and an R² of 0.984. For temporal prediction, the most suitable training set was the matching DAG that the basin could be grouped into, e.g., the 60% DAG for a basin with 61% data availability. However, for PUB, a training dataset including all basins with data is consistently preferred. An input-selection ensemble moderately mitigated attribute overfitting. Our results indicate there are influential latent processes not sufficiently described by the inputs (e.g., geology, wetland covers), but temporal fluctuations are well predictable, and LSTM appears to be a highly accurate Ts modeling tool even for spatial extrapolation.
  4. Abstract

    Recent observations with varied schedules and types (moving average, snapshot, or regularly spaced) can help to improve streamflow forecasts, but it is challenging to integrate them effectively. Based on a long short‐term memory (LSTM) streamflow model, we tested multiple versions of a flexible procedure we call data integration (DI) to leverage recent discharge measurements to improve forecasts. DI accepts lagged inputs either directly or through a convolutional neural network unit. DI ubiquitously elevated streamflow forecast performance to unseen levels, reaching a record continental‐scale median Nash‐Sutcliffe Efficiency coefficient value of 0.86. Integrating moving‐average discharge, discharge from the last few days, or even average discharge from the previous calendar month could all improve daily forecasts. Directly using lagged observations as inputs was comparable in performance to using the convolutional neural network unit. Importantly, we obtained valuable insights regarding hydrologic processes impacting LSTM and DI performance. Before applying DI, the base LSTM model worked well in mountainous or snow‐dominated regions, but less well in regions with low discharge volumes (due to either low precipitation or high precipitation‐energy synchronicity) and large interannual storage variability. DI was most beneficial in regions with high flow autocorrelation: it greatly reduced baseflow bias in groundwater‐dominated western basins and also improved peak prediction for basins with dynamical surface water storage, such as the Prairie Potholes or Great Lakes regions. However, even DI cannot elevate performance in high‐aridity basins with 1‐day flash peaks. Despite this limitation, there is much promise for a deep‐learning‐based forecast paradigm due to its performance, automation, efficiency, and flexibility.

     
  5. Abstract

    This study examines whether deep learning models can produce reliable future projections of streamflow under warming. We train a regional long short‐term memory network (LSTM) to daily streamflow in 15 watersheds in California and develop three process models (HYMOD, SAC‐SMA, and VIC) as benchmarks. We force all models with scenarios of warming and assess their hydrologic response, including shifts in the hydrograph and total runoff ratio. All process models show a shift to more winter runoff, reduced summer runoff, and a decline in the runoff ratio due to increased evapotranspiration. The LSTM predicts similar hydrograph shifts but in some watersheds predicts an unrealistic increase in the runoff ratio. We then test two alternative versions of the LSTM in which process model outputs are used as either additional training targets (i.e., multi‐output LSTM) or input features. Results indicate that the multi‐output LSTM does not correct the unrealistic streamflow projections under warming. The hybrid LSTM using estimates of evapotranspiration from SAC‐SMA as an additional input feature produces more realistic streamflow projections, but this does not hold for VIC or HYMOD. This suggests that the hybrid method depends on the fidelity of the process model. Finally, we test climate change responses under an LSTM trained to over 500 watersheds across the United States and find more realistic streamflow projections under warming. Ultimately, this work suggests that hybrid modeling may support the use of LSTMs for hydrologic projections under climate change, but so may training LSTMs to a large, diverse set of watersheds.

     
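Several of the abstracts above summarize model skill as a median Nash–Sutcliffe efficiency (NSE) across basins. As a reading aid, here is a minimal sketch of that summary using the standard NSE definition (one minus the ratio of squared model error to the variance of the observations about their mean). The function names and the handling of missing days are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np


def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    Days where either series is NaN are dropped (an assumed convention).
    Returns NaN when no valid days remain or the observations are constant.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(sim))
    obs, sim = obs[valid], sim[valid]
    if obs.size == 0:
        return float("nan")
    spread = np.sum((obs - obs.mean()) ** 2)
    if spread == 0.0:
        return float("nan")
    return float(1.0 - np.sum((obs - sim) ** 2) / spread)


def median_nse(flows_by_basin):
    """Median NSE over basins; `flows_by_basin` maps basin id -> (observed, simulated)."""
    scores = [nash_sutcliffe_efficiency(obs, sim) for obs, sim in flows_by_basin.values()]
    return float(np.nanmedian(scores))
```

An NSE of 1 means a perfect match, and 0 means the model is no better than predicting the observed mean on every day. Reporting the median across basins keeps a few very poorly simulated basins from dominating the summary.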