

Title: Development of a Deep Learning Emulator for a Distributed Groundwater–Surface Water Model: ParFlow-ML
Integrated hydrologic models solve coupled mathematical equations that represent natural processes, including groundwater, unsaturated, and overland flow. However, these models are computationally expensive. It has recently been shown that machine learning (ML), and deep learning (DL) in particular, can be used to emulate complex physical processes in the earth system. In this study, we demonstrate how a DL model can emulate transient, three-dimensional integrated hydrologic model simulations at a fraction of the computational expense. This emulator is based on PredRNN, a DL model previously used for modeling video dynamics. The emulator is trained on the physical parameters used as inputs to the original model, such as hydraulic conductivity and topography, and produces spatially distributed outputs (e.g., pressure head) from which quantities such as streamflow and water table depth can be calculated. Simulation results from the emulator and ParFlow agree well, with average relative biases of 0.070, 0.092, and 0.032 for streamflow, water table depth, and total water storage, respectively. Moreover, the emulator is up to 42 times faster than ParFlow. Given this promising proof of concept, our results open the door to future applications of full hydrologic model emulation, particularly at larger scales.
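The abstract reports agreement between the emulator and ParFlow as an average relative bias but does not give the formula. As a rough illustration only, the sketch below uses one common definition, the absolute summed difference divided by the summed reference values. The function name `relative_bias` and the sample values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def relative_bias(emulated: Sequence[float], reference: Sequence[float]) -> float:
    """Absolute relative bias: |sum(emulated - reference)| / |sum(reference)|.

    Assumption: this is one common definition; the study may use another.
    """
    if len(emulated) != len(reference):
        raise ValueError("emulated and reference must have the same length")
    total_reference = sum(reference)
    if total_reference == 0:
        raise ValueError("reference values sum to zero; relative bias is undefined")
    total_difference = sum(e - r for e, r in zip(emulated, reference))
    return abs(total_difference) / abs(total_reference)


# Made-up streamflow series for illustration (not data from the paper).
reference_flow = [10.0, 12.0, 15.0, 11.0]  # stands in for the physics-based model output
emulated_flow = [10.5, 12.5, 16.0, 12.0]   # stands in for the emulator output

print(f"{relative_bias(emulated_flow, reference_flow):.4f}")  # prints 0.0625
```

With this definition, a value of 0 means the emulated totals match the reference totals exactly, so the reported values of 0.032 to 0.092 would correspond to totals that differ by roughly 3 to 9 percent.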
Award ID(s):
2040542 2019625
NSF-PAR ID:
10336010
Journal Name:
Water
Volume:
13
Issue:
23
ISSN:
2073-4441
Page Range / eLocation ID:
3393
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. While machine learning approaches are rapidly being applied to hydrologic problems, physics-informed approaches are still relatively rare. Many successful deep-learning applications have focused on point estimates of streamflow trained on stream gauge observations over time. While these approaches show promise for some applications, there is a need for distributed approaches that can produce accurate two-dimensional results of model states, such as ponded water depth. Here, we demonstrate a 2D emulator of the Tilted V catchment benchmark problem with solutions provided by the integrated hydrology model ParFlow. This emulator model can use 2D Convolutional Neural Network (CNN), 3D CNN, and U-Net machine learning architectures and produces time-dependent spatial maps of ponded water depth from which hydrographs and other hydrologic quantities of interest may be derived. A comparison of different deep learning architectures and hyperparameters is presented, with particular focus on approaches such as 3D CNN (which have a time-dependent learning component) and 2D CNN and U-Net approaches (which use only the current model state to predict the next state in time). In addition to testing model performance, we also use a simplified simulation-based inference approach to evaluate the ability to calibrate the emulator to randomly selected simulations and the match between ML-calibrated input parameters and the underlying physics-based simulation.
  2. Abstract

    This article presents a hydrological reconstruction of the Upper Colorado River Basin with an hourly temporal resolution and a 1-km spatial resolution from October 1982 to September 2019. The validated dataset includes a suite of hydrologic variables including streamflow, water table depth, snow water equivalent (SWE) and evapotranspiration (ET) simulated by an integrated hydrological model, ParFlow-CLM. The dataset was validated over the period with a combination of point observations and remotely sensed products. These datasets provide a long-term, natural-flow simulation for one of the most over-allocated basins in the world.

     
  3. Abstract

    Integrated hydrological modeling is an effective method for understanding interactions between parts of the hydrologic cycle, quantifying water resources, and furthering knowledge of hydrologic processes. However, these models are dependent on robust and accurate datasets that physically represent spatial characteristics as model inputs. This study evaluates multiple data‐driven approaches for estimating hydraulic conductivity and subsurface properties at the continental scale, constructed from existing subsurface dataset components. Each subsurface configuration represents upper (unconfined) hydrogeology, lower (confined) hydrogeology, and the presence of a vertical flow barrier. Configurations are tested in two large‐scale U.S. watersheds using an integrated model. Model results are compared to observed streamflow and steady state water table depth (WTD). We provide model results for a range of configurations and show that both WTD and surface water partitioning are important indicators of performance. We also show that geology data source, total subsurface depth, anisotropy, and inclusion of a vertical flow barrier are the most important considerations for subsurface configurations. While a range of configurations proved viable, we provide a recommended Selected National Configuration 1 km resolution subsurface dataset for use in distributed large‐ and continental‐scale hydrologic modeling.

     
  4. Abstract

    Several studies have focused on the importance of river bathymetry (channel geometry) in hydrodynamic routing along individual reaches. However, its effect on other watershed processes such as infiltration and surface water (SW)‐groundwater (GW) interactions has not been explored across large river networks. Surface and subsurface processes are interdependent; therefore, errors due to inaccurate representation of one watershed process can cascade across other hydraulic or hydrologic processes. This study hypothesizes that accurate bathymetric representation is not only essential for simulating channel hydrodynamics but also affects subsurface processes by impacting SW‐GW interactions. Moreover, quantifying the effect of bathymetry on surface and subsurface hydrological processes across a river network can facilitate an improved understanding of how bathymetric characteristics affect these processes across large spatial domains. The study tests this hypothesis by developing physically based distributed models capable of bidirectional coupling (SW‐GW) with four configurations with progressively reduced levels of bathymetric representation. A comparison of hydrologic and hydrodynamic outputs shows that changes in channel geometry across the four configurations have a considerable effect on infiltration, lateral seepage, and location of the water table across the entire river network. For example, when using bathymetry with inaccurate channel conveyance capacity but accurate channel depth, peak lateral seepage rate exhibited 58% error. The results from this study provide insights into the level of bathymetric detail required for accurately simulating flooding‐related physical processes while also highlighting potential issues with ignoring bathymetry across lower order streams such as spurious backwater flow, inaccurate water table elevations, and incorrect inundation extents.

     
  5. Thenkabail, Prasad S. (Ed.)

    Physically based hydrologic models require significant effort and extensive information for development, calibration, and validation. The study explored the use of random forest regression (RFR), a supervised machine learning (ML) model, as an alternative to the physically based Soil and Water Assessment Tool (SWAT) for predicting streamflow in the Rio Grande Headwaters near Del Norte, a snowmelt-dominated mountainous watershed of the Upper Rio Grande Basin. Remotely sensed data were used for the random forest machine learning analysis (RFML) and RStudio for data processing and synthesizing. The RFML model outperformed the SWAT model in accuracy and demonstrated its capability in predicting streamflow in this region. We implemented a customized approach to the RFR model to assess the model’s performance for three training periods, across 1991–2010, 1996–2010, and 2001–2010; the results indicated that the model’s accuracy improved with longer training periods, implying that the model trained on a more extended period is better able to capture the parameters’ variability and reproduce streamflow data more accurately. The variable importance (i.e., IncNodePurity) measure of the RFML model revealed that the snow depth and the minimum temperature were consistently the top two predictors across all training periods. The paper also evaluated how well the SWAT model performs in reproducing streamflow data of the watershed with a conventional approach. The SWAT model needed more time and data to set up and calibrate, delivering acceptable performance in annual mean streamflow simulation, with satisfactory index of agreement (d), coefficient of determination (R²), and percent bias (PBIAS) values, but monthly simulation warrants further exploration and model adjustments.
The study recommends exploring snowmelt runoff hydrologic processes, dust-driven sublimation effects, and more detailed topographic input parameters to update the SWAT snowmelt routine for better monthly flow estimation. The results provide a critical analysis for enhancing streamflow prediction, which is valuable for further research and water resource management, including snowmelt-driven semi-arid regions.

     