

Title: HydroBench: Jupyter supported reproducible hydrological model benchmarking and diagnostic tool

Evaluating whether hydrological models are right for the right reasons demands reproducible model benchmarking and diagnostics that evaluate not just statistical predictive model performance but also internal processes. Such model benchmarking and diagnostic efforts will benefit from standardized methods and ready-to-use toolkits. Using the Jupyter platform, this work presents HydroBench, a model-agnostic benchmarking tool consisting of three sets of metrics: 1) common statistical predictive measures, 2) hydrological signature-based process metrics, including a new time-linked flow duration curve, and 3) information-theoretic diagnostics that measure the flow of information among model variables. As a test case, HydroBench was applied to compare two model products (calibrated and uncalibrated) of the National Hydrologic Model - Precipitation Runoff Modeling System (NHM-PRMS) at the Cedar River watershed, WA, United States. Although the uncalibrated model has the higher predictive performance, particularly for high flows, the signature-based diagnostics showed that the model overestimates low flows and poorly represents the recession processes. Elucidating why low flows may have been overestimated, the information-theoretic diagnostics indicated a higher flow of information from precipitation to snowmelt to streamflow in the uncalibrated model compared to the calibrated model, where information flowed more directly from precipitation to streamflow. This test case demonstrated the capability of HydroBench in process diagnostics and model predictive and functional performance evaluations, along with their tradeoffs. Having such a model benchmarking tool not only provides modelers with a comprehensive model evaluation system but also provides an open-source tool that can be further developed by the hydrological community.
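
The first two metric families named in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that the "common statistical predictive measures" include the Nash-Sutcliffe efficiency (NSE) and the Kling-Gupta efficiency (KGE), and it shows an ordinary flow duration curve rather than the new time-linked variant. The function names `nse`, `kge`, and `flow_duration_curve` are hypothetical and are not HydroBench's API.

```python
import math
from statistics import mean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    obs_mean = mean(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_variance_sum = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance_sum


def pearson_r(x, y):
    """Linear correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def kge(obs, sim):
    """Kling-Gupta efficiency built from correlation, variability ratio, and bias ratio."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def flow_duration_curve(flows):
    """Return (exceedance probability, flow) pairs, highest flow first.

    Uses the Weibull plotting position rank / (n + 1); this choice is an assumption.
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return [(rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]
```

Comparing `flow_duration_curve(observed)` with `flow_duration_curve(simulated)` at high exceedance probabilities is one way to see the kind of low-flow overestimation the abstract describes, even when NSE or KGE looks favorable.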

 
Award ID(s):
1928406
PAR ID:
10488044
Publisher / Repository:
Frontiers Media S.A.
Journal Name:
Frontiers in Earth Science
Volume:
10
ISSN:
2296-6463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    How precipitation (P) is translated into streamflow (Q) and over what timescales (i.e., “memory”) is difficult to predict without calibration of site‐specific models or using geochemical approaches, posing barriers to prediction in ungauged basins or advancement of general theories. Here, we used a data‐driven approach to identify regional patterns and exogenous controls on P–Q interactions. We applied an information flow analysis, which quantifies uncertainty reduction, to a daily time series of P and Q from 671 watersheds across the conterminous United States. We first demonstrated that information transfer from P to Q primarily reflects the quickflow component of water‐budgets, based on a watershed model. Readily quantifiable information flows show a functional relationship with model parameters, suggesting utility for model calibration. Second, applied to real watersheds, P–Q information flows exhibit seasonally varying behavior within regions in a manner consistent with dominant runoff generation mechanisms. However, the timing and the magnitude of information flows also reflect considerable subregional heterogeneity, likely attributable to differences in watershed size, baseflow contributions, and variation in areal coverage of preferential flow paths. A regression analysis showed that a combination of climate and watershed characteristics is predictive of P–Q information flows. Though information flows cannot, in most cases, uniquely determine dominant runoff mechanisms, they provide a means to quantify the heterogeneous outcomes of those mechanisms within regions, thereby serving as a benchmarking tool for models developed at the regional scale. Last, information flows characterize regionally specific ways in which catchment connectivity changes from the wet to dry season.

     
  2. Abstract

    Subsurface tile drainage (TD) is a dominant agriculture water management practice in the United States (US) to enhance crop production in poorly drained soils. Assessments of field‐level or watershed‐level (<50 km²) hydrologic impacts of TD are becoming common; however, a major gap exists in our understanding of regional (>10⁵ km²) impacts of TD on hydrology. The National Water Model (NWM) is a distributed 1‐km resolution hydrological model designed to provide accurate streamflow forecasts at 2.7 million reaches across the US. The current NWM lacks TD representation which adds considerable uncertainty to streamflow forecasts in heavily tile‐drained areas. In this study, we quantify the performance of the NWM with a newly incorporated tile‐drainage scheme over the heavily tile‐drained Midwestern US. Employing a TD scheme enhanced the uncalibrated NWM performance by about 20–50% of the fully calibrated NWM (Calib). The calibrated NWM with tile drainage (CalibTD) showed enhanced accuracy with higher event hit rates and lower false alarm rates than Calib. CalibTD showed better performance in high‐flow estimations as TD increased streamflow peaks (14%), volume (2.3%), and baseflow (11%). Regional water balance analysis indicated that TD significantly reduced surface runoff (−7% to −29%), groundwater recharge (−43% to −50%), evapotranspiration (−7% to −13%), and soil moisture content (−2% to −3%). However, TD significantly increased soil profile lateral flow (27.7%) along with infiltration and soil water storage potential. Overall, our findings highlight the importance of incorporating the TD process into the operational configuration of the NWM.

     
  3. Summary

    Linear pooling is by far the most popular method for combining probability forecasts. However, any non-trivial weighted average of two or more distinct, calibrated probability forecasts is necessarily uncalibrated and lacks sharpness. In view of this, linear pooling requires recalibration, even in the ideal case in which the individual forecasts are calibrated. Towards this end, we propose a beta-transformed linear opinion pool for the aggregation of probability forecasts from distinct, calibrated or uncalibrated sources. The method fits an optimal non-linearly recalibrated forecast combination, by compositing a beta transform and the traditional linear opinion pool. The technique is illustrated in a simulation example and in a case-study on statistical and National Weather Service probability of precipitation forecasts.

     
  4. Abstract

    Hydrologic signatures are quantitative metrics that describe streamflow statistics and dynamics. Signatures have many applications, including assessing habitat suitability and hydrologic alteration, calibrating and evaluating hydrologic models, defining similarity between watersheds and investigating watershed processes. Increasingly, signatures are being used in large sample studies to guide flow management and modelling at continental scales. Using signatures in studies involving 1000s of watersheds brings new challenges as it becomes impractical to examine signature parameters and behaviour in each watershed. For example, we might wish to check that signatures describing flood event characteristics have correctly identified event periods, that signature values have not been biassed by data errors, or that human and natural influences on signature values have been correctly interpreted. In this commentary, we draw from our collective experience to present case studies where naïve application of signatures fails to correctly identify streamflow dynamics. These include unusual precipitation or flow regimes, data quality issues, and signature use in human‐influenced watersheds. We conclude by providing guidance and recommendations on applying signatures in large sample studies.

     