This content will become publicly available on April 29, 2024

Title: Probabilistic Inverse Modeling: An Application in Hydrology
Rapid advancements in inverse modeling methods have brought to light their susceptibility to imperfect data. This has made it imperative to obtain more explainable and trustworthy estimates from these models. In hydrology, basin characteristics can be noisy or missing, impacting streamflow prediction. We propose a probabilistic inverse model framework that can reconstruct robust hydrology basin characteristics from dynamic input weather driver and streamflow response data. We address two aspects of building more explainable inverse models: uncertainty estimation (uncertainty due to imperfect data and an imperfect model) and robustness. This can help improve the trust of water managers, improve the handling of noisy data, and reduce costs. We also propose an uncertainty-based loss regularization that removes 17% of temporal artifacts in reconstructions, reduces uncertainty by 36%, and yields a 4% higher coverage rate for basin characteristics. The forward model performance (streamflow estimation) is also improved by 6% using these uncertainty-learning-based reconstructions.
Award ID(s):
1934721
NSF-PAR ID:
10474532
Editor(s):
Shekhar, Shashi; Zhou, Zhi-Hua; Chiang, Yao-Yi; Stiglic, Gregor
Publisher / Repository:
SIAM
Journal Name:
Proceedings of the 2023 SIAM International Conference on Data Mining (SDM)
Page Range / eLocation ID:
847-855
Location:
Minneapolis, MN
Sponsoring Org:
National Science Foundation
More Like this
  1. In hydrology, modeling streamflow remains a challenging task due to the limited availability of basin characteristics information such as soil geology and geomorphology. These characteristics may be noisy due to measurement errors or may be missing altogether. To overcome this challenge, we propose a knowledge-guided, probabilistic inverse modeling method for recovering physical characteristics from streamflow and weather data, which are more readily available. We compare our framework with state-of-the-art inverse models for estimating river basin characteristics. We also show that these estimates improve streamflow modeling compared with using the original basin characteristic values. Our inverse model offers a 3% improvement in R² for the inverse model (basin characteristic estimation) and 6% for the forward model (streamflow prediction). Our framework also offers improved explainability since it can quantify uncertainty in both the inverse and the forward model. Uncertainty quantification plays a pivotal role in improving the explainability of machine learning models by providing additional insights into the reliability and limitations of model predictions. In our analysis, we assess the quality of the uncertainty estimates. Compared to baseline uncertainty quantification methods, our framework offers a 10% improvement in the dispersion of epistemic uncertainty and a 13% improvement in coverage rate. This information can help stakeholders understand the level of uncertainty associated with the predictions and provide a more comprehensive view of the potential outcomes.
  2. Machine learning is beginning to provide state-of-the-art performance in a range of environmental applications such as streamflow prediction in a hydrologic basin. However, building accurate broad-scale models for streamflow remains challenging in practice due to the variability in the dominant hydrologic processes, which are best captured by sets of process-related basin characteristics. Existing basin characteristics suffer from noise and uncertainty, among many other things, which adversely impact model performance. To tackle the above challenges, in this paper, we propose a novel Knowledge-guided Self-Supervised Learning (KGSSL) inverse framework to extract system characteristics from driver (input) and response (output) data. This first-of-its-kind framework achieves robust performance even when characteristics are corrupted or missing. We evaluate the KGSSL framework in the context of streamflow modeling using CAMELS (Catchment Attributes and MEteorology for Large-sample Studies), which is a widely used hydrology benchmark dataset. Specifically, KGSSL outperforms the baseline by 16% in predicting missing characteristics. Furthermore, in the context of forward modeling, KGSSL-inferred characteristics provide a 35% improvement in performance over a standard baseline when the static characteristics are unknown.
  3. Abstract

    A mountain watershed network model is presented for use in decadal to centurial estimation of source‐to‐sink sediment dynamics. The model requires limited input parameters and can be effectively applied over spatial scales relevant to management of reservoirs, lakes, streams, and watersheds (1–100 km²). The model operates over a connected stream network of Strahler‐ordered segments. The model is driven by streamflow from a physically based hydrology model and hillslope sediment supply from a stochastic mass wasting algorithm. For each daily time step, segment‐scale sediment mass balance is computed using bedload and suspended load transport equations. Sediment transport is partitioned between grain size fractions for bedload as gravel and sand, and for suspended load as sand and mud. Bedload and suspended load can deposit and re‐entrain at each segment. We demonstrated the model in the Elwha River Basin, upstream of the former Glines Canyon dam, over the dam's historic 84‐year lifespan. The model predicted the lifetime reservoir sedimentation volume within the uncertainty range of the measured volume (13.7–18.5 million m³) for 25 of 28 model instances. Gravel, sand, and mud fraction volumes were predicted within measurement uncertainty ranges for 18 model instances. The network model improved the prediction of sediment yields compared to at‐a‐station sediment transport capacity relations. The network model also provided spatially and temporally distributed information that allowed for inquiry and understanding of the physical system beyond the sediment yields at the outlet. This work advances cross‐disciplinary and application‐oriented watershed sediment yield modeling approaches.
  4. Abstract

    Numerous studies have examined the reliability of various precipitation products over the Mekong River Basin (MRB) and modeled its basin hydrology. However, there is a lack of comprehensive studies on precipitation‐induced uncertainties in hydrological simulations using process‐based land surface models. This study examines the propagation of precipitation uncertainty into hydrological simulations over the entire MRB using the Community Land Model version 5 (CLM5) at a high spatial resolution of 0.05° (∼5 km) and without any parameter calibration. Simulations conducted using different precipitation datasets are compared to investigate the discrepancies in streamflow, terrestrial water storage (TWS), soil moisture, and evapotranspiration (ET) caused by precipitation uncertainty. Results indicate that precipitation is a key determinant of simulated streamflow in the MRB; peak flow and soil moisture are particularly sensitive to precipitation input. Further, precipitation data with a higher spatial resolution did not improve the simulations, contrary to the common perception that using meteorological forcing with higher spatial resolution would improve hydrological simulations. In addition, since high flow indicators are particularly influenced by precipitation data, the choice of precipitation data could directly impact flood pulse simulations in the MRB. Notable differences are also found among TWS, soil moisture, and ET simulated using different precipitation products. Moreover, TWS, soil moisture, and ET exhibit a varying degree of sensitivity to precipitation uncertainty. This study provides crucial insights on precipitation‐induced uncertainties in process‐based hydrological modeling and uncovers these uncertainties in the MRB.
  5. Abstract

    Precipitation is the primary driver of hydrological models, and its spatial and temporal variability has a great impact on water partitioning. However, in data‐sparse regions, uncertainty in precipitation estimates is high and the sensitivity of water partitioning to this uncertainty is unknown. This is a particular challenge in drylands (semi‐arid and arid regions) where the water balance is highly sensitive to rainfall, yet there is commonly a lack of in situ rain gauge data. To understand the impact of precipitation uncertainty on the water balance in drylands, here we have performed simulations with a process‐based hydrological model developed to characterize the water balance in arid and semi‐arid regions (DRYP: DRYland water Partitioning model). We performed a series of numerical analyses in the Upper Ewaso Ng'iro basin, Kenya, driven by three gridded precipitation datasets with different spatio‐temporal resolutions (IMERG, MSWEP, and ERA5), evaluating simulations against streamflow observations and remotely sensed data products of soil moisture, actual evapotranspiration, and total water storage. We found that despite the great differences in the spatial distribution of rainfall across a climatic gradient within the basin, DRYP shows good performance for representing streamflow (KGE >0.6), soil moisture, actual evapotranspiration, and total water storage (r >0.5). However, the choice of precipitation datasets greatly influences surface (infiltration, runoff, and transmission losses) and subsurface fluxes (groundwater recharge and discharge) across different climatic zones of the Ewaso Ng'iro basin. Within humid areas, evapotranspiration does not show sensitivity to the choice of precipitation dataset; however, in dry lowland areas it becomes more sensitive to precipitation rates as water‐limited conditions develop. The analysis shows that the highest rates of precipitation produce high rates of diffuse recharge in Ewaso uplands and also propagate into runoff, transmission losses and, ultimately, focused recharge, with the latter acting as the main mechanism of groundwater recharge in low dry areas. The results from this modelling exercise suggest that care must be taken in selecting forcing precipitation data to drive hydrological modelling efforts, especially in basins that span a climatic gradient. These results also suggest that more effort is required to reduce uncertainty between different precipitation datasets, which will in turn result in more consistent quantification of the water balance.