NSF PAR Search | NSF Public Access Repository

Towards Entity-Aware Conditional Variational Inference for Heterogeneous Time-Series Prediction: An application to Hydrology

https://doi.org/10.1137/1.9781611978032.38

Ghosh, Rahul; Renganathan, Arvind; McAliley, Wallace; Steinbach, Michael; Duffy, Christopher; Kumar, Vipin (April 2024, SIAM International Conference on Data Mining (SDM24))

Many environmental systems (e.g., hydrology basins) can be modeled as an entity whose response (e.g., streamflow) depends on drivers (e.g., weather) conditioned on their characteristics (e.g., soil properties). We introduce Entity-aware Conditional Variational Inference (EA-CVI), a novel probabilistic inverse modeling approach, to deduce entity characteristics from observed driver-response data. EA-CVI infers probabilistic latent representations that can accurately predict responses for diverse entities, particularly in out-of-sample few-shot settings. EA-CVI's latent embeddings encapsulate diverse entity characteristics within compact, low-dimensional representations. EA-CVI proficiently identifies dominant modes of variation in responses and offers the opportunity to infer a physical interpretation of the underlying attributes that shape these responses. EA-CVI can also generate new data samples by sampling from the learned distribution, making it useful in zero-shot scenarios. EA-CVI addresses the need for uncertainty estimation, particularly during extreme events, rendering it essential for data-driven decision-making in real-world applications. Extensive evaluations on a renowned hydrology benchmark dataset, CAMELS-GB, validate EA-CVI's abilities.
Full Text Available
Uncertainty Quantification in Inverse Models in Hydrology

Chatterjee, Somya Sharma; Ghosh, Rahul; Renganathan, Arvind; Li, Xiang; Chatterjee, Snigdhansu; Nieber, John; Duffy, Christopher; Kumar, Vipin (August 2023, ACM)

In hydrology, modeling streamflow remains a challenging task due to the limited availability of basin characteristics information such as soil geology and geomorphology. These characteristics may be noisy due to measurement errors or may be missing altogether. To overcome this challenge, we propose a knowledge-guided, probabilistic inverse modeling method for recovering physical characteristics from streamflow and weather data, which are more readily available. We compare our framework with state-of-the-art inverse models for estimating river basin characteristics. We also show that these estimates offer improvement in streamflow modeling as opposed to using the original basin characteristic values. Our inverse model offers a 3% improvement in R² for the inverse model (basin characteristic estimation) and 6% for the forward model (streamflow prediction). Our framework also offers improved explainability since it can quantify uncertainty in both the inverse and the forward model. Uncertainty quantification plays a pivotal role in improving the explainability of machine learning models by providing additional insights into the reliability and limitations of model predictions. In our analysis, we assess the quality of the uncertainty estimates. Compared to baseline uncertainty quantification methods, our framework offers a 10% improvement in the dispersion of epistemic uncertainty and a 13% improvement in coverage rate. This information can help stakeholders understand the level of uncertainty associated with the predictions and provide a more comprehensive view of the potential outcomes.
Full Text Available
Probabilistic Inverse Modeling: An Application in Hydrology

https://doi.org/10.1137/1.9781611977653.ch95

Sharma, Somya; Ghosh, Rahul; Renganathan, Arvind; Xiang, Li; Chatterjee, Snigdhansu; Nieber, John; Duffy, Christopher; Kumar, Vipin (April 2023, Proceedings of the 2023 SIAM International Conference on Data Mining (SDM))
Shekhar, Shashi; Zhou, Zhi-Hua; Chiang, Yao-Yi; Stiglic, Gregor (Ed.)
Rapid advances in inverse modeling methods have brought to light their susceptibility to imperfect data. This has made it imperative to obtain more explainable and trustworthy estimates from these models. In hydrology, basin characteristics can be noisy or missing, impacting streamflow prediction. We propose a probabilistic inverse model framework that can reconstruct robust hydrology basin characteristics from dynamic input weather driver and streamflow response data. We address two aspects of building more explainable inverse models: uncertainty estimation (uncertainty due to imperfect data and imperfect model) and robustness. This can help improve the trust of water managers, improve the handling of noisy data, and reduce costs. We also propose an uncertainty-based loss regularization that removes 17% of temporal artifacts in reconstructions, reduces uncertainty by 36%, and yields a 4% higher coverage rate for basin characteristics. The forward model performance (streamflow estimation) is also improved by 6% using these uncertainty-learning-based reconstructions.
Full Text Available
Mini-Batch Learning Strategies for modeling long term temporal dependencies: A study in environmental applications

https://doi.org/10.1137/1.9781611977653.ch73

Xu, Shaoming; Khandelwal, Ankush; Li, Xiang; Jia, Xiaowei; Liu, Licheng; Willard, Jared; Ghosh, Rahul; Cutler, Kelly; Steinbach, Michael; Duffy, Christopher; et al (April 2023, Proceedings of the 2023 SIAM International Conference on Data Mining (SDM))
Shekhar, Shashi; Zhou, Zhi-Hua; Chiang, Yao-Yi; Stiglic, Gregor (Ed.)
In many environmental applications, recurrent neural networks (RNNs) are used to model physical variables with long temporal dependencies. However, due to mini-batch training, temporal relationships between training segments within the batch (intra-batch) as well as between batches (inter-batch) are not considered, which can lead to limited performance. Stateful RNNs aim to address this issue by passing hidden states between batches. Since Stateful RNNs ignore intra-batch temporal dependency, there exists a trade-off between training stability and capturing temporal dependency. In this paper, we provide a quantitative comparison of different Stateful RNN modeling strategies, and propose two strategies to enforce both intra- and inter-batch temporal dependency. First, we extend Stateful RNNs by defining a batch as a temporally ordered set of training segments, which enables intra-batch sharing of temporal information. While this approach significantly improves the performance, it leads to much longer training times due to highly sequential training. To address this issue, we further propose a new strategy which augments a training segment with an initial value of the target variable from the timestep immediately before the start of the training segment. In other words, we provide an initial value of the target variable as additional input so that the network can focus on learning changes relative to that initial value. By using this strategy, samples can be passed in any order (mini-batch training), which significantly reduces the training time while maintaining the performance. In demonstrating the utility of our approach in hydrological modeling, we observe that the most significant gains in predictive accuracy occur when these methods are applied to state variables whose values change more slowly, such as soil water and snowpack, rather than continuously moving flux variables such as streamflow.
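The second strategy described in this abstract (adding the target value from the timestep just before each training segment as an extra input) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `make_segments`, the non-overlapping segmentation, and the choice to repeat the initial value as an extra feature column at every timestep are all assumptions.

```python
import numpy as np

def make_segments(drivers, target, seg_len):
    """Split a series into segments, each augmented with the target value
    from the timestep immediately before the segment starts.

    drivers: array of shape (n_steps, n_features)
    target:  array of shape (n_steps,)
    Hypothetical helper; the layout of the extra input is an assumption.
    """
    n_steps = drivers.shape[0]
    inputs, labels = [], []
    # Start at 1 so every segment has a preceding timestep to draw from.
    for start in range(1, n_steps - seg_len + 1, seg_len):
        seg_x = drivers[start:start + seg_len]
        seg_y = target[start:start + seg_len]
        # Repeat the pre-segment target value as one extra feature column.
        init = np.full((seg_len, 1), target[start - 1], dtype=float)
        inputs.append(np.concatenate([seg_x, init], axis=1))
        labels.append(seg_y)
    return np.stack(inputs), np.stack(labels)

# Toy data: 10 timesteps, 2 driver features, target equal to the time index.
drivers = np.ones((10, 2))
target = np.arange(10, dtype=float)
X, y = make_segments(drivers, target, seg_len=3)

# Each segment carries its own initial value, so segments can be shuffled
# freely into mini-batches without passing hidden states between them.
order = np.random.default_rng(0).permutation(len(X))
X_shuffled, y_shuffled = X[order], y[order]
```

Because each segment is self-describing in this sketch, the ordering constraint of the Stateful RNN variant is not needed, which is the property the abstract credits for the reduced training time.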
Full Text Available
Robust Inverse Framework using Knowledge-guided Self-Supervised Learning: An application to Hydrology

https://doi.org/10.1145/3534678.3539448

Ghosh, Rahul; Renganathan, Arvind; Tayal, Kshitij; Li, Xiang; Khandelwal, Ankush; Jia, Xiaowei; Duffy, Christopher; Nieber, John; Kumar, Vipin (August 2022, KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Machine learning is beginning to provide state-of-the-art performance in a range of environmental applications such as streamflow prediction in a hydrologic basin. However, building accurate broad-scale models for streamflow remains challenging in practice due to the variability in the dominant hydrologic processes, which are best captured by sets of process-related basin characteristics. Existing basin characteristics suffer from noise and uncertainty, among many other things, which adversely impact model performance. To tackle the above challenges, in this paper, we propose a novel Knowledge-guided Self-Supervised Learning (KGSSL) inverse framework to extract system characteristics from driver (input) and response (output) data. This first-of-its-kind framework achieves robust performance even when characteristics are corrupted or missing. We evaluate the KGSSL framework in the context of streamflow modeling using CAMELS (Catchment Attributes and MEteorology for Large-sample Studies), which is a widely used hydrology benchmark dataset. Specifically, KGSSL outperforms the baseline by 16% in predicting missing characteristics. Furthermore, in the context of forward modeling, KGSSL-inferred characteristics provide a 35% improvement in performance over a standard baseline when the static characteristics are unknown.
Full Text Available
Lake thermal structure drives inter-annual variability in summer anoxia dynamics in a eutrophic lake over 37 years

https://doi.org/10.6073/pasta/418bf748dc2351f026c25111f7cbfd7e

Ladwig, Robert; Hanson, Paul C; Dugan, Hilary A; Carey, Cayelan C; Zhang, Yu; Shu, Lele; Duffy, Christopher; Cobourn, Kelly M (January 2021, Environmental Data Initiative)

Dataset to run a 37-year simulation (1979-2015) of the Lake Mendota lake ecosystem using the vertical 1D GLM-AED2 model. The focus of this modeling study is on determining the drivers of year-to-year variability in the spatial and temporal extent of hypolimnetic anoxia.
Lake thermal structure drives interannual variability in summer anoxia dynamics in a eutrophic lake over 37 years

https://doi.org/10.5194/hess-25-1009-2021

Ladwig, Robert; Hanson, Paul C.; Dugan, Hilary A.; Carey, Cayelan C.; Zhang, Yu; Shu, Lele; Duffy, Christopher J.; Cobourn, Kelly M. (January 2021, Hydrology and Earth System Sciences)
Abstract. The concentration of oxygen is fundamental to lake water quality and ecosystem functioning through its control over habitat availability for organisms, redox reactions, and recycling of organic material. In many eutrophic lakes, oxygen depletion in the bottom layer (hypolimnion) occurs annually during summer stratification. The temporal and spatial extent of summer hypolimnetic anoxia is determined by interactions between the lake and its external drivers (e.g., catchment characteristics, nutrient loads, meteorology) as well as internal feedback mechanisms (e.g., organic matter recycling, phytoplankton blooms). How these drivers interact to control the evolution of lake anoxia over decadal timescales will determine, in part, the future lake water quality. In this study, we used a vertical one-dimensional hydrodynamic–ecological model (GLM-AED2) coupled with a calibrated hydrological catchment model (PIHM-Lake) to simulate the thermal and water quality dynamics of the eutrophic Lake Mendota (USA) over a 37 year period. The calibration and validation of the lake model consisted of a global sensitivity evaluation as well as the application of an optimization algorithm to improve the fit between observed and simulated data. We calculated stability indices (Schmidt stability, Birgean work, stored internal heat), identified spring mixing and summer stratification periods, and quantified the energy required for stratification and mixing. To qualify which external and internal factors were most important in driving the interannual variation in summer anoxia, we applied a random-forest classifier and multiple linear regressions to modeled ecosystem variables (e.g., stratification onset and offset, ice duration, gross primary production). Lake Mendota exhibited prolonged hypolimnetic anoxia each summer, lasting between 50 and 60 d. The summer heat budget, the timing of thermal stratification, and the gross primary production in the epilimnion prior to summer stratification were the most important predictors of the spatial and temporal extent of summer anoxia periods in Lake Mendota. Interannual variability in anoxia was largely driven by physical factors: earlier onset of thermal stratification in combination with a higher vertical stability strongly affected the duration and spatial extent of summer anoxia. A measured step change upward in summer anoxia in 2010 was unexplained by the GLM-AED2 model. Although the cause remains unknown, possible factors include invasion by the predacious zooplankton Bythotrephes longimanus. As the heat budget depended primarily on external meteorological conditions, the spatial and temporal extent of summer anoxia in Lake Mendota is likely to increase in the near future as a result of projected climate change in the region.
Full Text Available
Understanding watershed hydrogeochemistry: 1. Development of RT-Flux-PIHM

https://doi.org/10.1002/2016WR018934

Bao, Chen; Li, Li; Shi, Yuning; Duffy, Christopher (March 2017, Water Resources Research)

Full Text Available
Fully-coupled hydrologic processes for modeling landscape evolution

https://doi.org/10.1016/j.envsoft.2016.04.014

Zhang, Yu; Slingerland, Rudy; Duffy, Christopher (August 2016, Environmental Modelling & Software)

Full Text Available
Virtual Experiments Guide Calibration Strategies for a Real-World Watershed Application of Coupled Surface-Subsurface Modeling

https://doi.org/10.1061/(ASCE)HE.1943-5584.0001431

Yu, Xuan; Duffy, Christopher; Zhang, Yu; Bhatt, Gopal; Shi, Yuning (November 2016, Journal of Hydrologic Engineering)

Full Text Available
