Title: Using System‐Inspired Metrics to Improve Water Quality Prediction in Stratified Lakes
Abstract: Despite the growing use of Aquatic Ecosystem Models for lake modeling, there is currently no widely applicable framework for their configuration, calibration, and evaluation. Calibration is generally based on direct comparison of observed versus modeled state variables using standard statistical techniques; however, this approach may not give a complete picture of the model's ability to capture system‐scale behavior that is not easily perceivable in observations but may be important for resource management. The aim of this study is to compare the performance of a “naïve” calibration and a “system‐inspired” calibration, an approach that augments the standard state‐based calibration with a range of system‐inspired metrics (e.g., thermocline depth, metalimnetic oxygen minima) to increase the coherence between the simulated and natural ecosystems. A coupled physical‐biogeochemical model was applied to a focal site to simulate two key state variables: water temperature and dissolved oxygen. The model was calibrated according to the new system‐inspired modeling convention, using formal calibration techniques. The simulation improved when parameters were optimized on the additional metrics, which helped to reduce uncertainty in predicting aspects of the system relevant to reservoir management, such as the occurrence of the metalimnetic oxygen minima. Extending the use of system‐inspired metrics when calibrating models has the potential to improve model fidelity for capturing more complex ecosystem dynamics.
Award ID(s):
2330211 1933016 2327030 2318861 2452117
PAR ID:
10542545
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
AGU
Date Published:
Journal Name:
Water Resources Research
Volume:
60
Issue:
8
ISSN:
0043-1397
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
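
The abstract above describes augmenting a state-based calibration with system-inspired metrics such as thermocline depth. The following is a minimal sketch of that idea, not the authors' method: it assumes the thermocline can be approximated as the midpoint of the layer with the steepest temperature drop, and it combines a profile RMSE with the thermocline-depth error using a hypothetical weight `metric_weight`. The function names and the weighting scheme are illustrative assumptions.

```python
import math


def thermocline_depth(depths, temps):
    """Approximate thermocline depth (m) as the midpoint of the layer
    with the steepest temperature decrease per metre.

    Assumes depths increase downward. Returns None when no layer
    cools with depth (e.g. a fully mixed profile).
    """
    if len(depths) != len(temps) or len(depths) < 2:
        raise ValueError("need two or more matching depth/temperature values")
    best_gradient = 0.0
    best_depth = None
    for i in range(len(depths) - 1):
        dz = depths[i + 1] - depths[i]
        if dz <= 0:
            raise ValueError("depths must be strictly increasing")
        gradient = (temps[i] - temps[i + 1]) / dz  # degrees C lost per metre
        if gradient > best_gradient:
            best_gradient = gradient
            best_depth = 0.5 * (depths[i] + depths[i + 1])
    return best_depth


def rmse(observed, simulated):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def system_inspired_objective(depths, obs_temps, sim_temps, metric_weight=0.5):
    """Hypothetical combined objective: state-based RMSE plus a weighted
    thermocline-depth error. The weight is an illustrative assumption."""
    obs_tc = thermocline_depth(depths, obs_temps)
    sim_tc = thermocline_depth(depths, sim_temps)
    if obs_tc is None or sim_tc is None:
        raise ValueError("thermocline undefined for a profile")
    return rmse(obs_temps, sim_temps) + metric_weight * abs(obs_tc - sim_tc)


if __name__ == "__main__":
    depths = [0, 2, 4, 6, 8, 10]               # metres
    observed = [24.0, 23.8, 23.5, 16.0, 12.0, 11.5]
    simulated = [24.0, 23.5, 17.0, 14.0, 12.0, 11.5]
    print(thermocline_depth(depths, observed))
    print(thermocline_depth(depths, simulated))
    print(system_inspired_objective(depths, observed, simulated))
```

A calibration routine would minimize an objective of this kind over model parameters; a simulation that matches temperatures well on average but places the thermocline at the wrong depth is penalized by the second term.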
More Like this
  1. Abstract. This paper studies how to improve the accuracy of hydrologic models using machine-learning models as post-processors and presents possibilities to reduce the workload to create an accurate hydrologic model by removing the calibration step. It is often challenging to develop an accurate hydrologic model due to the time-consuming model calibration procedure and the nonstationarity of hydrologic data. Our findings show that the errors of hydrologic models are correlated with model inputs. Thus motivated, we propose a modeling-error-learning-based post-processor framework by leveraging this correlation to improve the accuracy of a hydrologic model. The key idea is to predict the differences (errors) between the observed values and the hydrologic model predictions by using machine-learning techniques. To tackle the nonstationarity issue of hydrologic data, a moving-window-based machine-learning approach is proposed to enhance the machine-learning error predictions by identifying the local stationarity of the data using a stationarity measure developed based on the Hilbert–Huang transform. Two hydrologic models, the Precipitation–Runoff Modeling System (PRMS) and the Hydrologic Modeling System (HEC-HMS), are used to evaluate the proposed framework. Two case studies are provided to exhibit the improved performance over the original model using multiple statistical metrics. 
  2. This paper presents a comprehensive approach to predicting short-term (for the upcoming 2 weeks) changes in estuarine dissolved oxygen concentrations via machine learning models that integrate historical water sampling, historical and upcoming 2-week meteorological data, and river discharge and discharge metrics. Dissolved oxygen is a critical indicator of ecosystem health, and this approach is implemented for the Neuse River Estuary, North Carolina, U.S.A., which has a long history of hypoxia-related habitat degradation. Through meticulous data preprocessing and feature selection, this research evaluates the predictions of dissolved oxygen concentrations by comparing a recurrent neural network with four other models, including a Multilayer Perceptron, Long Short-Term Memory, Gradient Boosting, and AutoKeras, through sensitivity experiments. The input predictors to our prediction models include water temperature, turbidity, chlorophyll-a, aggregated river discharge, and aggregated wind based on eight directions. By emphasizing the most impactful predictors, we streamlined the model-building processes and built a hindcast system from 2015 to 2019. We found that the recurrent neural network model was most effective in predicting the dissolved oxygen concentrations, with an R² value of 0.99 at multiple stations. Different from our machine learning hindcast models that used observed upcoming meteorological and discharge data, an actual forecast system would use forecasted meteorological and discharge data. Therefore, an actual operational forecast may have lower accuracy than the hindcast, as determined by the accuracy of the predicted meteorological and discharge data. Nevertheless, our studies enhance our understanding of the factors influencing dissolved oxygen variability and set the basis for the implementation of a predictive tool for environmental monitoring and management. We also emphasized the importance of building station-specific models to improve the prediction results.
  3. A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction. This property has become increasingly important as the impact of machine learning models has continued to spread to various domains. As a result, there are now a dizzying number of recent papers on measuring and improving the calibration of (specifically deep learning) models. In this work, we reassess the reporting of calibration metrics in the recent literature. We show that there exist trivial recalibration approaches that can appear seemingly state-of-the-art unless calibration and prediction metrics (i.e. test accuracy) are accompanied by additional generalization metrics such as negative log-likelihood. We then use a calibration-based decomposition of Bregman divergences to develop a new extension to reliability diagrams that jointly visualizes calibration and generalization error, and show how our visualization can be used to detect trade-offs between calibration and generalization. Along the way, we prove novel results regarding the relationship between full calibration error and confidence calibration error for Bregman divergences. We also establish the consistency of the kernel regression estimator for calibration error used in our visualization approach, which generalizes existing consistency results in the literature. 
  4. Abstract Excessive algae growth can lead to negative consequences for ecosystem function, economic opportunity, and human and animal health. Due to the cost‐effectiveness and temporal availability of satellite imagery, remote sensing has become a powerful tool for water quality monitoring. The use of remotely sensed products to monitor water quality related to algae and cyanobacteria productivity during a bloom event may help inform management strategies for inland waters. To evaluate the ability of satellite imagery to monitor algae pigments and dissolved oxygen conditions in a small inland lake, chlorophyll‐a, phycocyanin, and dissolved oxygen concentrations are measured using a YSI EXO2 sonde during Sentinel‐2 and Sentinel‐3 overpasses from 2019 to 2022 on Lake Mendota, WI. Machine learning methods are implemented with existing algorithms to model chlorophyll‐a, phycocyanin, and Pc:Chla. A novel machine learning‐based dissolved oxygen modeling approach is developed using algae pigment concentrations as predictors. Best model results based on Sentinel‐2 (Sentinel‐3) imagery achieved R² scores of 0.47 (0.42) for chlorophyll‐a, 0.69 (0.22) for phycocyanin, and 0.70 (0.41) for Pc:Chla. Dissolved oxygen models achieved an R² of 0.68 (0.36) when applied to Sentinel‐2 (Sentinel‐3) imagery, and Pc:Chla is found to be the most important predictive feature. Random forest models are better suited to water quality estimations in this system given built‐in methods for feature selection and a relatively small data set. Use of these approaches for estimation of Pc:Chla and dissolved oxygen can increase the water quality information extracted from satellite imagery and improve characterization of algae conditions among inland waters.
  5. Over the last decade, autocalibration routines have become commonplace in watershed modeling. This approach is most often used to simulate streamflow at a basin’s outlet. In alpine settings, spring/early summer snowmelt is by far the dominant signal in this system. Therefore, there is great potential for a modeled watershed to underperform during other times of the year. This tendency has been noted in many prior studies. In this work, the Soil and Water Assessment Tool (SWAT) model was auto-calibrated with the SUFI-2 routine. A mountainous watershed from Idaho was examined (Upper North Fork). In this study, this basin was calibrated using three estimates of evapotranspiration (ET): Moderate Resolution Imaging Spectroradiometer (MODIS), Simplified Surface Energy Balance, and Global Land Evaporation: the Amsterdam Model. The MODIS product, in particular, had the greatest utility in helping to constrain SWAT parameters that have a high sensitivity to ET. Streamflow simulations that utilize these ET parameter values have improved recessional and summertime streamflow performances during calibration (2007 to 2011) and validation (2012 to 2014) periods. Streamflow performance was monitored with standard objective metrics (bias and Nash-Sutcliffe coefficients) that quantified overall, recessional, and summertime peak flows. This approach yielded dramatic enhancements for all three observations. These results demonstrate the utility of this approach for improving watershed modeling fidelity outside the main snowmelt season.
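
Several of the abstracts listed above evaluate simulations with standard objective metrics such as bias and the Nash-Sutcliffe coefficient, sometimes restricted to part of the record (for example, summertime flows). As a rough illustration only, and not code from any of these studies, the sketch below computes both metrics over a whole series and over a subset chosen by a hypothetical boolean mask. Bias is taken here as the mean of simulated minus observed values, which is one common convention and an assumption on our part.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared model
    error to the variance of observations about their mean.
    1.0 is a perfect fit; 0.0 is no better than the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observations have zero variance")
    return 1.0 - error / spread


def mean_bias(observed, simulated):
    """Mean of (simulated - observed); positive means overprediction.
    This sign convention is an assumption for illustration."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(s - o for o, s in zip(observed, simulated)) / len(observed)


def subset(values, mask):
    """Keep only the values whose mask entry is True
    (e.g. a hypothetical 'is summer' flag per time step)."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have equal length")
    return [v for v, keep in zip(values, mask) if keep]


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0]
    simulated = [1.5, 2.5, 2.5, 4.5, 7.0, 9.5]
    is_summer = [False, False, False, False, True, True]  # illustrative mask
    print("overall NSE:", nash_sutcliffe(observed, simulated))
    print("overall bias:", mean_bias(observed, simulated))
    print("summer bias:", mean_bias(subset(observed, is_summer),
                                    subset(simulated, is_summer)))
```

Scoring a subset separately makes it visible when a calibration fits the dominant signal well but performs poorly during the rest of the year.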