Title: Improving Hydrological Models With the Assimilation of Crowdsourced Data
Abstract

Small streams often lack reliable hydrological data. Environmental agencies play a key role in providing such data; however, these agencies are often challenged by the growing monitoring needs and lack of funding. Given the spatial mismatch between observed data and small watersheds/headwaters, local volunteers can act as potentially valuable research partners. We examine how CrowdHydrology, a citizen science program that collects stream stage and stream temperature observations, improves a hydrologic model of the Boyne River, Michigan, USA. Volunteers provided observations at four calibration sites with different interarrival times of the observations. We tested whether stream stage and stream temperature observations (measured by volunteers) improved the performance of a Soil and Water Assessment Tool (SWAT) model of the Boyne River. Observations were integrated into the model using the ensemble Kalman filter. This framework allowed us to integrate observation error, track the variability of model parameters, and simulate daily streamflow and stream temperature across the watershed. Measures of daily model performance included the Nash‐Sutcliffe efficiency, modified Nash‐Sutcliffe efficiency (Ef‐mod), refined index of agreement (dr), and relative bias (Bias). For all calibration sites, estimates of streamflow improved after data assimilation compared to simulations based on initial/default SWAT parameters. Different measures of model performance emerged based on the interarrival times of the observations. Results demonstrate that observations collected by local volunteers, with a certain temporal resolution, can improve SWAT hydrological models and capture central tendency.
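The four performance measures named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code. It assumes the commonly used textbook forms: the absolute-error variant of the Nash‐Sutcliffe efficiency for Ef‐mod, the refined index of agreement with scaling constant c = 2, and relative bias as the summed error divided by the summed observations. The paper may define these differently, and the function names are hypothetical.

```python
from statistics import fmean


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = fmean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def nse_modified(obs, sim):
    """Modified NSE (assumed form): absolute errors instead of squared errors."""
    mean_obs = fmean(obs)
    num = sum(abs(o - s) for o, s in zip(obs, sim))
    den = sum(abs(o - mean_obs) for o in obs)
    return 1.0 - num / den


def refined_index_of_agreement(obs, sim, c=2.0):
    """Refined index of agreement dr (assumed form, c = 2), bounded by -1 and 1."""
    mean_obs = fmean(obs)
    a = sum(abs(s - o) for o, s in zip(obs, sim))
    b = c * sum(abs(o - mean_obs) for o in obs)
    if a <= b:
        return 1.0 - a / b
    return b / a - 1.0


def relative_bias(obs, sim):
    """Relative bias (assumed form): total simulated minus observed, over total observed."""
    return sum(s - o for o, s in zip(obs, sim)) / sum(obs)


if __name__ == "__main__":
    # Made-up daily streamflow values, for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [1.2, 1.9, 3.3, 3.8]
    print(nse(observed, simulated))
    print(nse_modified(observed, simulated))
    print(refined_index_of_agreement(observed, simulated))
    print(relative_bias(observed, simulated))
```

The squared-error NSE is dominated by high flows, whereas the absolute-error measures weight all days more evenly, which is one reason a study might report several measures side by side.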

Award ID(s):
1661156 1661324
NSF-PAR ID:
10452084
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Water Resources Research
Volume:
56
Issue:
5
ISSN:
0043-1397
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Hydrological models require complete and accurate weather data time series to represent watershed‐scale responses adequately. The Global Historical Climatology Network (GHCN) is the most comprehensive weather database used in hydrological modelling studies globally. Since higher‐density, lower‐reliability precipitation measurements from private citizens collected by the Community Collaborative Rain, Hail, and Snow (CoCoRaHS) network data were integrated into the GHCN, hydrological modellers in the United States have access to a much greater amount of weather data. However, the benefit of using CoCoRaHS data has not been assessed. The objectives of this work were to develop a method for generating a complete weather data time series based on the combination of data from multiple GHCN monitors and to assess several methods for the estimation of missing weather data. Weather data from GHCN monitors located within a specific radius of a watershed were obtained and interpolated using three estimation methods (Inverse Distance Weighting (IDW), Inverse Distance and Elevation Weighting (IDEW) and Closest Station), creating a seamless time series of weather observations. To evaluate the performance of the methodologies, weather data obtained from each estimation method was used to force the Soil and Water Assessment Tool (SWAT) and Thornthwaite‐Mather models for 21 US Department of Agriculture‐Conservation Effects Assessment Project watersheds in different climate regions to simulate daily streamflow for 2010–2021. Except for three watersheds, all of the SWAT models had Nash‐Sutcliffe Efficiency above 0.5, the ratio of the root mean square error to the standard deviation of observations below 0.7, and percent bias from −25% to 25% with a satisfactory performance rating. IDEW and IDW performed similarly, and the Closest Station method resulted in the poorest streamflow simulation. A comparison with published SWAT model results further corroborated improved model performance using novel combined GHCN data with all Closest Station, IDW and IDEW methods.

  2. Weather data are the key forces that drive hydrological processes, so their accuracy is fundamentally important in watershed modeling. For large-scale watershed modeling, weather data are either generated by using interpolation methods or derived from assimilated datasets. In the present study, we compared the performance of the Soil and Water Assessment Tool (SWAT) when driven by interpolated weather data and by the NASA North American Land Data Assimilation System Phase Two (NLDAS2) weather dataset in the Upper Mississippi River Basin (UMRB). SWAT models fed with the different weather datasets were used to simulate monthly stream flow at 11 United States Geological Survey (USGS) monitoring stations in the UMRB. Model performance was evaluated based on three metrics: coefficient of determination (R²), Nash–Sutcliffe coefficient (NS), and percent bias (Pbias). The results show that, after calibration, the SWAT model compared well at all monitoring stations for monthly stream flow using the different weather datasets, indicating that the SWAT model can adequately reproduce long-term water yield in the UMRB. The results also show that using the NLDAS2 weather dataset can improve SWAT prediction of monthly stream flow with less prediction uncertainty in the UMRB. We concluded that the NLDAS2 dataset could be used by the SWAT model for large-scale watersheds like the UMRB as a surrogate for the interpolated weather data. Further analysis shows that NLDAS2 daily solar radiation was about 2.5 MJ m⁻² higher than the interpolated data. As such, the SWAT model driven by the NLDAS2 dataset tended to underestimate stream flow in the UMRB due to the overestimation of evapotranspiration in uncalibrated conditions. Thus, the implications of overestimated solar radiation in the NLDAS2 dataset should be considered before using it to drive a hydrological model.
  3. Over the last decade, autocalibration routines have become commonplace in watershed modeling. This approach is most often used to simulate streamflow at a basin’s outlet. In alpine settings, spring/early summer snowmelt is by far the dominant signal in this system. Therefore, there is great potential for a modeled watershed to underperform during other times of the year. This tendency has been noted in many prior studies. In this work, the Soil and Water Assessment Tool (SWAT) model was auto-calibrated with the SUFI-2 routine. A mountainous watershed in Idaho was examined (Upper North Fork). In this study, this basin was calibrated using three estimates of evapotranspiration (ET): Moderate Resolution Imaging Spectroradiometer (MODIS), Simplified Surface Energy Balance, and Global Land Evaporation Amsterdam Model. The MODIS product, in particular, had the greatest utility in helping to constrain SWAT parameters that have a high sensitivity to ET. Streamflow simulations that utilize these ET parameter values have improved recessional and summertime streamflow performance during the calibration (2007 to 2011) and validation (2012 to 2014) periods. Streamflow performance was monitored with standard objective metrics (Bias and Nash Sutcliffe coefficients) that quantified overall, recessional, and summertime peak flows. This approach yielded dramatic enhancements for all three observations. These results demonstrate the utility of this approach for improving watershed modeling fidelity outside the main snowmelt season.
  4. Stream water temperature (Ts) is a variable of critical importance for aquatic ecosystem health. Ts is strongly affected by groundwater-surface water interactions, which can be learned from streamflow records, but previously such information was challenging to absorb effectively with process-based models due to parameter equifinality. Based on the long short-term memory (LSTM) deep learning architecture, we developed a basin-centric lumped daily mean Ts model, which was trained over 118 data-rich basins with no major dams in the conterminous United States, and showed strong results. At a national scale, we obtained a median root-mean-square error (RMSE) of 0.69 °C, Nash-Sutcliffe model efficiency coefficient (NSE) of 0.985, and correlation of 0.994, which are marked improvements over previous values reported in the literature. The addition of streamflow observations as a model input strongly elevated the performance of this model. In the absence of measured streamflow, we showed that a two-stage model can be used where simulated streamflow from a pre-trained LSTM model (Qsim) still benefits the Ts model, even though no new information was brought directly into the inputs of the Ts model; the model indirectly used information learned from streamflow observations provided during the training of Qsim, potentially to improve internal representation of physically meaningful variables. Our results indicate that strong relationships exist between basin-averaged forcing variables, catchment attributes, and Ts that can be simulated by a single model trained on data at the continental scale.
  5. Precipitation occurs in two basic forms, liquid and solid. Unlike in rain-fed watersheds, modeling snow processes is of vital importance in snow-dominated watersheds. The seasonal snowpack is a natural water reservoir, which stores snow water in winter and releases it in spring and summer. The warmer climate in recent decades has led to earlier snowmelt, a decline in snowpack, and change in the seasonality of river flows. The Soil and Water Assessment Tool (SWAT) can be applied in snow-influenced watersheds because of its ability to simultaneously predict the streamflow generated from rainfall and from the melting of snow. The choice of parameters, reference data, and calibration strategy could significantly affect the SWAT model calibration outcome and further affect the prediction accuracy. In this study, SWAT models are implemented in four upland watersheds in the Tulare Lake Basin (TLB) located across the Southern Sierra Nevada Mountains. Three calibration scenarios considering different calibration parameters and reference datasets are applied to investigate the impact of the Parallel Energy Balance Model (ParBal) snow reconstruction data and snow parameters on the streamflow and snow water-equivalent (SWE) prediction accuracy. In addition, the equifinality induced by the watershed and lapse rate parameters is evaluated. The results indicate that calibration of the SWAT model with respect to both streamflow and SWE reference data could improve the model SWE prediction reliability in general. Comparatively, the streamflow predictions are not significantly affected by differently lumped calibration schemes. The default snow parameter values capture the extreme high flows better than the other two calibration scenarios, whereas there is no remarkable difference among the three calibration schemes for capturing the extreme low flows. The equifinality induced by the watershed and lapse rate parameters affects the flow prediction more (Nash-Sutcliffe Efficiency (NSE) varies between 0.2–0.3) than the SWE prediction (NSE varies less than 0.1). This study points to the remote-sensing-based SWE reconstruction product as a promising alternative choice for model calibration in ungauged snow-influenced watersheds. The model calibrated bi-objectively against streamflow and reconstructed SWE could improve the prediction reliability of surface water supply change for the downstream agricultural region under the changing climate.