Title: Quantifying uncertainty in annual runoff due to missing data
Long-term streamflow datasets inevitably include gaps, which must be filled to allow estimates of runoff and ultimately catchment water budgets. Uncertainty introduced by filling gaps in discharge records is rarely, if ever, reported. We characterized the uncertainty due to streamflow gaps in a reference watershed at the Hubbard Brook Experimental Forest (HBEF) from 1996 to 2009 by simulating artificial gaps of varying duration and flow rate, with the objective of quantifying their contribution to uncertainty in annual streamflow. Gaps were filled using an ensemble of regressions relating discharge from nearby streams, and the predicted flow was compared to the actual flow. Differences between the predicted and actual runoff increased with both gap length and flow rate, averaging 2.8% of the runoff during the gap. At the HBEF, the sum of gaps averaged 22 days per year, with the lowest and highest annual uncertainties due to gaps ranging from 1.5 mm (95% confidence interval surrounding mean runoff) to 21.1 mm. As a percentage of annual runoff, uncertainty due to gap filling ranged from 0.2–2.1%, depending on the year. Uncertainty in annual runoff due to gaps was small at the HBEF, where infilling models are based on multiple similar catchments in close proximity to the catchment of interest. The method demonstrated here can be used to quantify uncertainty due to gaps in any long-term streamflow data set, regardless of the gap-filling model applied.
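The artificial-gap experiment described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes one ordinary least-squares line per neighboring stream, averaged into an ensemble prediction, and it runs on synthetic daily runoff. The function names (`fit_ensemble`, `fill_gap`, `gap_error_percent`) and all data values are hypothetical.

```python
import numpy as np

def fit_ensemble(target, neighbors, mask):
    """Fit one least-squares line per neighbor stream, using only days outside the gap."""
    models = []
    for q in neighbors:
        slope, intercept = np.polyfit(q[~mask], target[~mask], deg=1)
        models.append((slope, intercept))
    return models

def fill_gap(neighbors, models, mask):
    """Predict flow inside the gap as the mean of the per-neighbor predictions."""
    preds = [slope * q[mask] + intercept
             for q, (slope, intercept) in zip(neighbors, models)]
    return np.mean(preds, axis=0)

def gap_error_percent(target, neighbors, start, length):
    """Hide `length` days starting at `start`, fill them, and return the error
    in total runoff over the gap as a percentage of the actual runoff."""
    mask = np.zeros(target.size, dtype=bool)
    mask[start:start + length] = True
    models = fit_ensemble(target, neighbors, mask)
    predicted = fill_gap(neighbors, models, mask)
    actual_total = target[mask].sum()
    return 100.0 * (predicted.sum() - actual_total) / actual_total

# Synthetic example (assumed data, not HBEF records): three neighbor streams
# that share a common signal with the target, plus multiplicative noise.
rng = np.random.default_rng(0)
n_days = 365
base = rng.lognormal(mean=0.0, sigma=1.0, size=n_days)
neighbors = np.array([base * scale * rng.lognormal(0.0, 0.1, n_days)
                      for scale in (0.8, 1.0, 1.2)])
target = base * rng.lognormal(0.0, 0.1, n_days)

# Repeat the experiment for many randomly placed 7-day gaps.
starts = rng.integers(0, n_days - 7, size=200)
errors = [gap_error_percent(target, neighbors, int(s), 7) for s in starts]
print(f"mean absolute error: {np.mean(np.abs(errors)):.1f}% of runoff in the gap")
```

Repeating the loop over several gap lengths, and over gaps placed in low-flow and high-flow periods, would give the kind of error distribution from which an annual uncertainty could be derived. Any other gap-filling model could be substituted for `fit_ensemble` and `fill_gap`.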
Award ID(s):
1637685
PAR ID:
10214752
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
PeerJ
Volume:
8
ISSN:
2167-8359
Page Range / eLocation ID:
e9531
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hydrologic connectivity refers to the processes and thresholds leading to water transport across a landscape. In dryland ecosystems, runoff production is mediated by the arrangement of vegetation and bare soil patches on hillslopes and the properties of ephemeral channels. In this study, we used runoff measurements at multiple scales in a small (4.67 ha) mixed shrubland catchment of the Chihuahuan Desert to identify controls on and thresholds of hillslope‐channel connectivity. By relating short‐ and long‐term hydrologic records, we also addressed whether observed changes in outlet discharge since 1977 were linked to modifications in hydrologic connectivity. Hillslope runoff production was controlled by the maximum rainfall intensity occurring in a 30‐min interval (I30), with small‐to‐negligible effects of antecedent surface soil moisture, vegetation cover, or slope aspect. An I30 threshold of nearly 10 mm/h activated runoff propagation from the shrubland hillslopes and through the main ephemeral channel, whereas an I30 threshold of about 16 mm/h was required for discharge from the catchment outlet. Since storms rarely exceed I30, full hillslope‐channel connectivity occurs infrequently in the mixed shrubland, leading to <2% of the annual precipitation being converted into outlet discharge. Progressive decreases in outlet discharge since 1977 could not be explained by variations in precipitation metrics, including I30, or the process of woody plant encroachment. Instead, channel modifications from the buildup of sediment behind measurement flumes may have increased transmission losses and reduced outlet discharge. Thus, alterations in channel properties can play an important role in the long‐term (45‐year) variations of rainfall–runoff dynamics of small desert catchments.
  2. Eddy covariance serves as one of the most effective techniques for long-term monitoring of ecosystem fluxes; however, long-term data integrations rely on complete time series, meaning that any gaps due to missing data must be reliably filled. To date, many gap-filling approaches have been proposed and extensively evaluated for mature and/or less actively managed ecosystems. Random forest regression (RFR) has been shown to be stable and perform better in these systems than alternative approaches, particularly when filling longer gaps. However, the performance of RFR gap filling remains less certain in more challenging ecosystems, e.g., actively managed agri-ecosystems and following recent land-use change due to management disturbances, ecosystems with relatively low fluxes due to low signal-to-noise ratios, or for trace gases other than carbon dioxide (e.g., methane). In an extension to earlier work on gap filling global carbon dioxide, water, and energy fluxes, we assess the RFR approach for gap filling methane fluxes globally. We then investigate a range of gap-filling methodologies for carbon dioxide, water, energy, and methane fluxes in challenging ecosystems, including European managed pastures, Southeast Asian converted peatlands, and North American drylands. Our findings indicate that RFR is a competent alternative to existing research standard gap-filling algorithms. The marginal distribution sampling (MDS) is still suggested for filling short (< 12 days) gaps in carbon dioxide fluxes, but RFR is better for filling longer (> 30 days) gaps in carbon dioxide fluxes and also for gap filling other fluxes (e.g., sensible heat, latent energy, and methane). In addition, using RFR with globally available reanalysis environmental drivers is effective when measured drivers are unavailable. Crucially, RFR was able to reliably fill cumulative fluxes for gaps > 3 months and, unlike other common approaches, key environment-flux responses were preserved in the gap-filled data.
  3. Long-term series of annual and seasonal water flow and major ions in the Pechora River were analyzed. Long-term phases of increased and decreased water flow were identified, ranging in duration from 11 to 49 years, and the major characteristics of these phases were determined. Changes in the sequence and boundaries of contrast phases in the annual and snowmelt spring–summer flood runoff were found to coincide. The difference between the mean seasonal water runoff during the phases of increased and decreased flow varied from 12 to 41%. The ion flow values of contrast phases typically differed by 9 to 36%, which is less than for water flow. This is due to the inverse dependence between ion concentrations and water discharge. Such peculiar negative feedback stabilizes the rates of chemical denudation in the river catchments to some extent and, thus, the discharge of major ions into seas, even during significant variations in water. 
  4. Stream fluxes are commonly reported without a complete accounting for uncertainty in the estimates, which makes it difficult to evaluate the significance of findings or to identify where to direct efforts to improve monitoring programs. At the Hubbard Brook Experimental Forest in the White Mountains of New Hampshire, USA, stream flow has been monitored continuously and solute concentrations have been sampled approximately weekly in small, gaged headwater streams since 1963, yet comprehensive uncertainty analyses have not been reported. We propagated uncertainty in the stage height–discharge relationship, watershed area, analytical chemistry, the concentration–discharge relationship used to interpolate solute concentrations, and the streamflow gap‐filling procedure to estimate uncertainty for both streamflow and solute fluxes for a recent 6‐year period (2013–2018) using a Monte Carlo approach. As a percentage of solute fluxes, uncertainty was highest for NH₄⁺ (34%), total dissolved nitrogen (8.8%), NO₃⁻ (8.1%), and K⁺ (7.4%), and lowest for dissolved organic carbon (3.7%), SO₄²⁻ (4.0%), and Mg²⁺ (4.4%). In units of flux, uncertainties were highest for solutes in highest concentration (Si, DOC, SO₄²⁻, and Na⁺) and lowest for those lowest in concentration (H⁺ and NH₄⁺). Laboratory analysis of solute concentration was a greater source of uncertainty than streamflow for solute flux, with the exception of DOC. Our results suggest that uncertainty in solute fluxes could be reduced with more precise measurements of solute concentrations. Additionally, more discharge measurements during high flows are needed to better characterize the stage‐discharge relationship. Quantifying uncertainty in streamflow and element export is important because it allows for determination of significance of differences in fluxes, which can be used to assess watershed response to disturbance and environmental change.
  5. Eddy covariance measurements are often subject to missing values, or gaps in the data record. Methods to fill short gaps are well established, but robustly filling gaps longer than a few weeks remains a challenge. Marginal Distribution Sampling (MDS) is a standard gap-filling method, but its effectiveness for long gaps (> 30 days) is limited. We compared the performance of a machine learning algorithm, eXtreme Gradient Boosting (XGB), against MDS, using various artificial scenarios of gap lengths and locations. We gap-filled half-hourly CO₂ flux from a temperate deciduous forest, Bartlett Experimental Forest, from 2010 to 2022. Whereas the standard implementation of MDS uses a narrowly prescribed set of predictor variables, with XGB we were able to include additional variables. The Green Chromatic Coordinate (GCC), derived from PhenoCam imagery, and diffuse photosynthetic photon flux density emerged as two of the three most important predictor variables. Compared to MDS, the root mean square error (RMSE) of XGB decreased by 9.5%, and the R² increased by 2.7% in a randomized 10-fold cross-validation test. XGB outperformed MDS for both day and night times across different seasons. However, annual NEE integrals varied across methods, with weaker annual net carbon uptake, by -110 ± 74 g C m⁻² yr⁻¹ for XGB compared to MDS (214 ± 11 g C m⁻² yr⁻¹). In artificial gap experiments, when trained using the 13-year data record, XGB reliably filled gaps, showing little change in RMSE for gaps up to 240 days. In contrast, the performance of MDS steadily decreased as gap lengths increased. MDS was unable to fill gaps longer than 2 months. In summary, XGB demonstrates excellent performance as an alternative method to MDS, providing reliable predictions for temperate deciduous forest carbon fluxes under different gap lengths and location scenarios. Implementation of XGB is facilitated by easy-to-use packages.