Title: A data-driven method for estimating the composition of end-members from stream water chemistry time series
Abstract. End-member mixing analysis (EMMA) is a method of interpreting stream water chemistry variations and is widely used for chemical hydrograph separation. It is based on the assumption that stream water is a conservative mixture of varying contributions from well-characterized source solutions (end-members). These end-members are typically identified by collecting samples of potential end-member source waters from within the watershed and comparing these to the observations. Here we introduce a complementary data-driven method (convex hull end-member mixing analysis; CHEMMA) to infer the end-member compositions and their associated uncertainties from the stream water observations alone. The method involves two steps. The first uses convex hull nonnegative matrix factorization (CH-NMF) to infer possible end-member compositions by searching for a simplex that optimally encloses the stream water observations. The second step uses constrained K-means clustering (COP-KMEANS) to classify the results from repeated applications of CH-NMF and analyzes the uncertainty associated with the algorithm. In an example application utilizing the 1986 to 1988 Panola Mountain Research Watershed dataset, CHEMMA robustly reproduces the three field-measured end-members found in previous research using only the stream water chemical observations. CHEMMA also suggests that a fourth and a fifth end-member can be (less robustly) identified. Using both real and synthetic data, we examine uncertainties in end-member identification arising from the non-uniqueness of the CH-NMF solutions, which is related to the data structure, and from the number of samples. The results suggest that the mixing space can be identified robustly when the dataset includes samples that contain extremely small contributions of one end-member; that is, samples containing extremely large contributions from one end-member are not necessary, although they do reduce uncertainty about the end-member composition.
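The conservative-mixing assumption stated in the abstract can be illustrated with a minimal sketch of classical EMMA, in which the end-members are already known. This is not the CHEMMA method itself (which infers the end-members from the stream data); it only shows the mass balance that both approaches rely on: each tracer concentration in a stream sample is a weighted sum of the end-member concentrations, and the weights (mixing fractions) sum to one. The helper names `solve_linear` and `mixing_fractions` and all concentration values are hypothetical and chosen for illustration only.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("end-members are not independent in tracer space")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def mixing_fractions(end_members, sample):
    """Return the fraction of each end-member in one stream sample.

    end_members: k lists, each holding k - 1 tracer concentrations.
    sample: k - 1 tracer concentrations measured in the stream.
    """
    k = len(end_members)
    if len(sample) != k - 1:
        raise ValueError("need exactly k - 1 tracers for k end-members")
    # One mass-balance row per tracer, plus a row forcing fractions to sum to 1.
    rows = [[em[t] for em in end_members] for t in range(k - 1)]
    rows.append([1.0] * k)
    return solve_linear(rows, list(sample) + [1.0])


# Hypothetical end-member compositions (two tracers, arbitrary units).
END_MEMBERS = [[10.0, 100.0], [50.0, 20.0], [90.0, 80.0]]
TRUE_FRACTIONS = [0.5, 0.3, 0.2]

# Build a synthetic stream sample as a conservative mixture, then invert it.
sample = [
    sum(f * em[t] for f, em in zip(TRUE_FRACTIONS, END_MEMBERS))
    for t in range(2)
]
fractions = mixing_fractions(END_MEMBERS, sample)
print(fractions)
```

A sample that lies outside the triangle spanned by the three end-members would yield a fraction below 0 or above 1, which is why the mixing space is described as a simplex that must enclose the observations.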
Award ID(s): 1654194
PAR ID: 10336617
Journal Name: Hydrology and Earth System Sciences
Volume: 26
Issue: 8
ISSN: 1607-7938
Page Range / eLocation ID: 1977 to 1991
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract Hydrologic connectivity is defined as the connection among stores of water within a watershed and controls the flux of water and solutes from the subsurface to the stream. Hydrologic connectivity is difficult to quantify because it is governed by heterogeneity in subsurface storage and permeability and responds to seasonal changes in precipitation inputs and subsurface moisture conditions. How interannual climate variability impacts hydrologic connectivity, and thus stream flow generation and chemistry, remains unclear. Using a rare, four‐year synoptic stream chemistry dataset, we evaluated shifts in stream chemistry and stream flow source of Coal Creek, a montane, headwater tributary of the Upper Colorado River. We leveraged compositional principal component analysis and end‐member mixing to evaluate how seasonal and interannual variation in subsurface moisture conditions impacts stream chemistry. Overall, three main findings emerged from this work. First, three geochemically distinct end members were identified that constrained stream flow chemistry: reach inflows, and quick and slow flow groundwater contributions. Reach inflows were impacted by historic base and precious metal mine inputs. Bedrock fractures facilitated much of the transport of quick flow groundwater and higher‐storage subsurface features (e.g., alluvial fans) facilitated the transport of slow flow groundwater. Second, the contributions of different end members to the stream changed over the summer. In early summer, stream flow was composed of all three end members, while in late summer, it was composed predominantly of reach inflows and slow flow groundwater. Finally, we observed minimal differences in proportional composition in stream chemistry across all four years, indicating seasonal variability in subsurface moisture and spatial heterogeneity in landscape and geologic features had a greater influence than interannual climate fluctuation on hydrologic connectivity and stream water chemistry. These findings indicate that mechanisms controlling solute transport (e.g., hydrologic connectivity and flow path activation) may be resilient (i.e., able to rebound after perturbations) to predicted increases in climate variability. By establishing a framework for assessing compositional stream chemistry across variable hydrologic and subsurface moisture conditions, our study offers a method to evaluate watershed biogeochemical resilience to variations in hydrometeorological conditions.
  2. Abstract End‐member mixing analysis (EMMA) is widely used to analyze geoscience data for their end‐members and mixing proportions. Many traditional EMMA methods depend on known end‐members, which are sometimes uncertain or unknown. Unsupervised EMMA methods infer end‐members from data, but many existing ones don't strictly follow necessary constraints and lack full mathematical interpretability. Here, we introduce a novel unsupervised machine learning method, simplex projected gradient descent‐archetypal analysis (SPGD‐AA), which uses the ML model archetypal analysis to infer end‐members intuitively and interpretably without prior knowledge. SPGD‐AA uses extreme corners in data as end‐members or “archetypes,” and represents data as mixtures of end‐members. This method is most suitable for linear (conservative) mixing problems when samples with similar characteristics to end‐members are present in data. Validation on synthetic and real data sets, including river chemistry, deep‐sea sediment elemental composition, and hyperspectral imaging, shows that SPGD‐AA effectively recovers end‐members consistent with domain expertise and outperforms conventional approaches. SPGD‐AA is applicable to a wide range of geoscience data sets and beyond. 
  3. Active learning is a valuable tool for efficiently exploring complex spaces, finding a variety of uses in materials science. However, the determination of convex hulls for phase diagrams does not neatly fit into traditional active learning approaches due to their global nature. Specifically, the thermodynamic stability of a material is not simply a function of its own energy, but rather requires energetic information from all other competing compositions and phases. Here we present Convex hull-aware Active Learning (CAL), a novel Bayesian algorithm that chooses experiments to minimize the uncertainty in the convex hull. CAL prioritizes compositions that are close to or on the hull, leaving significant uncertainty in other compositions that are quickly determined to be irrelevant to the convex hull. The convex hull can thus be predicted with significantly fewer observations than approaches that focus solely on energy. Intrinsic to this Bayesian approach is uncertainty quantification in both the convex hull and all subsequent predictions (e.g., stability and chemical potential). By providing increased search efficiency and uncertainty quantification, CAL can be readily incorporated into the emerging paradigm of uncertainty-based workflows for thermodynamic prediction. 
  4. Plant leaf waxes and their isotopic composition are important tracers of ecological, environmental, and climate variability, with strong preservation potential in sedimentary archives. However, they represent an integrated, and often complicated, signal of vegetation and hydrology within a watershed. Here, we report a new approach for examining complex mixtures of n-alkanes in sediments and their isotope values: non-negative matrix factorization (NMF). NMF identifies the endmembers in a mixture from the integrated n-alkane data and provides quantitative information on the relative importance of those endmembers across samples. We apply this approach to a synthetic dataset and two previously published datasets to illustrate its uses. Our application of NMF to re-analyse previously published data reveals new insights into past climate and ecological change. We demonstrate that NMF allows a user to 1) identify potential mixing problems, 2) evaluate which specific compounds in a mixture carry the isotope signal that can best address a given scientific objective, 3) determine compound concentrations after excluding contributions from particular endmember sources, and 4) calculate isotope values of different sources. NMF provides a quantitative approach for evaluating the influence of endmember mixing on molecular concentrations and isotope values within a dataset. The re-analysis of two published datasets reveals new quantitative insight into Holocene Arctic climate and Neogene vegetation change. 
  5. Abstract The shallow and deep hypothesis suggests that stream concentration‐discharge (CQ) relationships are shaped by distinct source waters from different depths. Under this hypothesis, baseflows are typically dominated by groundwater and mostly reflect groundwater chemistry, whereas high flows are typically dominated by shallow soil water and mostly reflect soil water chemistry. Aspects of this hypothesis draw on applications like end member mixing analyses and hydrograph separation, yet direct data support for the hypothesis remains scarce. This work tests the shallow and deep hypothesis using co‐located measurements of soil water, groundwater, and streamwater chemistry at two intensively monitored sites, the W‐9 catchment at Sleepers River (Vermont, United States) and the Hafren catchment at Plynlimon (Wales). At both sites, depth profiles of subsurface water chemistry and stream CQ relationships for the 10 solutes analyzed are broadly consistent with the hypothesis. Solutes that are more abundant at depth (e.g., calcium) exhibit dilution patterns (concentration decreases with increasing discharge). Conversely, solutes enriched in shallow soils (e.g., nitrate) generally exhibit flushing patterns (concentration increases with increasing discharge). The hypothesis may hold broadly true for catchments that share such biogeochemical stratifications in the subsurface. Soil water and groundwater chemistries were estimated from high‐ and low‐flow stream chemistries with average relative errors ranging from 24% to 82%. This indicates that streams mirror subsurface waters: stream chemistry can be used to infer scarcely measured subsurface water chemistry, especially where there are distinct shallow and deep end members. 
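The dilution and flushing patterns described in item 5 can be sketched numerically. A common way to summarize a concentration-discharge (CQ) relationship, assumed here and not taken from the abstract, is the slope of log concentration against log discharge: a negative slope indicates dilution and a positive slope indicates flushing. The function names, the tolerance of 0.1, the "chemostatic" label for near-zero slopes, and the synthetic data are all hypothetical.

```python
import math


def cq_slope(discharge, concentration):
    """Least-squares slope of log10(C) versus log10(Q)."""
    log_q = [math.log10(q) for q in discharge]
    log_c = [math.log10(c) for c in concentration]
    n = len(log_q)
    mean_q = sum(log_q) / n
    mean_c = sum(log_c) / n
    sxy = sum((q - mean_q) * (c - mean_c) for q, c in zip(log_q, log_c))
    sxx = sum((q - mean_q) ** 2 for q in log_q)
    return sxy / sxx


def classify_cq(slope, tol=0.1):
    """Label the CQ pattern; tol is an illustrative threshold, not a standard."""
    if slope < -tol:
        return "dilution"
    if slope > tol:
        return "flushing"
    return "chemostatic"


# Synthetic discharge series and two solutes following exact power laws.
discharge = [1.0, 3.0, 10.0, 30.0, 100.0]
deep_solute = [10.0 * q ** -0.5 for q in discharge]     # more abundant at depth
shallow_solute = [2.0 * q ** 0.3 for q in discharge]    # enriched in shallow soil

deep_slope = cq_slope(discharge, deep_solute)
shallow_slope = cq_slope(discharge, shallow_solute)
print(classify_cq(deep_slope), classify_cq(shallow_slope))
```

Real stream data are noisy, so the fitted slope would normally be reported with an uncertainty estimate before a pattern is assigned.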