Title: Source Relationships and Model Structures Determine Information Flow Paths in Ecohydrologic Models
Abstract: In a complex ecohydrologic system, vegetation and soil variables combine to dictate heat fluxes, and these fluxes may vary depending on the extent to which drivers are linearly or nonlinearly interrelated. From a modeling and causality perspective, uncertainty, sensitivity, and performance measures all relate to how information from different sources “flows” through a model to produce a target, or output. We address how model structure, broadly defined as a mapping from inputs to an output, combines with source dependencies to produce a range of information flow pathways from sources to a target. We apply information decomposition, which partitions reductions in uncertainty into synergistic, redundant, and unique information types, to a range of model cases. Toy models show that model structure and source dependencies both restrict the types of interactions that can arise between sources and targets. Regressions based on weather data illustrate how different model structures vary in their sensitivity to source dependencies, thus affecting predictive and functional performance. Finally, we compare the Surface Flux Equilibrium theory, a land‐surface model, and neural networks in estimating the Bowen ratio and find that models trade off information types particularly when sources have the highest and lowest dependencies. Overall, this study extends an information theory‐based model evaluation framework to incorporate the influence of source dependency on information pathways. This could be applied to explore behavioral ranges for both machine learning and process‐based models, and guide model development by highlighting model deficiencies based on information flow pathways that would not be apparent based on existing measures.
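The distinction between synergistic and redundant information can be illustrated with a minimal sketch. It is not the decomposition used in the paper. It computes only the interaction information I(X1,X2;Y) - I(X1;Y) - I(X2;Y), which under a partial information decomposition equals synergy minus redundancy. The helper `mutual_information`, the two toy models, and all variable names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(pairs):
    """Mutual information (bits) between a and b, given equally likely (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    p_a = Counter(a for a, _ in pairs)
    p_b = Counter(b for _, b in pairs)
    return sum(
        (c / n) * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
        for (a, b), c in joint.items()
    )


def interaction_information(samples):
    """I(X1,X2;Y) - I(X1;Y) - I(X2;Y) for (x1, x2, y) samples.

    Positive values indicate net synergy, negative values net redundancy.
    """
    i_x1 = mutual_information([(x1, y) for x1, _, y in samples])
    i_x2 = mutual_information([(x2, y) for _, x2, y in samples])
    i_joint = mutual_information([((x1, x2), y) for x1, x2, y in samples])
    return i_joint - i_x1 - i_x2


# Hypothetical toy model 1: independent binary sources, target y = x1 XOR x2.
# Neither source alone reduces uncertainty in y; together they determine it.
xor_samples = [(x1, x2, x1 ^ x2) for x1, x2 in product((0, 1), repeat=2)]
xor_interaction = interaction_information(xor_samples)

# Hypothetical toy model 2: fully dependent sources (x2 = x1), target y = x1.
# Each source alone already determines y, so the second adds nothing new.
copy_samples = [(x, x, x) for x in (0, 1)]
copy_interaction = interaction_information(copy_samples)

print(f"XOR, independent sources: {xor_interaction:+.1f} bits")
print(f"Copy, dependent sources:  {copy_interaction:+.1f} bits")
```

In this sketch the XOR case yields +1 bit (purely synergistic) and the copy case yields -1 bit (purely redundant), which shows how both the input-to-output mapping and the dependency between sources shape the kind of information a target receives. Separating unique, redundant, and synergistic components individually requires a full decomposition method, which this sketch does not attempt.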
Award ID(s):
2012850
PAR ID:
10375977
Author(s) / Creator(s):
 ;  
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Water Resources Research
Volume:
58
Issue:
9
ISSN:
0043-1397
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Fujiwara, Masami (Ed.)
    The migration timing of Pacific salmon in the Columbia River basin is subject to multiple influences related to climate, human water resource management, and lagged effects such as oceanic conditions. We apply an information theory-based approach to analyze drivers of adult Chinook salmon migration within the spring and fall spawning seasons and between years based on salmon counts at dams along the Columbia and Snake Rivers. Time-lagged mutual information and information decomposition measures, which characterize lagged and nonlinear dependencies as reductions in uncertainty, are used to detect interactions between salmon counts and lagged streamflows, air and water temperatures, precipitation, snowpack, climate indices and downstream salmon counts. At a daily timescale, these interdependencies reflect migration timing and show differences between fall and spring run salmon, while dependencies based on variables at an annual resolution reflect long-term predictability. We also highlight several types of joint dependencies where predictability of salmon counts depends on the knowledge of multiple lagged sources. This study illustrates how co-varying human and natural drivers could propagate to influence salmon migration timing or overall returns, and how nonlinear types of dependencies between variables enhance predictability of a target. This information-based framework is broadly applicable to assess driving factors in other types of complex water resources systems or species life cycles. 
  2. Ecohydrological models vary in their sensitivity to forcing data and use available information to different extents. We focus on the impact of forcing precision on ecohydrological model behavior particularly by quantizing, or binning, time-series forcing variables. We use rate-distortion theory to quantize time-series forcing variables to different precisions. We evaluate the effect of different combinations of quantized shortwave radiation, air temperature, vapor pressure deficit, and wind speed on simulated heat and carbon fluxes for a multi-layer canopy model, which is forced and validated with eddy covariance flux tower observation data. We find that the model is more sensitive to radiation than meteorological forcing input, but model responses also vary with seasonal conditions and different combinations of quantized inputs. While any level of quantization impacts carbon flux similarly, specific levels of quantization influence heat fluxes to different degrees. This study introduces a method to optimally simplify forcing time series, often without significantly decreasing model performance, and could be applied within a sensitivity analysis framework to better understand how models use available information. 
  3. Flow–ecology relationships are critical for developing and adaptively managing environmental flows. However, uncertainty often arises from data limitations and an incomplete understanding of the spatial and temporal attributes inherent to each relationship. Accounting for sources of uncertainty is critical given the mounting interest in implementing environmental flows at large scales, often with limited information. We used the South Fork Eel River watershed in northern California as a case study to demonstrate how data gaps and uncertainty in flow–ecology relationships may be better quantified. A rigorous literature review revealed that few flow–ecology relationships related directly to the flow regime, and none spanned the full range of hydrologic or geomorphic variability exhibited across the watershed. Identified data gaps informed several sensitivity analyses within a Bayesian network model which showed that the modeled ecological outcome differed by as much as 50% depending on the type and magnitude of uncertainty. This study presents a general regional framework for quantifying spatial and temporal data gaps that can be applied to other watersheds and information types to improve representation of uncertainty in flow–ecology relationships and to inform environmental flow design. 
  4. An interesting behavior in large language models (LLMs) is prompt sensitivity. When provided with different but semantically equivalent versions of the same prompt, models may produce very different distributions of answers. This suggests that the uncertainty reflected in a model's output distribution for one prompt may not reflect the model's uncertainty about the meaning of the prompt. We model prompt sensitivity as a type of generalization error, and show that sampling across the semantic concept space with paraphrasing perturbations improves uncertainty calibration without compromising accuracy. Additionally, we introduce a new metric for uncertainty decomposition in black-box LLMs that improves upon entropy-based decomposition by modeling semantic continuities in natural language generation. We show that this decomposition metric can be used to quantify how much LLM uncertainty is attributed to prompt sensitivity. Our work introduces a new way to improve uncertainty calibration in prompt-sensitive language models, and provides evidence that some LLMs fail to exhibit consistent general reasoning about the meanings of their inputs. 
  5. Global sensitivity analysis aims at quantifying and ranking the relative contribution of all the uncertain inputs of a mathematical model that impact the uncertainty in the output of that model, for any input-output mapping. Motivated by the limitations of the well-established Sobol' indices which are variance-based, there has been an interest in the development of non-moment-based global sensitivity metrics. This paper presents two complementary classes of metrics (one of which is a generalization of an already existing metric in the literature) which are based on the statistical distances between probability distributions rather than statistical moments. To alleviate the large computational cost associated with Monte Carlo sampling of the input-output model to estimate probability distributions, polynomial chaos surrogate models are proposed to be used. The surrogate models in conjunction with sparse quadrature-based rules, such as conjugate unscented transforms, permit efficient calculation of the proposed global sensitivity measures. Three benchmark sensitivity analysis examples are used to illustrate the proposed approach. 