Title: A Value‐Based Model Selection Approach for Environmental Random Variables
Abstract: Environmental decisions with substantial social and environmental implications are regularly informed by model predictions, incurring inevitable uncertainty. The selection of a set of model predictions to inform a decision is usually based on model performance, measured by goodness‐of‐fit metrics. Yet goodness‐of‐fit metrics have a questionable relationship to a model's value to end users, particularly when validation data are themselves uncertain. For example, decisions based on flow frequency models are not necessarily improved by adopting models with the best overall goodness of fit. We propose an alternative model evaluation approach based on the conditional value of sample information, first defined in 1961, which has found extensive use in sampling design optimization but which has not previously been used for model evaluation. The metric uses observations from a validation set to estimate the expected monetary costs associated with model prediction uncertainties. A model is only considered superior to alternatives if (i) its predictions reduce these costs and (ii) sufficient validation data are available to distinguish its performance from alternative models. By describing prediction uncertainties in monetary terms, the metric facilitates the communication of prediction uncertainty by end users, supporting the inclusion of uncertainty analysis in decision making.
Award ID(s):
1824951
PAR ID:
10379294
Author(s) / Creator(s):
 ;  
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Water Resources Research
Volume:
55
Issue:
1
ISSN:
0043-1397
Page Range / eLocation ID:
p. 270-283
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper advances machine learning (ML)-based streamflow prediction by strategically selecting rainfall events, introducing a new loss function, and addressing rainfall forecast uncertainties. Focusing on the Iowa River Basin, we applied the stochastic storm transposition (SST) method to create realistic rainfall events, which were input into a hydrological model to generate corresponding streamflow data for training and testing deterministic and probabilistic ML models. Long short-term memory (LSTM) networks were employed to predict streamflow up to 12 h ahead. An active learning approach was used to identify the most informative rainfall events, reducing data generation effort. Additionally, we introduced a novel asymmetric peak loss function to improve peak streamflow prediction accuracy. Incorporating rainfall forecast uncertainties, our probabilistic LSTM model provided uncertainty quantification for streamflow predictions. Performance evaluation using different metrics improved the accuracy and reliability of our models. These contributions enhance flood forecasting and decision-making while significantly reducing computational time and costs. 
  2. ¹³C‐Metabolic Flux Analysis (¹³C‐MFA) and Flux Balance Analysis (FBA) are widely used to investigate the operation of biochemical networks in both biological and biotechnological research. Both methods use metabolic reaction network models of metabolism operating at steady state so that reaction rates (fluxes) and the levels of metabolic intermediates are constrained to be invariant. They provide estimated (MFA) or predicted (FBA) values of the fluxes through the network in vivo, which cannot be measured directly. These fluxes can shed light on basic biology and have been successfully used to inform metabolic engineering strategies. Several approaches have been taken to test the reliability of estimates and predictions from constraint‐based methods and to compare alternative model architectures. Despite advances in other areas of the statistical evaluation of metabolic models, such as the quantification of flux estimate uncertainty, validation and model selection methods have been underappreciated and underexplored. We review the history and state‐of‐the‐art in constraint‐based metabolic model validation and model selection. Applications and limitations of the χ²‐test of goodness‐of‐fit, the most widely used quantitative validation and selection approach in ¹³C‐MFA, are discussed, and complementary and alternative forms of validation and selection are proposed. A combined model validation and selection framework for ¹³C‐MFA incorporating metabolite pool size information that leverages new developments in the field is presented and advocated for. Finally, we discuss how adopting robust validation and selection procedures can enhance confidence in constraint‐based modeling as a whole and ultimately facilitate more widespread use of FBA in biotechnology.
  3. Viale, R. (Ed.)
    Alternative-based approaches to decision making generate overall values for each option in a choice set by processing information within options before comparing options to arrive at a decision. By contrast, attribute-based approaches compare attributes (such as monetary cost and time delay to receipt of a reward) across options and use these attribute comparisons to make a decision. Because they compare attributes, they may not use all available information to make a choice, which categorizes many of them as heuristics. Attribute-based models can better predict choice compared to alternative-based models in some situations (e.g., when there are many options in the choice set, when calculating an overall value for an option is too cognitively taxing). Process data comparing alternative-based and attribute-based processing obtained from eye-tracking and mouse-tracking technology support these findings. Data on attribute-based models thus align with the notion of bounded rationality that people make use of heuristics to make good decisions when under time pressure, informational constraints, and computational constraints. Further study of attribute-based models and processing would enhance our understanding of how individuals process information and make decisions. 
  4. Radiogenomics uses machine-learning (ML) to directly connect the morphologic and physiological appearance of tumors on clinical imaging with underlying genomic features. Despite extensive growth in the area of radiogenomics across many cancers, and its potential role in advancing clinical decision making, no published studies have directly addressed uncertainty in these model predictions. We developed a radiogenomics ML model to quantify uncertainty using transductive Gaussian Processes (GP) and a unique dataset of 95 image-localized biopsies with spatially matched MRI from 25 untreated Glioblastoma (GBM) patients. The model generated predictions for regional EGFR amplification status (a common and important target in GBM) to resolve the intratumoral genetic heterogeneity across each individual tumor, a key factor for future personalized therapeutic paradigms. The model used probability distributions for each sample prediction to quantify uncertainty, and used transductive learning to reduce the overall uncertainty. We compared predictive accuracy and uncertainty of the transductive learning GP model against a standard GP model using leave-one-patient-out cross validation. Additionally, we used a separate dataset containing 24 image-localized biopsies from 7 high-grade glioma patients to validate the model. Predictive uncertainty informed the likelihood of achieving an accurate sample prediction. When stratifying predictions based on uncertainty, we observed substantially higher performance in the group cohort (75% accuracy, n = 95) and amongst sample predictions with the lowest uncertainty (83% accuracy, n = 72) compared to predictions with higher uncertainty (48% accuracy, n = 23), due largely to data interpolation (rather than extrapolation). On the separate validation set, our model achieved 78% accuracy amongst the sample predictions with lowest uncertainty.
    We present a novel approach to quantify radiogenomics uncertainty to enhance model performance and clinical interpretability. This should help integrate more reliable radiogenomics models for improved medical decision-making.
  5. Machine learning provides valuable information for data-driven decision-making. However, real-world problems commonly include uncertainties and the features needed to generate the prediction outputs are random variables. Even the most reliable machine learning models may not be helpful for decision-makers when the decisions must be taken before the values of features used in machine learning models are realized. To support decision-making under uncertainty, we propose a scenario generation procedure for stochastic programs that incorporates the uncertainties in both prediction features and the machine learning model prediction error. A statistical test is implemented to assess the reliability of the scenario sets by comparison with corresponding historical observations. We test the whole procedure in a case study for crop yield in the Midwest.