Title: Validation of scenario generation for decision-making using machine learning prediction models: A case study for crop yield
Machine learning provides valuable information for data-driven decision-making. However, real-world problems commonly include uncertainties, and the features needed to generate the prediction outputs are random variables. Even the most reliable machine learning models may not be helpful for decision-makers when the decisions must be taken before the values of features used in machine learning models are realized. To support decision-making under uncertainty, we propose a scenario generation procedure for stochastic programs that incorporates the uncertainties in both prediction features and the machine learning model prediction error. A statistical test is implemented to assess the reliability of the scenario sets by comparison with corresponding historical observations. We test the whole procedure in a case study for crop yield in the Midwest.
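The idea of combining feature uncertainty with model prediction error can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not state the feature distribution, the error model, or the prediction model, so the Gaussian feature distribution, the Gaussian residual, the equal scenario probabilities, and the names `generate_scenarios` and `toy_yield_model` are all assumptions made for illustration.

```python
import numpy as np


def generate_scenarios(predict, feature_mean, feature_cov, residual_std,
                       n_scenarios, seed=0):
    """Draw scenarios that reflect feature and prediction-error uncertainty.

    Assumptions (not taken from the paper): features are multivariate normal,
    prediction errors are zero-mean normal, scenarios are equally likely.
    """
    rng = np.random.default_rng(seed)
    # Step 1: sample the not-yet-realized features.
    features = rng.multivariate_normal(feature_mean, feature_cov,
                                       size=n_scenarios)
    # Step 2: push each sampled feature vector through the prediction model.
    predictions = predict(features)
    # Step 3: add a sampled prediction error to each model output.
    errors = rng.normal(0.0, residual_std, size=n_scenarios)
    outcomes = predictions + errors
    probabilities = np.full(n_scenarios, 1.0 / n_scenarios)
    return outcomes, probabilities


def toy_yield_model(features):
    """Hypothetical linear stand-in for a trained machine learning model."""
    return 150.0 + features @ np.array([2.0, -1.5])


scenarios, probs = generate_scenarios(
    toy_yield_model,
    feature_mean=[0.0, 0.0],
    feature_cov=np.eye(2),
    residual_std=5.0,
    n_scenarios=1000,
)
```

The resulting `scenarios` and `probs` arrays are the kind of input a stochastic program would consume. Checking such a scenario set against historical observations, as the abstract describes, would be a separate step that is not sketched here.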
Award ID(s):
1828942
PAR ID:
10443575
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Optimization Letters
ISSN:
1862-4472
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Mitigating the adverse impacts caused by increasing flood risks in urban coastal communities requires effective flood prediction for prompt action. Typically, physics‐based 1‐D pipe/2‐D overland flow models are used to simulate urban pluvial flooding. Because these models require significant computational resources and have long run times, they are often unsuitable for real‐time flood prediction at a street scale. This study explores the potential of a machine learning method, Random Forest (RF), to serve as a surrogate model for urban flood predictions. The surrogate model was trained to relate topographic and environmental features to hourly water depths simulated by a high‐resolution 1‐D/2‐D physics‐based model at 16,914 road segments in the coastal city of Norfolk, Virginia, USA. Two training scenarios for the RF model were explored: (i) training on only the most flood‐prone street segments in the study area and (ii) training on all 16,914 street segments in the study area. The RF model yielded high predictive skill, especially for the scenario when the model was trained on only the most flood‐prone streets. The results also showed that the surrogate model reduced the computational run time of the physics‐based model by a factor of 3,000, making real‐time decision support more feasible compared to using the full physics‐based model. We concluded that machine learning surrogate models strategically trained on high‐resolution and high‐fidelity physics‐based models have the potential to significantly advance the ability to support decision making in real‐time flood management within urban communities. 
  2. Conventional computational models of climate adaptation frameworks inadequately consider decision-makers’ capacity to learn, update, and improve decisions. Here, we investigate the potential of reinforcement learning (RL), a machine learning technique that efficaciously acquires knowledge from the environment and systematically optimizes dynamic decisions, in modeling and informing adaptive climate decision-making. We consider coastal flood risk mitigations for Manhattan, New York City, USA (NYC), illustrating the benefit of continuously incorporating observations of sea-level rise into systematic designs of adaptive strategies. We find that when designing adaptive seawalls to protect NYC, the RL-derived strategy significantly reduces the expected net cost by 6 to 36% under the moderate emissions scenario SSP2-4.5 (9 to 77% under the high emissions scenario SSP5-8.5), compared to conventional methods. When considering multiple adaptive policies, including accommodation and retreat as well as protection, the RL approach leads to a further 5% (15%) cost reduction, showing RL’s flexibility in coordinatively addressing complex policy design problems. RL also outperforms conventional methods in controlling tail risk (i.e., low probability, high impact outcomes) and in avoiding losses induced by misinformation about the climate state (e.g., deep uncertainty), demonstrating the importance of systematic learning and updating in addressing extremes and uncertainties related to climate adaptation.
  3. Abstract Machine learning (ML) has been applied to space weather problems with increasing frequency in recent years, driven by an influx of in-situ measurements and a desire to improve modeling and forecasting capabilities throughout the field. Space weather originates from solar perturbations and is comprised of the resulting complex variations they cause within the numerous systems between the Sun and Earth. These systems are often tightly coupled and not well understood. This creates a need for skillful models with knowledge about the confidence of their predictions. One example of such a dynamical system highly impacted by space weather is the thermosphere, the neutral region of Earth’s upper atmosphere. Our inability to forecast it has severe repercussions in the context of satellite drag and computation of probability of collision between two space objects in low Earth orbit (LEO) for decision making in space operations. Even with (assumed) perfect forecast of model drivers, our incomplete knowledge of the system results in often inaccurate thermospheric neutral mass density predictions. Continuing efforts are being made to improve model accuracy, but density models rarely provide estimates of confidence in predictions. In this work, we propose two techniques to develop nonlinear ML regression models to predict thermospheric density while providing robust and reliable uncertainty estimates: Monte Carlo (MC) dropout and direct prediction of the probability distribution, both using the negative logarithm of predictive density (NLPD) loss function. We show the performance capabilities for models trained on both local and global datasets. We show that the NLPD loss provides similar results for both techniques but the direct probability distribution prediction method has a much lower computational cost. 
For the global model regressed on the Space Environment Technologies High Accuracy Satellite Drag Model (HASDM) density database, we achieve errors of approximately 11% on independent test data with well-calibrated uncertainty estimates. Using an in-situ CHAllenging Minisatellite Payload (CHAMP) density dataset, models developed using both techniques provide test error on the order of 13%. The CHAMP models—on validation and test data—are within 2% of perfect calibration for the twenty prediction intervals tested. We show that this model can also be used to obtain global density predictions with uncertainties at a given epoch. 
  4. Abstract Robust quantification of predictive uncertainty is a critical addition needed for machine learning applied to weather and climate problems to improve the understanding of what is driving prediction sensitivity. Ensembles of machine learning models provide predictive uncertainty estimates in a conceptually simple way but require multiple models for training and prediction, increasing computational cost and latency. Parametric deep learning can estimate uncertainty with one model by predicting the parameters of a probability distribution but does not account for epistemic uncertainty. Evidential deep learning, a technique that extends parametric deep learning to higher-order distributions, can account for both aleatoric and epistemic uncertainties with one model. This study compares the uncertainty derived from evidential neural networks to that obtained from ensembles. Through applications of the classification of winter precipitation type and regression of surface-layer fluxes, we show evidential deep learning models attaining predictive accuracy rivaling standard methods while robustly quantifying both sources of uncertainty. We evaluate the uncertainty in terms of how well the predictions are calibrated and how well the uncertainty correlates with prediction error. Analyses of uncertainty in the context of the inputs reveal sensitivities to underlying meteorological processes, facilitating interpretation of the models. The conceptual simplicity, interpretability, and computational efficiency of evidential neural networks make them highly extensible, offering a promising approach for reliable and practical uncertainty quantification in Earth system science modeling. 
To encourage broader adoption of evidential deep learning, we have developed a new Python package, Machine Integration and Learning for Earth Systems (MILES) group Generalized Uncertainty for Earth System Science (GUESS) (MILES-GUESS) (https://github.com/ai2es/miles-guess), that enables users to train and evaluate both evidential and ensemble deep learning. Significance Statement: This study demonstrates a new technique, evidential deep learning, for robust and computationally efficient uncertainty quantification in modeling the Earth system. The method integrates probabilistic principles into deep neural networks, enabling the estimation of both aleatoric uncertainty from noisy data and epistemic uncertainty from model limitations using a single model. Our analyses reveal how decomposing these uncertainties provides valuable insights into reliability, accuracy, and model shortcomings. We show that the approach can rival standard methods in classification and regression tasks within atmospheric science while offering practical advantages such as computational efficiency. With further advances, evidential networks have the potential to enhance risk assessment and decision-making across meteorology by improving uncertainty quantification, a longstanding challenge. This work establishes a strong foundation and motivation for the broader adoption of evidential learning, where properly quantifying uncertainties is critical yet lacking.
  5. We introduce temporal multimodal multivariate learning, a new family of decision making models that can indirectly learn and transfer online information from simultaneous observations of a probability distribution with more than one peak or more than one outcome variable from one time stage to another. We approximate the posterior by sequentially removing additional uncertainties across different variables and time, based on data-physics driven correlation, to address a broader class of challenging time-dependent decision-making problems under uncertainty. Extensive experiments on real-world datasets (i.e., urban traffic data and hurricane ensemble forecasting data) demonstrate the superior performance of the proposed targeted decision-making over the state-of-the-art baseline prediction methods across various settings.