Title: Scalable Stochastic Programming with Bayesian Hybrid Models
Bayesian hybrid models (BHMs) fuse physics-based insights with machine learning constructs to correct for systematic bias. In this paper, we demonstrate a scalable computational strategy to embed BHMs in an equation-oriented modelling environment. Thus, this paper generalizes stochastic programming, which traditionally focuses on aleatoric uncertainty (as characterized by a probability distribution for uncertain model parameters), to also consider epistemic uncertainty, i.e., model-form uncertainty or systematic bias as modelled by the Gaussian process in the BHM. As an illustrative example, we consider ballistic firing using a BHM that includes a simplified glass-box (i.e., equation-oriented) model that neglects air resistance and a Gaussian process model to account for the systematic bias (i.e., epistemic or model-form uncertainty) induced by the model simplification. The gravity parameter and the GP hyperparameters are inferred from data in a Bayesian framework, yielding a posterior distribution. A novel single-stage stochastic program formulation using the posterior samples and Gaussian quadrature rules is proposed to compute the optimal decisions (e.g., firing angle and velocity) that minimize the expected value of an objective (e.g., distance from a stationary target). PySMO is used to generate expressions for the GP prediction mean and uncertainty in Pyomo, enabling efficient optimization with gradient-based solvers such as Ipopt. A scaling study characterizes the solver time and number of iterations for up to 2,000 samples from the posterior.
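To make the formulation concrete, below is a minimal Pyomo sketch of the single-stage stochastic program, assuming the no-air-resistance range model d = v^2 sin(2*theta)/g and illustrative posterior draws of gravity; the GP bias term and the PySMO-generated expressions are omitted, and all names and numbers are ours rather than the paper's.

```python
# Minimal sketch (assumptions ours, not the paper's code): choose firing
# angle and velocity to minimize the expected squared miss distance,
# averaging the no-air-resistance range model over posterior samples of
# gravity. The GP bias-correction term is omitted for brevity.
import numpy as np
import pyomo.environ as pyo

g_samples = np.random.normal(9.81, 0.05, size=50)  # stand-in posterior draws
d_target = 100.0                                   # target distance [m]

m = pyo.ConcreteModel()
m.theta = pyo.Var(bounds=(0.1, 1.4), initialize=0.7)  # firing angle [rad]
m.v = pyo.Var(bounds=(1.0, 60.0), initialize=35.0)    # muzzle velocity [m/s]

# Glass-box model: range d = v^2 sin(2*theta) / g; sample-average objective.
m.obj = pyo.Objective(
    expr=sum((m.v**2 * pyo.sin(2.0 * m.theta) / g - d_target)**2
             for g in g_samples) / len(g_samples),
    sense=pyo.minimize,
)

pyo.SolverFactory("ipopt").solve(m)
print(pyo.value(m.theta), pyo.value(m.v))
```

Each posterior sample contributes one term to the sample-average objective, so the expression size grows linearly with the number of samples, which is the axis the paper's scaling study stresses up to 2,000 samples.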
Award ID(s):
1941596
NSF-PAR ID:
10403694
Author(s) / Creator(s):
; ;
Editor(s):
Yamashita, Y.; Kano, M.
Date Published:
Journal Name:
Computer aided chemical engineering
Volume:
49
ISSN:
2543-1331
Page Range / eLocation ID:
1309-1314
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Bayesian hybrid models fuse physics-based insights with machine learning constructs to correct for systematic bias. In this paper, we compare Bayesian hybrid models against physics-based glass-box and Gaussian process black-box surrogate models. We consider ballistic firing as an illustrative case study for a Bayesian decision-making workflow. First, Bayesian calibration is performed to estimate model parameters. We then use the posterior distribution from Bayesian analysis to compute optimal firing conditions to hit a target via a single-stage stochastic program. The case study demonstrates the ability of Bayesian hybrid models to overcome systematic bias from missing physics with fewer data than the pure machine learning approach. Ultimately, we argue Bayesian hybrid models are an emerging paradigm for data-informed decision-making under parametric and epistemic uncertainty. 
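    In symbols, the single-stage stochastic program is a sample-average approximation over the posterior (notation ours, not the paper's):

    \[
    \min_{\theta,\,v}\ \mathbb{E}_{\psi \sim p(\psi \mid \mathcal{D})}\!\left[\big(d(\theta, v; \psi) - d^{\star}\big)^{2}\right]
    \;\approx\;
    \min_{\theta,\,v}\ \frac{1}{S}\sum_{s=1}^{S}\big(d(\theta, v; \psi^{(s)}) - d^{\star}\big)^{2},
    \]

    where \(\psi\) collects the calibrated parameters, \(d\) is the predicted landing distance, \(d^{\star}\) is the target distance, and \(\psi^{(1)},\dots,\psi^{(S)}\) are posterior samples.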
  2. Hybrid (i.e., grey-box) models are a powerful and flexible paradigm for predictive science and engineering. Grey-box models use data-driven constructs to incorporate unknown or computationally intractable phenomena into glass-box mechanistic models. The pioneering work of statisticians Kennedy and O’Hagan introduced a new paradigm to quantify epistemic (i.e., model-form) uncertainty. While popular in several engineering disciplines, prior work using Kennedy–O’Hagan hybrid models focuses on prediction with accurate uncertainty estimates. This work demonstrates computational strategies to deploy Bayesian hybrid models for optimization under uncertainty. Specifically, the posterior distributions of Bayesian hybrid models provide a principled uncertainty set for stochastic programming, chance-constrained optimization, or robust optimization. Through two illustrative case studies, we demonstrate the efficacy of hybrid models, composed of a structurally inadequate glass-box model and Gaussian process bias correction term, for decision-making using limited training data. From these case studies, we develop recommended best practices and explore the trade-offs between different hybrid model architectures. 
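    For reference, the Kennedy–O'Hagan structure invoked above models each observation as a glass-box prediction plus a GP discrepancy (standard notation, not taken from this paper):

    \[
    y(x_i) = \eta(x_i; \theta) + \delta(x_i) + \epsilon_i, \qquad
    \delta(\cdot) \sim \mathcal{GP}\big(0, k(\cdot,\cdot)\big), \quad
    \epsilon_i \sim \mathcal{N}(0, \sigma^{2}),
    \]

    where \(\eta\) is the structurally inadequate mechanistic model with parameters \(\theta\), and the GP term \(\delta\) absorbs the model-form bias.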
  3. Abstract

    Particle filters avoid parametric estimates for Bayesian posterior densities, which alleviates Gaussian assumptions in nonlinear regimes. These methods, however, are more sensitive to sampling errors than Gaussian-based techniques such as ensemble Kalman filters. A recent study by the authors introduced an iterative strategy for particle filters that matches posterior moments; the iterations improve the filter's ability to draw samples from non-Gaussian posterior densities. The iterations follow from a factorization of particle weights, providing a natural framework for combining particle filters with alternative filters to mitigate the impact of sampling errors. The current study introduces a novel approach to forming an adaptive hybrid data assimilation methodology, exploiting the theoretical strengths of nonparametric and parametric filters. At each data assimilation cycle, the iterative particle filter performs a sequence of updates while the prior sample distribution is non-Gaussian, then an ensemble Kalman filter provides the final adjustment when Gaussian distributions for marginal quantities are detected. The method employs the Shapiro–Wilk test, which has outstanding power for detecting departures from normality, to determine when to make the transition between filter algorithms. Experiments using low-dimensional models demonstrate that the approach has significant value, especially for nonhomogeneous observation networks and unknown model process errors. Moreover, hybrid factors are extended to consider marginals of more than one collocated variable using a test for multivariate normality. Findings from this study motivate the use of the proposed method for geophysical problems characterized by diverse observation networks and various dynamic instabilities, such as numerical weather prediction models.

    Significance Statement

    Data assimilation statistically processes observation errors and model forecast errors to provide optimal initial conditions for the forecast, playing a critical role in numerical weather forecasting. The ensemble Kalman filter, which has been widely adopted and developed in many operational centers, assumes Gaussianity of the prior distribution and solves a linear system of equations, leading to bias in strongly nonlinear regimes. On the other hand, particle filters avoid many of those assumptions but are sensitive to sampling errors and are computationally expensive. We propose an adaptive hybrid strategy that combines the advantages and minimizes the disadvantages of the two methods. The hybrid particle filter–ensemble Kalman filter is achieved with the Shapiro–Wilk test, which detects the Gaussianity of the ensemble members and determines the timing of the transition between these filter updates. Demonstrations in this study show that the proposed method is advantageous when observations are heterogeneous and when the model has an unknown bias. Furthermore, by extending the statistical hypothesis test to a test for multivariate normality, we consider marginals of more than one collocated variable. These results encourage further testing for real geophysical problems characterized by various dynamic instabilities, such as real numerical weather prediction models.
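    As a toy illustration of the gating idea, the sketch below applies the Shapiro–Wilk test to a scalar prior ensemble and picks one update per cycle; this simplifies the paper's scheme, which iterates particle-filter updates before a final EnKF adjustment, and all code and values are ours.

    ```python
    # Toy sketch (ours, not the authors' code): gate between a particle
    # filter and an EnKF update using the Shapiro-Wilk normality test.
    import numpy as np
    from scipy.stats import shapiro

    rng = np.random.default_rng(0)

    def enkf_update(ens, y, r):
        # Stochastic EnKF analysis for a scalar state; r is the obs variance.
        gain = np.var(ens) / (np.var(ens) + r)
        perturbed_obs = y + rng.normal(0.0, np.sqrt(r), ens.size)
        return ens + gain * (perturbed_obs - ens)

    def pf_update(ens, y, r):
        # Importance weights from the Gaussian likelihood, then resampling.
        w = np.exp(-0.5 * (y - ens) ** 2 / r)
        w /= w.sum()
        return rng.choice(ens, size=ens.size, p=w)

    def hybrid_update(ens, y, r, alpha=0.05):
        # Non-Gaussian prior (normality rejected) -> particle filter;
        # otherwise the ensemble Kalman filter provides the adjustment.
        _, p_value = shapiro(ens)
        return pf_update(ens, y, r) if p_value < alpha else enkf_update(ens, y, r)

    prior = rng.gamma(2.0, 1.0, size=200)  # skewed, non-Gaussian prior ensemble
    posterior = hybrid_update(prior, y=3.0, r=0.5)
    ```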
  4. SUMMARY

    The spatio-temporal properties of seismicity give us incisive insight into the stress state evolution and fault structures of the crust. Empirical models based on self-exciting point processes continue to provide an important tool for analysing seismicity, given the epistemic uncertainty associated with physical models. In particular, the epidemic-type aftershock sequence (ETAS) model acts as a reference model for studying seismicity catalogues. The traditional ETAS model uses simple parametric definitions for the background rate of triggering-independent seismicity. This reduces the effectiveness of the basic ETAS model in modelling the temporally complex seismicity patterns seen in seismic swarms that are dominated by aseismic tectonic processes such as fluid injection rather than aftershock triggering. In order to robustly capture time-varying seismicity rates, we introduce a deep Gaussian process (GP) formulation for the background rate as an extension to ETAS. GPs are a robust non-parametric model for function spaces with covariance structure. By conditioning the length-scale structure of a GP with another GP, we have a deep-GP: a probabilistic, hierarchical model that automatically tunes its structure to match data constraints. We show how the deep-GP-ETAS model can be efficiently sampled by making use of a Metropolis-within-Gibbs scheme, taking advantage of the branching process formulation of ETAS and a stochastic partial differential equation (SPDE) approximation for Matérn GPs. We illustrate our method using synthetic examples, and show that the deep-GP-ETAS model successfully captures multiscale temporal behaviour in the background forcing rate of seismicity. We then apply the results to two real-data catalogues: the Ridgecrest, CA 2019 July 5 Mw 7.1 event catalogue, showing that deep-GP-ETAS can successfully characterize a classical aftershock sequence; and the 2016–2019 Cahuilla, CA earthquake swarm, which shows two distinct phases of aseismic forcing concordant with a fluid injection-driven initial sequence, arrest of the fluid along a physical barrier and release following the largest Mw 4.4 event of the sequence.
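    In standard temporal-ETAS notation (ours, not necessarily the paper's), the conditional intensity being extended is

    \[
    \lambda(t \mid \mathcal{H}_t) = \mu(t) + \sum_{i:\, t_i < t} K\, e^{\alpha (m_i - M_0)} (t - t_i + c)^{-p},
    \]

    where the classical model takes the background rate \(\mu\) to be constant; here \(\mu(t)\) receives a deep-GP prior so that multiscale, time-varying aseismic forcing can be captured alongside the Omori–Utsu aftershock terms.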

     
  5. Abstract

    We present a Bayesian hierarchical space-time stochastic weather generator (BayGEN) to generate daily precipitation and minimum and maximum temperatures. BayGEN employs a hierarchical framework with data, process, and parameter layers. In the data layer, precipitation occurrence at each site is modeled using probit regression with a spatially distributed latent Gaussian process; precipitation amounts are modeled as gamma random variables; and minimum and maximum temperatures are modeled as realizations from Gaussian processes. The latent Gaussian process that drives the precipitation occurrence process is modeled in the process layer. In the parameter layer, the model parameters of the data and process layers are modeled as spatially distributed Gaussian processes, consequently enabling the simulation of daily weather at arbitrary (unobserved) locations or on a regular grid. All model parameters are endowed with weakly informative prior distributions. The No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo, is used to obtain posterior samples of each parameter. Posterior samples of the model parameters propagate uncertainty to the weather simulations, an important feature that makes BayGEN unique compared to traditional weather generators. We demonstrate the utility of BayGEN with an application to daily weather generation in a basin of the Argentine Pampas. Furthermore, we evaluate the implications for crop yield by driving a crop simulation model with weather simulations from BayGEN and an equivalent non-Bayesian weather generator.
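    Schematically, the data layer described above can be written as (notation ours, with \(s\) a site and \(t\) a day):

    \[
    o_t(s) \mid w_t(s) \sim \mathrm{Bernoulli}\big(\Phi(w_t(s))\big), \qquad
    r_t(s) \mid o_t(s) = 1 \sim \mathrm{Gamma}\big(\alpha(s), \beta(s)\big),
    \]

    with the latent field \(w\) modeled as a Gaussian process in the process layer, minimum and maximum temperatures as Gaussian-process realizations, and site-varying parameters such as \(\alpha(s)\) and \(\beta(s)\) themselves given spatial GP priors in the parameter layer.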

     