

Title: Volatility Based Kernels and Moving Average Means for Accurate Forecasting with Gaussian Processes
A broad class of stochastic volatility models is defined by systems of stochastic differential equations, and while these models have seen widespread success in domains such as finance and statistical climatology, they typically lack the ability to condition on historical data to produce a true posterior distribution. To address this fundamental limitation, we show how to re-cast a class of stochastic volatility models as a hierarchical Gaussian process (GP) model with specialized covariance functions. This GP model retains the inductive biases of the stochastic volatility model while providing the posterior predictive distribution given by GP inference. Within this framework, we take inspiration from well-studied domains to introduce a new class of models, Volt and Magpie, that significantly outperform baselines in stock and wind speed forecasting, and naturally extend to the multitask setting.
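The posterior conditioning the abstract refers to is the standard GP machinery. A minimal sketch of GP posterior inference with a generic squared-exponential kernel — the paper's volatility-based kernels and moving-average means are specialized choices not reproduced here:

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=0.2, variance=1.0):
    # Squared-exponential covariance; any PSD kernel fits this machinery.
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    # Condition the GP on noisy observations (x_train, y_train) to get
    # the posterior mean and covariance of f at x_test.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha           # posterior predictive mean
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v           # posterior predictive covariance
    return mean, cov

# Toy "historical data": one period of a sine wave.
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2 * np.pi * x)
mean, cov = gp_posterior(x, y, np.array([0.25, 0.75]))
```

The forecast at a new point comes with calibrated uncertainty (`cov`), which is exactly what a raw SDE-based volatility model cannot provide without the GP re-casting.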
Journal Name:
Proceedings of the 39th International Conference on Machine Learning
Sponsoring Org:
National Science Foundation
More Like This
  1. Yamashita, Y.; Kano, M. (Eds.)
    Bayesian hybrid models (BHMs) fuse physics-based insights with machine learning constructs to correct for systematic bias. In this paper, we demonstrate a scalable computational strategy to embed BHMs in an equation-oriented modelling environment. Thus, this paper generalizes stochastic programming, which traditionally focuses on aleatoric uncertainty (as characterized by a probability distribution for uncertain model parameters), to also consider epistemic uncertainty, i.e., model-form uncertainty or systematic bias as modelled by the Gaussian process in the BHM. As an illustrative example, we consider ballistic firing using a BHM that includes a simplified glass-box (i.e., equation-oriented) model that neglects air resistance and a Gaussian process model to account for systematic bias (i.e., epistemic or model-form uncertainty) induced by the model simplification. The gravity parameter and the GP hyperparameters are inferred from data in a Bayesian framework, yielding a posterior distribution. A novel single-stage stochastic program formulation using the posterior samples and Gaussian quadrature rules is proposed to compute the optimal decisions (e.g., firing angle and velocity) that minimize the expected value of an objective (e.g., distance from a stationary target). PySMO is used to generate expressions for the GP prediction mean and uncertainty in Pyomo, enabling efficient optimization with gradient-based solvers such as Ipopt. A scaling study characterizes the solver time and number of iterations for up to 2,000 samples from the posterior.
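The single-stage stochastic program above minimizes an expected objective over posterior samples of an uncertain parameter. A toy sketch of that idea for the ballistic example, with made-up numbers and a crude grid search standing in for the Pyomo/Ipopt formulation (the target distance, sample count, and velocity are illustrative assumptions):

```python
import numpy as np

def landing_distance(angle, velocity, g):
    # Drag-free projectile range: R = v^2 * sin(2*theta) / g.
    return velocity ** 2 * np.sin(2 * angle) / g

def expected_miss(angle, velocity, g_samples, target=100.0):
    # Sample-average approximation of E[|R - target|] over posterior
    # draws of the gravity parameter (quadrature in the paper).
    return np.mean(np.abs(landing_distance(angle, velocity, g_samples) - target))

rng = np.random.default_rng(0)
g_samples = rng.normal(9.81, 0.05, size=500)  # stand-in posterior draws

# Crude grid search over the firing-angle decision; the paper instead
# solves this with a gradient-based solver on PySMO-generated expressions.
angles = np.linspace(0.1, np.pi / 2 - 0.1, 50)
best_angle = min(angles, key=lambda a: expected_miss(a, 32.0, g_samples))
best_miss = expected_miss(best_angle, 32.0, g_samples)
```

The point of the formulation is that parameter uncertainty enters the decision problem only through the posterior samples, so any downstream objective can be averaged the same way.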
  2. Inference-based optimization via simulation, which substitutes Gaussian process (GP) learning for the structural properties exploited in mathematical programming, is a powerful paradigm that has been shown to be remarkably effective in problems of modest feasible-region size and decision-variable dimension. The limitation to “modest” problems is a result of the computational overhead and numerical challenges encountered in computing the GP conditional (posterior) distribution on each iteration. In this paper, we substantially expand the size of discrete-decision-variable optimization-via-simulation problems that can be attacked in this way by exploiting a particular GP—discrete Gaussian Markov random fields—and carefully tailored computational methods. The result is the rapid Gaussian Markov Improvement Algorithm (rGMIA), an algorithm that delivers both a global convergence guarantee and finite-sample optimality-gap inference for significantly larger problems. Between infrequent evaluations of the global conditional distribution, rGMIA applies the full power of GP learning to rapidly search smaller sets of promising feasible solutions that need not be spatially close. We carefully document the computational savings via complexity analysis and an extensive empirical study. Summary of Contribution: The broad topic of the paper is optimization via simulation, which means optimizing some performance measure of a system that may only be estimated by executing a stochastic, discrete-event simulation. Stochastic simulation is a core topic and method of operations research. The focus of this paper is on significantly speeding up the computations underlying an existing method that is based on Gaussian process learning, where the underlying Gaussian process is a discrete Gaussian Markov random field. This speed-up is accomplished by employing smart computational linear algebra, state-of-the-art algorithms, and a careful divide-and-conquer evaluation strategy. Problems of significantly greater size than any other existing algorithm with similar guarantees can solve are solved as illustrations.
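The computational advantage of a Gaussian Markov random field comes from its sparse precision matrix: conditioning only touches the blocks of Q involving the nodes of interest. A small dense sketch of that conditioning rule on a 5-node chain graph (the chain, values, and 0.5 diagonal shift are illustrative, not from the paper):

```python
import numpy as np

def gmrf_conditional(Q, mu, obs_idx, obs_val):
    # Condition a GMRF on observed nodes. With precision matrix Q, the
    # unobserved block A given observed block B is
    #   N( mu_A - Q_AA^{-1} Q_AB (x_B - mu_B),  Q_AA^{-1} ),
    # so only Q's (sparse) blocks are needed -- the structure rGMIA exploits.
    n = len(mu)
    free = np.setdiff1d(np.arange(n), obs_idx)
    Q_aa = Q[np.ix_(free, free)]
    Q_ab = Q[np.ix_(free, obs_idx)]
    rhs = Q_ab @ (obs_val - mu[obs_idx])
    cond_mean = mu[free] - np.linalg.solve(Q_aa, rhs)
    cond_cov = np.linalg.inv(Q_aa)
    return free, cond_mean, cond_cov

# Tridiagonal precision => Markov structure on a path graph.
n = 5
Q = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Q += 0.5 * np.eye(n)  # shift to keep Q positive definite
mu = np.zeros(n)
free, m, C = gmrf_conditional(Q, mu, np.array([0, 4]), np.array([1.0, -1.0]))
```

Observing the endpoints pulls the interior conditional means toward the observed values, with influence decaying along the chain; in rGMIA the same update is applied to small promising subsets between full global evaluations.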
  3. Secondary organic aerosol derived from isoprene epoxydiols (IEPOX-SOA) is thought to contribute the dominant fraction of total isoprene SOA, but the current volatility-based lumped SOA parameterizations are not appropriate to represent the reactive uptake of IEPOX onto acidified aerosols. A full explicit modeling of this chemistry is however computationally expensive owing to the many species and reactions tracked, which makes it difficult to include it in chemistry–climate models for long-term studies. Here we present three simplified parameterizations (version 1.0) for IEPOX-SOA simulation, based on an approximate analytical/fitting solution of the IEPOX-SOA yield and formation timescale. The yield and timescale can then be directly calculated using the global model fields of oxidants, NO, aerosol pH and other key properties, and dry deposition rates. The advantage of the proposed parameterizations is that they do not require the simulation of the intermediates while retaining the key physicochemical dependencies. We have implemented the new parameterizations into the GEOS-Chem v11-02-rc chemical transport model, which has two empirical treatments for isoprene SOA (the volatility-basis-set, VBS, approach and a fixed 3 % yield parameterization), and compared all of them to the case with detailed fully explicit chemistry. The best parameterization (PAR3) captures the global tropospheric burden of IEPOX-SOA and its spatiotemporal distribution (R2 = 0.94) vs. those simulated by the full chemistry, while being more computationally efficient (∼5 times faster), and accurately captures the response to changes in NOx and SO2 emissions. On the other hand, the constant 3 % yield that is now the default in GEOS-Chem deviates strongly (R2 = 0.66), as does the VBS (R2 = 0.47, 49 % underestimation), with neither parameterization capturing the response to emission changes. With the advent of new mass spectrometry instrumentation, many detailed SOA mechanisms are being developed, which will challenge global and especially climate models with their computational cost. The methods developed in this study can be applied to other SOA pathways, which can allow including accurate SOA simulations in climate and global modeling studies in the future.
  4. We present a maximum-margin sparse Gaussian process (MM-SGP) for active learning (AL) of classification models for multi-class problems. The proposed model makes novel extensions to a GP by integrating maximum-margin constraints into its learning process, aiming to further improve its predictive power while keeping its inherent capability for uncertainty quantification. The MM constraints ensure a small "effective size" of the model, which allows MM-SGP to provide good predictive performance by using limited "active" data samples, a critical property for AL. Furthermore, as a Gaussian process model, MM-SGP will output both the predicted class distribution and the predictive variance, both of which are essential for defining a sampling function effective at improving the decision boundaries of a large number of classes simultaneously. Finally, the sparse nature of MM-SGP ensures that it can be efficiently trained by solving a low-rank convex dual problem. Experiment results on both synthetic and real-world datasets show the effectiveness and efficiency of the proposed AL model.
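An AL sampling function of the kind described consumes exactly the two outputs named above: a class distribution and a predictive variance. A generic illustrative acquisition combining them (the additive score and `alpha` weight are assumptions, not the paper's sampling function):

```python
import numpy as np

def entropy(p):
    # Shannon entropy of each row of a class-probability matrix.
    return -np.sum(p * np.log(p + 1e-12), axis=1)

def select_queries(class_probs, pred_var, k=2, alpha=0.5):
    # Score each unlabeled sample by a weighted mix of class-distribution
    # entropy and predictive variance; query the top-k scorers.
    score = alpha * entropy(class_probs) + (1 - alpha) * pred_var
    return np.argsort(score)[-k:][::-1]

probs = np.array([[0.90, 0.05, 0.05],   # confident prediction
                  [0.34, 0.33, 0.33],   # ambiguous class distribution
                  [0.60, 0.30, 0.10]])  # moderately uncertain
var = np.array([0.1, 0.2, 0.9])         # model's predictive variance
picks = select_queries(probs, var, k=2)
```

The confident, low-variance sample is skipped; the ambiguous and high-variance ones are queried, which is how such a function targets many decision boundaries at once.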
  5. Gaussian processes (GPs) offer a flexible class of priors for nonparametric Bayesian regression, but popular GP posterior inference methods are typically prohibitively slow or lack desirable finite-data guarantees on quality. We develop a scalable approach to approximate GP regression, with finite-data guarantees on the accuracy of our pointwise posterior mean and variance estimates. Our main contribution is a novel objective for approximate inference in the nonparametric setting: the preconditioned Fisher (pF) divergence. We show that unlike the Kullback–Leibler divergence (used in variational inference), the pF divergence bounds the 2-Wasserstein distance, which in turn provides tight bounds on the pointwise error of mean and variance estimates. We demonstrate that, for sparse GP likelihood approximations, we can minimize the pF divergence efficiently. Our experiments show that optimizing the pF divergence has the same computational requirements as variational sparse GPs while providing comparable empirical performance—in addition to our novel finite-data quality guarantees.
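The link between the 2-Wasserstein distance and pointwise error is concrete for Gaussian marginals, where W2 has a closed form. A small sketch for the univariate case (the example means and standard deviations are made-up numbers):

```python
import numpy as np

def w2_gaussians(m1, s1, m2, s2):
    # Closed-form 2-Wasserstein distance between univariate Gaussians:
    #   W2^2 = (m1 - m2)^2 + (s1 - s2)^2,
    # so bounding W2 simultaneously bounds the mean error and the
    # standard-deviation error at that point.
    return np.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)

# Exact posterior marginal vs. an approximation at one test point.
exact_mean, exact_std = 1.0, 0.50
approx_mean, approx_std = 1.1, 0.45
err = w2_gaussians(exact_mean, exact_std, approx_mean, approx_std)
```

This is why a divergence that upper-bounds W2 (as the pF divergence does) yields the pointwise mean and variance guarantees claimed, whereas a small KL divergence does not.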