Title: Learning stochastic closures using ensemble Kalman inversion
Abstract Although the governing equations of many systems, when derived from first principles, may be viewed as known, it is often too expensive to numerically simulate all the interactions they describe. Therefore, researchers often seek simpler descriptions that describe complex phenomena without numerically resolving all the interacting components. Stochastic differential equations (SDEs) arise naturally as models in this context. The growth in data acquisition, both through experiment and through simulations, provides an opportunity for the systematic derivation of SDE models in many disciplines. However, inconsistencies between SDEs and real data at short time scales often cause problems when standard statistical methodology is applied to parameter estimation. The incompatibility between SDEs and real data can be addressed by deriving sufficient statistics from the time-series data and learning parameters of SDEs based on these. Here, we study sufficient statistics computed from time averages, an approach that we demonstrate to lead to sufficient statistics on a variety of problems and that has the secondary benefit of obviating the need to match trajectories. Following this approach, we formulate the fitting of SDEs to sufficient statistics from real data as an inverse problem and demonstrate that this inverse problem can be solved by using ensemble Kalman inversion. Furthermore, we create a framework for non-parametric learning of drift and diffusion terms by introducing hierarchical, refinable parameterizations of unknown functions, using Gaussian process regression. We demonstrate the proposed methodology for the fitting of SDE models, first in a simulation study with a noisy Lorenz ’63 model, and then in other applications, including dimension reduction in deterministic chaotic systems arising in the atmospheric sciences, large-scale pattern modeling in climate dynamics and simplified models for key observables arising in molecular dynamics. The results confirm that the proposed methodology provides a robust and systematic approach to fitting SDE models to real data.
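The idea of fitting an SDE to time-averaged statistics with ensemble Kalman inversion can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy scalar Ornstein-Uhlenbeck model dx = -theta x dt + sigma dW, two time-averaged statistics (the second moment and one lagged product), a perturbed-observation form of the update, and a hand-picked noise covariance. The names `time_averaged_stats` and `eki_update` are hypothetical, as are all numerical settings.

```python
import numpy as np


def time_averaged_stats(log_params, rng, n_steps=5000, dt=0.01, lag=50):
    """Toy forward map: simulate dx = -theta*x dt + sigma dW per ensemble member
    and return time averages of x^2 and x_t * x_{t+lag}, shape (J, 2)."""
    theta = np.exp(log_params[:, 0])  # log-parameterization keeps theta > 0
    sigma = np.exp(log_params[:, 1])
    # Exact one-step transition of the OU process (specific to this toy model).
    decay = np.exp(-theta * dt)
    step_std = sigma * np.sqrt(-np.expm1(-2.0 * theta * dt) / (2.0 * theta))
    x = np.zeros(len(theta))
    path = np.empty((n_steps, len(theta)))
    for n in range(n_steps):
        x = decay * x + step_std * rng.standard_normal(len(theta))
        path[n] = x
    second_moment = np.mean(path**2, axis=0)
    lagged = np.mean(path[:-lag] * path[lag:], axis=0)
    return np.column_stack([second_moment, lagged])


def eki_update(thetas, outputs, y, gamma, rng):
    """One derivative-free ensemble Kalman inversion step.

    thetas: (J, p) parameters; outputs: (J, d) forward-map values;
    y: (d,) data statistics; gamma: (d, d) assumed noise covariance."""
    n_members = thetas.shape[0]
    d_theta = thetas - thetas.mean(axis=0)
    d_out = outputs - outputs.mean(axis=0)
    c_tg = d_theta.T @ d_out / n_members  # parameter-output cross-covariance
    c_gg = d_out.T @ d_out / n_members    # output covariance
    # Each member sees the data plus an independent noise draw.
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), gamma, size=n_members)
    innovations = np.linalg.solve(c_gg + gamma, (y_pert - outputs).T)
    return thetas + (c_tg @ innovations).T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "data": statistics from a long run with theta = 1, sigma = 1.
    y = time_averaged_stats(np.log([[1.0, 1.0]]), rng, n_steps=50000)[0]
    gamma = np.diag((0.05 * np.abs(y) + 1e-3) ** 2)  # assumed noise level
    ensemble = rng.normal(0.0, 1.0, size=(40, 2))    # prior draws in log space
    for _ in range(5):
        outputs = time_averaged_stats(ensemble, rng)
        ensemble = eki_update(ensemble, outputs, y, gamma, rng)
    print("estimated (theta, sigma):", np.exp(ensemble.mean(axis=0)))
```

Because the update uses only ensemble covariances of forward-map outputs, no derivatives of the simulator are needed, and no simulated trajectory has to match the data pathwise; only the statistics are compared.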
Award ID(s):
1818977 1835860
NSF-PAR ID:
10338270
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Transactions of Mathematics and Its Applications
Volume:
5
Issue:
1
ISSN:
2398-4945
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Estimating and quantifying uncertainty in unknown system parameters from limited data remains a challenging inverse problem in a variety of real-world applications. While many approaches focus on estimating constant parameters, a subset of these problems includes time-varying parameters with unknown evolution models that often cannot be directly observed. This work develops a systematic particle filtering approach that reframes the idea behind artificial parameter evolution to estimate time-varying parameters in nonstationary inverse problems arising from deterministic dynamical systems. Focusing on systems modeled by ordinary differential equations, we present two particle filter algorithms for time-varying parameter estimation: one that relies on a fixed value for the noise variance of a parameter random walk; another that employs online estimation of the parameter evolution noise variance along with the time-varying parameter of interest. Several computed examples demonstrate the capability of the proposed algorithms in estimating time-varying parameters with different underlying functional forms and different relationships with the system states (i.e. additive vs. multiplicative). 
  2. Abstract We consider Bayesian inference for large-scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model. This renders most Markov chain Monte Carlo approaches infeasible, since they typically require $O(10^4)$ model runs, or more. Moreover, the forward model is often given as a black box or is impractical to differentiate. Therefore, derivative-free algorithms are highly desirable. We propose a framework, which is built on Kalman methodology, to efficiently perform Bayesian inference in such inverse problems. The basic method is based on an approximation of the filtering distribution of a novel mean-field dynamical system, into which the inverse problem is embedded as an observation operator. Theoretical properties are established for linear inverse problems, demonstrating that the desired Bayesian posterior is given by the steady state of the law of the filtering distribution of the mean-field dynamical system, and proving exponential convergence to it. This suggests that, for nonlinear problems which are close to Gaussian, sequentially computing this law provides the basis for efficient iterative methods to approximate the Bayesian posterior. Ensemble methods are applied to obtain interacting particle system approximations of the filtering distribution of the mean-field model; and practical strategies to further reduce the computational and memory cost of the methodology are presented, including low-rank approximation and a bi-fidelity approach. The effectiveness of the framework is demonstrated in several numerical experiments, including proof-of-concept linear/nonlinear examples and two large-scale applications: learning of permeability parameters in subsurface flow; and learning subgrid-scale parameters in a global climate model. Moreover, the stochastic ensemble Kalman filter and various ensemble square-root Kalman filters are all employed and are compared numerically. 
The results demonstrate that the proposed method, based on exponential convergence to the filtering distribution of a mean-field dynamical system, is competitive with pre-existing Kalman-based methods for inverse problems. 
  3. Abstract Quasars are bright and unobscured active galactic nuclei (AGN) thought to be powered by the accretion of matter around supermassive black holes at the centers of galaxies. The temporal variability of a quasar’s brightness contains valuable information about its physical properties. The UV/optical variability is thought to be a stochastic process, often represented as a damped random walk described by a stochastic differential equation (SDE). Upcoming wide-field telescopes such as the Rubin Observatory Legacy Survey of Space and Time (LSST) are expected to observe tens of millions of AGN in multiple filters over a ten year period, so there is a need for efficient and automated modeling techniques that can handle the large volume of data. Latent SDEs are machine learning models well suited for modeling quasar variability, as they can explicitly capture the underlying stochastic dynamics. In this work, we adapt latent SDEs to jointly reconstruct multivariate quasar light curves and infer their physical properties such as the black hole mass, inclination angle, and temperature slope. Our model is trained on realistic simulations of LSST ten year quasar light curves, and we demonstrate its ability to reconstruct quasar light curves even in the presence of long seasonal gaps and irregular sampling across different bands, outperforming a multioutput Gaussian process regression baseline. Our method has the potential to provide a deeper understanding of the physical properties of quasars and is applicable to a wide range of other multivariate time series with missing data and irregular sampling.
  4. Abstract Many scientific problems focus on observed patterns of change or on how to design a system to achieve particular dynamics. Those problems often require fitting differential equation models to target trajectories. Fitting such models can be difficult because each evaluation of the fit must calculate the distance between the model and target patterns at numerous points along a trajectory. The gradient of the fit with respect to the model parameters can be challenging to compute. Recent technical advances in automatic differentiation through numerical differential equation solvers potentially change the fitting process into a relatively easy problem, opening up new possibilities to study dynamics. However, application of the new tools to real data may fail to achieve a good fit. This article illustrates how to overcome a variety of common challenges, using the classic ecological data for oscillations in hare and lynx populations. Models include simple ordinary differential equations (ODEs) and neural ordinary differential equations (NODEs), which use artificial neural networks to estimate the derivatives of differential equation systems. Comparing the fits obtained with ODEs versus NODEs, representing small and large parameter spaces, and changing the number of variable dimensions provide insight into the geometry of the observed and model trajectories. To analyze the quality of the models for predicting future observations, a Bayesian‐inspired preconditioned stochastic gradient Langevin dynamics (pSGLD) calculation of the posterior distribution of predicted model trajectories clarifies the tendency for various models to underfit or overfit the data. Coupling fitted differential equation systems with pSGLD sampling provides a powerful way to study the properties of optimization surfaces, raising an analogy with mutation‐selection dynamics on fitness landscapes.
  5. Abstract Rare events arising in nonlinear atmospheric dynamics remain hard to predict and attribute. We address the problem of forecasting rare events in a prototypical example, sudden stratospheric warmings (SSWs). Approximately once every other winter, the boreal stratospheric polar vortex rapidly breaks down, shifting midlatitude surface weather patterns for months. We focus on two key quantities of interest: the probability of an SSW occurring, and the expected lead time if it does occur, as functions of initial condition. These optimal forecasts concretely measure the event’s progress. Direct numerical simulation can estimate them in principle but is prohibitively expensive in practice: each rare event requires a long integration to observe, and the cost of each integration grows with model complexity. We describe an alternative approach using integrations that are short compared to the time scale of the warming event. We compute the probability and lead time efficiently by solving equations involving the transition operator, which encodes all information about the dynamics. We relate these optimal forecasts to a small number of interpretable physical variables, suggesting optimal measurements for forecasting. We illustrate the methodology on a prototype SSW model developed by Holton and Mass and modified by stochastic forcing. While highly idealized, this model captures the essential nonlinear dynamics of SSWs and exhibits the key forecasting challenge: the dramatic separation in time scales between a single event and the return time between successive events. Our methodology is designed to fully exploit high-dimensional data from models and observations, and has the potential to identify detailed predictors of many complex rare events in meteorology. 