
Title: A surrogate-based approach to nonlinear, non-Gaussian joint state-parameter data assimilation

Many recent advances in sequential assimilation of data into nonlinear high-dimensional models are modifications to particle filters that employ efficient searches of a high-dimensional state space. In this work, we present a complementary strategy that combines statistical emulators and particle filters. The emulators learn the forward dynamic mapping and provide a computationally cheap approximation to it. This emulator-particle filter (Emu-PF) approach requires a modest number of forward-model runs, but yields well-resolved posterior distributions even in non-Gaussian cases. We explore several modifications to the Emu-PF that utilize mechanisms for dimension reduction to efficiently fit the statistical emulator, and present a series of simulation experiments on an atypical Lorenz-96 system to demonstrate their performance. We conclude with a discussion of how the Emu-PF can be paired with modern particle filtering algorithms.

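
To make the idea concrete, here is a minimal, self-contained sketch of an emulator-particle filter step under strong simplifying assumptions: a one-dimensional toy forward model, a Gaussian-process emulator from scikit-learn, and a plain bootstrap filter with Gaussian observation noise. None of these choices reflect the paper's actual configuration; the sketch only illustrates how a surrogate trained on a modest number of forward-model runs can stand in for the expensive dynamics inside the filter.

```python
# Minimal Emu-PF sketch: a Gaussian process surrogate, trained on a small
# design of forward-model runs, replaces the expensive dynamics inside a
# bootstrap particle filter. All modeling choices here are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def forward_model(x):
    """Expensive nonlinear forward map (toy stand-in)."""
    return np.sin(2.0 * x) + 0.5 * x

# 1) Fit the emulator on a modest number of forward-model runs.
X_train = np.linspace(-3, 3, 25).reshape(-1, 1)
y_train = forward_model(X_train).ravel()
emulator = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6)
emulator.fit(X_train, y_train)

# 2) Bootstrap particle filter that propagates particles with the emulator.
n_particles, obs_sd, model_sd = 500, 0.3, 0.2
particles = rng.normal(0.0, 1.0, n_particles)

def assimilate(particles, observation):
    # Propagate with the cheap GP mean instead of the expensive model.
    pred = emulator.predict(particles.reshape(-1, 1))
    particles = pred + model_sd * rng.standard_normal(n_particles)
    # Importance weights from a Gaussian likelihood, then resample.
    logw = -0.5 * ((observation - particles) / obs_sd) ** 2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(n_particles, size=n_particles, p=w)
    return particles[idx]

x_true = 0.8
for _ in range(10):
    x_true = forward_model(np.array([x_true]))[0] + model_sd * rng.standard_normal()
    y_obs = x_true + obs_sd * rng.standard_normal()
    particles = assimilate(particles, y_obs)
print("posterior mean:", particles.mean(), "truth:", x_true)
```
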
Award ID(s): 1821338
NSF-PAR ID: 10289563
Journal Name: Foundations of Data Science
ISSN: 2639-8001
Sponsoring Org: National Science Foundation
More Like this
  1. A statistical emulator can be used as a surrogate for complex physics-based calculations to drastically reduce computational cost. Its successful implementation hinges on an accurate representation of the nonlinear response surface over a high-dimensional input space. Conventional "space-filling" designs, including random sampling and Latin hypercube sampling, become inefficient as the dimensionality of the input variables increases, and the predictive accuracy of the emulator can degrade substantially for a test input distant from the training set. To address this fundamental challenge, we develop a reliable emulator for predicting complex functionals by active learning with error control (ALEC). The algorithm is applicable to infinite-dimensional mappings and yields high-fidelity predictions with controlled predictive error. Its computational efficiency has been demonstrated by emulating classical density functional theory (cDFT) calculations, a statistical-mechanical method widely used in modeling the equilibrium properties of complex molecular systems. We show that ALEC is much more accurate than conventional emulators based on Gaussian processes with "space-filling" designs and alternative active learning methods. In addition, it is computationally more efficient than direct cDFT calculations. ALEC can be a reliable building block for emulating expensive functionals owing to its minimal computational cost, controllable predictive error, and fully automatic features.
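
The active-learning loop described in item 1 can be illustrated with a short sketch: a Gaussian process is refit as the most uncertain candidate input is added, stopping once the maximum predictive standard deviation falls below a tolerance. The acquisition rule, tolerance, and toy functional below are illustrative assumptions, not ALEC's exact criterion or the cDFT application.

```python
# Sketch of GP active learning with error control in the spirit of ALEC:
# greedily add the candidate with the largest predictive uncertainty until
# the predicted error falls below a tolerance everywhere. Illustrative only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_functional(x):
    """Stand-in for a costly calculation such as a cDFT evaluation."""
    return np.sin(3 * x[:, 0]) * np.exp(-x[:, 1] ** 2)

rng = np.random.default_rng(1)
candidates = rng.uniform(-2, 2, size=(2000, 2))   # pool of possible inputs
X = candidates[:5].copy()                          # small initial design
y = expensive_functional(X)
tol = 0.01                                         # target predictive error

for step in range(200):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-8,
                                  normalize_y=True).fit(X, y)
    _, sd = gp.predict(candidates, return_std=True)
    if sd.max() < tol:                             # error controlled: stop
        break
    pick = candidates[np.argmax(sd)]               # most uncertain input
    X = np.vstack([X, pick])
    y = np.append(y, expensive_functional(pick[None, :]))

print(f"converged after {len(X)} model runs, max predicted sd = {sd.max():.4f}")
```
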
  2. Iterative ensemble filters and smoothers are now commonly used for geophysical models. Some of these methods rely on a factorization of the observation likelihood function to sample from a posterior density through a set of "tempered" transitions applied to ensemble members. For Gaussian-based data assimilation methods, tangent linear versions of nonlinear operators can be relinearized between iterations, leading to a solution that is less biased than a single-step approach. This study adopts similar iterative strategies for a localized particle filter (PF) that relies on the estimation of moments to adjust unobserved variables based on importance weights. The approach builds on a "regularization" of the local PF, which forces weights to be more uniform through heuristic means. The regularization then leads to an adaptive tempering, which can also be combined with filter updates from parametric methods, such as ensemble Kalman filters. The role of iterations is analyzed by deriving the localized posterior probability density assumed by current local PF formulations and then examining how single-step and tempered PFs sample from this density. In experiments with a low-dimensional nonlinear system, the iterative and hybrid strategies show the largest benefits in observation-sparse regimes, where only a few particles carry high likelihoods and prior errors are non-Gaussian. This regime mimics specific applications in numerical weather prediction, where small ensemble sizes, unresolved model error, and highly nonlinear dynamics lead to prior uncertainty that is larger than measurement uncertainty.

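As a generic illustration of the "tempered" transitions mentioned in item 2, the sketch below applies a likelihood in stages with exponents chosen adaptively from the effective sample size, resampling between stages. It deliberately omits the paper's localization and moment-based updates of unobserved variables; the jitter step is a crude stand-in for a proper rejuvenation move.

```python
# Generic adaptive likelihood tempering for a particle filter: the likelihood
# is applied in stages with exponents beta_k summing to 1, each beta chosen
# so the effective sample size (ESS) stays above a target, with resampling
# between stages. Illustrative only; no localization is implemented.
import numpy as np

rng = np.random.default_rng(2)

def tempered_update(particles, loglik_fn, ess_target=0.5):
    """Assimilate one observation through adaptive tempered stages."""
    n = len(particles)
    beta_done = 0.0
    while beta_done < 1.0 - 1e-9:
        ll = loglik_fn(particles)
        remaining = 1.0 - beta_done

        def ess(beta):
            w = np.exp(beta * (ll - ll.max()))
            w /= w.sum()
            return w, 1.0 / np.sum(w ** 2)

        w, e = ess(remaining)
        if e >= ess_target * n:
            step = remaining                  # full remaining step is safe
        else:
            lo, hi = 0.0, remaining           # bisect on the ESS constraint
            for _ in range(30):
                mid = 0.5 * (lo + hi)
                _, e_mid = ess(mid)
                lo, hi = (mid, hi) if e_mid >= ess_target * n else (lo, mid)
            step = max(lo, 1e-3)
            w, _ = ess(step)
        idx = rng.choice(n, size=n, p=w)      # resample with tempered weights
        # Jitter is a crude stand-in for a proper rejuvenation/MCMC move.
        particles = particles[idx] + 0.05 * rng.standard_normal(n)
        beta_done += step
    return particles

# Usage: prior particles, one observation y = 1.5 with obs sd 0.1.
particles = rng.normal(0.0, 2.0, 1000)
loglik_fn = lambda x: -0.5 * ((1.5 - x) / 0.1) ** 2
particles = tempered_update(particles, loglik_fn)
print("posterior mean ~", particles.mean())
```
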
  3. Ideally, probabilistic hazard assessments combine available knowledge about physical mechanisms of the hazard, data on past hazards, and any precursor information. Systematically assessing the probability of rare yet catastrophic hazards adds a layer of difficulty due to limited observational data. Via computer models, one can exercise potentially dangerous scenarios that may not have happened in the past but are probabilistically consistent with the aleatoric nature of previous volcanic behavior in the record. Traditional Monte Carlo-based methods to calculate such hazard probabilities suffer from two issues: they are computationally expensive, and they are static. In light of new information, such as newly available data, signs of unrest, or a new probabilistic analysis describing uncertainty about scenarios, the Monte Carlo calculation would need to be redone under the same computational constraints. Here we present an alternative approach utilizing statistical emulators that provides an efficient way to overcome the computational bottleneck of typical Monte Carlo approaches. Moreover, this approach is independent of the aleatoric scenario model, and yet can be applied rapidly to any scenario model, making it dynamic. We present and apply this emulator-based approach to create multiple probabilistic hazard maps for inundation of pyroclastic density currents in the Long Valley Volcanic Region. Further, we illustrate how this approach enables an exploration of the impact of epistemic uncertainties on these probabilistic hazard forecasts. In particular, we focus on the uncertainty in vent opening models and how that uncertainty, both aleatoric and epistemic, impacts the resulting probabilistic hazard maps of pyroclastic density current inundation.

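A schematic of the emulator-based hazard-mapping workflow in item 3 is sketched below: scenarios are drawn from an aleatoric scenario model (here, hypothetical vent-location and eruption-volume distributions), and a pre-fit emulator, represented by a placeholder analytic function, is queried to build an exceedance-probability map. Because the simulator is never re-run, swapping in a different vent-opening model only requires redrawing scenarios.

```python
# Emulator-based hazard map sketch: sample scenarios from an aleatoric model,
# query a cheap emulator for flow depth at each map point, and report the
# probability of exceeding a hazard threshold. All distributions and the
# analytic "emulator" are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(3)

def emulator_depth(vent_xy, volume, grid_xy):
    """Placeholder for a GP emulator of simulated flow depth (meters)."""
    d2 = np.sum((grid_xy - vent_xy) ** 2, axis=1)
    return volume * np.exp(-d2 / (2.0 * volume))

# Map grid and hazard threshold (e.g., depth > 0.1 m counts as inundated).
gx, gy = np.meshgrid(np.linspace(-5, 5, 50), np.linspace(-5, 5, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
threshold, n_scenarios = 0.1, 5000

# Aleatoric scenario model: vent-location and eruption-volume distributions.
vents = rng.normal(0.0, 1.5, size=(n_scenarios, 2))
volumes = rng.lognormal(mean=0.0, sigma=0.5, size=n_scenarios)

exceed = np.zeros(len(grid))
for vent, vol in zip(vents, volumes):
    exceed += emulator_depth(vent, vol, grid) > threshold
prob_map = (exceed / n_scenarios).reshape(gx.shape)
print("max exceedance probability:", prob_map.max())
```
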
  4. Statistical emulators are a key tool for rapidly producing probabilistic hazard analyses of geophysical processes. Given output data computed for a relatively small number of parameter inputs, an emulator interpolates the data, providing the expected value of the output at untried inputs and an estimate of the error at that point. In this work, we propose fitting Gaussian process emulators to the output of a volcanic ash transport model, Ash3d. Our goal is to use the emulator to predict the volcanic ash thickness simulated by Ash3d at a location of interest. Our approach is motivated by two challenges in fitting emulators: characterizing the input wind field, and capturing interactions between that wind field and variable grain sizes. We resolve these challenges by using physical knowledge of tephra dispersal. We propose new physically motivated variables as inputs and use normalized output as the response for fitting the emulator. Subsetting based on the initial conditions is also critical in our emulator construction. Simulation studies characterize the accuracy and efficiency of our emulator construction and also reveal its current limitations. Our work represents the first emulator construction for volcanic ash transport models that accounts for the simulated physical process.
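Item 4's fitting strategy, regressing normalized output on a few physically motivated inputs, might look roughly like the sketch below. The synthetic "Ash3d-like" response and the particular inputs used (downwind distance, crosswind offset, a settling-velocity proxy for grain size) are hypothetical placeholders, not the authors' actual variables.

```python
# Sketch of fitting a GP emulator to normalized ash-thickness output using
# physically motivated inputs. Synthetic data stands in for Ash3d runs.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(4)
n_runs = 200

# Physically motivated inputs instead of raw coordinates and raw wind fields.
downwind = rng.uniform(0, 100, n_runs)      # km along the mean wind (assumed)
crosswind = rng.uniform(-20, 20, n_runs)    # km across the mean wind (assumed)
settling = rng.uniform(0.1, 2.0, n_runs)    # settling-velocity proxy (assumed)
X = np.column_stack([downwind, crosswind, settling])

# Synthetic "simulated thickness"; real responses would come from Ash3d runs.
thickness = np.exp(-downwind * settling / 50.0 - (crosswind / 10.0) ** 2)

# Normalize the response (log scale here) before fitting the emulator.
y = np.log(thickness + 1e-12)
kernel = RBF(length_scale=[10.0, 5.0, 0.5]) + WhiteKernel(1e-4)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

mean, sd = gp.predict(np.array([[30.0, 2.0, 0.8]]), return_std=True)
print("predicted thickness:", np.exp(mean[0]), "+/- factor", np.exp(sd[0]))
```
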
  5. We discuss emulators from the ab initio symmetry-adapted no-core shell-model framework for studying the formation of alpha clustering and collective properties without effective charges. We present a new type of emulator, one that utilizes the eigenvector continuation technique but is based on the use of symplectic symmetry considerations. This is achieved by using physically relevant degrees of freedom, namely the symmetry-adapted basis, which exploits the almost perfect symplectic symmetry in nuclei. Specifically, we study excitation energies, point-proton root-mean-square radii, and electric quadrupole moments and transitions for 6Li and 12C. We show that the set of parameterizations of the chiral potential used to train the emulators has no significant effect on predictions of dominant nuclear features, such as shape and the associated symplectic symmetry, along with cluster formation, but it varies details that affect collective quadrupole moments, asymptotic normalization coefficients, and alpha partial widths by up to a factor of two. This makes these emulators important for further constraining the nuclear force for high-precision nuclear structure and reaction observables.
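Item 5's emulator builds on eigenvector continuation (EC). The following generic sketch, using random symmetric matrices rather than a symmetry-adapted nuclear basis, shows the core mechanics: exact eigenvectors computed at a few training parameter values span a small subspace in which H(theta) = H0 + theta*V is diagonalized via a generalized eigenvalue problem.

```python
# Generic eigenvector continuation: project H(theta) = H0 + theta * V onto
# the span of a few exact training eigenvectors and solve the resulting
# generalized eigenvalue problem. Random matrices stand in for a real basis.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(5)
dim = 400
H0 = rng.standard_normal((dim, dim)); H0 = 0.5 * (H0 + H0.T)
V = rng.standard_normal((dim, dim)); V = 0.5 * (V + V.T)

def ground_state(theta):
    vals, vecs = np.linalg.eigh(H0 + theta * V)
    return vals[0], vecs[:, 0]

# Training: exact ground-state eigenvectors at a few parameter values.
thetas_train = [0.0, 0.5, 1.0]
basis = np.column_stack([ground_state(t)[1] for t in thetas_train])

def ec_energy(theta):
    """Emulated ground-state energy from the projected eigenproblem."""
    H = H0 + theta * V
    H_small = basis.T @ H @ basis        # projected Hamiltonian
    N_small = basis.T @ basis            # overlap (training vectors not orthogonal)
    vals = eigh(H_small, N_small, eigvals_only=True)
    return vals[0]

theta_test = 0.75
print("EC estimate:", ec_energy(theta_test))
print("exact      :", ground_state(theta_test)[0])
```
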