Elastic continuum mechanical models are widely used to compute deformations due to pressure changes in buried cavities, such as magma reservoirs. In general, analytical models are fast but can be inaccurate as they do not correctly satisfy boundary conditions for many geometries, while numerical models are slow and may require specialized expertise and software. To overcome these limitations, we trained supervised machine learning emulators (model surrogates) based on parallel partial Gaussian processes which predict the output of a finite element numerical model with high fidelity but >1,000× greater computational efficiency. The emulators are based on generalized nondimensional forms of governing equations for finite non‐dipping spheroidal cavities in elastic halfspaces. Either cavity volume change or uniform pressure change boundary conditions can be specified, and the models predict both surface displacements and cavity (pore) compressibility. Because of their computational efficiency, using the emulators as numerical model surrogates can greatly accelerate data inversion algorithms such as those employing Bayesian Markov chain Monte Carlo sampling. The emulators also permit a comprehensive evaluation of how displacements and cavity compressibility vary with geometry and material properties, revealing the limitations of analytical models. Our open‐source emulator code can be utilized without finite element software, is suitable for a wide range of cavity geometries and depths, includes an estimate of uncertainties associated with emulation, and can be used to train new emulators for different source geometries.
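The abstract describes Gaussian-process emulators that stand in for a finite element model. A minimal sketch of the idea, using `scikit-learn` and an invented stand-in function in place of real FEM output (the input names, domain, and target function are illustrative assumptions, not the paper's setup):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical nondimensional inputs: cavity aspect ratio and depth/radius.
rng = np.random.default_rng(0)
X_train = rng.uniform([0.2, 1.5], [5.0, 10.0], size=(200, 2))

# Stand-in for an expensive FEM output (normalized surface displacement);
# a real emulator would be trained on actual finite element solutions.
def fem_surrogate_target(X):
    aspect, depth = X[:, 0], X[:, 1]
    return aspect / (aspect + depth**2)

y_train = fem_surrogate_target(X_train)

# A "parallel partial" scheme trains one GP per output component, in
# parallel; a single scalar output is shown here for brevity.
gp = GaussianProcessRegressor(
    kernel=ConstantKernel() * RBF(length_scale=[1.0, 1.0]),
    normalize_y=True,
)
gp.fit(X_train, y_train)

# Prediction is near-instant and comes with an emulation uncertainty.
X_test = np.array([[1.0, 4.0]])
mean, std = gp.predict(X_test, return_std=True)
```

Once trained, evaluating the GP costs microseconds versus minutes for a FEM solve, which is where the quoted >1,000× speedup in inversion workflows comes from.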
Statistical mechanics in climate emulation: Challenges and perspectives
Abstract Climate emulators are a powerful instrument for climate modeling, especially for reducing the computational load of simulating the spatiotemporal processes associated with climate systems. The most important emulators are statistical emulators trained on the output of an ensemble of simulations from various climate models. However, such emulators often fail to capture the "physics" of a system, which can be detrimental for unveiling the critical processes that lead to climate tipping points. Historically, statistical mechanics emerged as a tool for resolving physical constraints using statistics. We discuss how climate emulators rooted in statistical mechanics and machine learning can give rise to new climate models that are more reliable and require fewer observational and computational resources. Our goal is to stimulate discussion on how statistical climate emulators can be further improved with the help of statistical mechanics, which, in turn, may reignite the statistical community's interest in the statistical mechanics of complex systems.
- Award ID(s): 2102906
- PAR ID: 10403008
- Date Published:
- Journal Name: Environmental Data Science
- Volume: 1
- ISSN: 2634-4602
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.
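A deterministic regression baseline of the kind the abstract mentions can be sketched on toy data. The dimensions, names, and least-squares model below are illustrative assumptions standing in for the (much larger) real input/output vectors, not the ClimSim baselines themselves:

```python
import numpy as np

# Toy stand-in for multivariate input/output pairs; real ClimSim vectors
# are far higher-dimensional.
rng = np.random.default_rng(1)
n_samples, n_in, n_out = 1000, 8, 4
X = rng.normal(size=(n_samples, n_in))
W_true = rng.normal(size=(n_in, n_out))
Y = X @ W_true + 0.01 * rng.normal(size=(n_samples, n_out))

# Deterministic linear-regression baseline: least-squares fit, all outputs.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_pred = X @ W_hat

# Score skill with R^2 per output variable, a common choice for baselines.
ss_res = ((Y - Y_pred) ** 2).sum(axis=0)
ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
r2 = 1.0 - ss_res / ss_tot
```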
-
A statistical emulator can be used as a surrogate of complex physics-based calculations to drastically reduce the computational cost. Its successful implementation hinges on an accurate representation of the nonlinear response surface with a high-dimensional input space. Conventional "space-filling" designs, including random sampling and Latin hypercube sampling, become inefficient as the dimensionality of the input variables increases, and the predictive accuracy of the emulator can degrade substantially for a test input distant from the training input set. To address this fundamental challenge, we develop a reliable emulator for predicting complex functionals by active learning with error control (ALEC). The algorithm is applicable to infinite-dimensional mapping with high-fidelity predictions and a controlled predictive error. The computational efficiency has been demonstrated by emulating the classical density functional theory (cDFT) calculations, a statistical-mechanical method widely used in modeling the equilibrium properties of complex molecular systems. We show that ALEC is much more accurate than conventional emulators based on the Gaussian processes with "space-filling" designs and alternative active learning methods. In addition, it is computationally more efficient than direct cDFT calculations. ALEC can be a reliable building block for emulating expensive functionals owing to its minimal computational cost, controllable predictive error, and fully automatic features.
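The core loop of active learning with a predictive-error stopping rule can be sketched as follows. This is a generic uncertainty-driven acquisition loop under stated assumptions (a 1-D toy functional, a fixed tolerance, a plain GP), not the ALEC algorithm itself:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_functional(x):
    # Stand-in for a costly cDFT-style calculation (1-D for illustration).
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(2)
pool = np.linspace(0, 3, 300).reshape(-1, 1)          # candidate inputs
idx = list(rng.choice(len(pool), 5, replace=False))   # small initial design

tol = 0.05  # target on the predictive uncertainty ("error control")
for _ in range(40):
    X = pool[idx]
    y = expensive_functional(X).ravel()
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5),
                                  alpha=1e-8).fit(X, y)
    _, std = gp.predict(pool, return_std=True)
    worst = int(np.argmax(std))
    if std[worst] < tol:   # uncertainty controlled everywhere: stop querying
        break
    idx.append(worst)      # query the most uncertain candidate next
```

The design choice is that each expensive evaluation is spent where the emulator is least certain, so far fewer queries are needed than with a space-filling design of comparable accuracy.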
-
The BUQEYE collaboration (Bayesian Uncertainty Quantification: Errors in Your effective field theory) presents a pedagogical introduction to projection-based, reduced-order emulators for applications in low-energy nuclear physics. The term emulator refers here to a fast surrogate model capable of reliably approximating high-fidelity models. As the general tools employed by these emulators are not yet well-known in the nuclear physics community, we discuss variational and Galerkin projection methods, emphasize the benefits of offline-online decompositions, and explore how these concepts lead to emulators for bound and scattering systems that enable fast and accurate calculations using many different model parameter sets. We also point to future extensions and applications of these emulators for nuclear physics, guided by the mature field of model (order) reduction. All examples discussed here and more are available as interactive, open-source Python code so that practitioners can readily adapt projection-based emulators for their own work.
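The offline-online decomposition and Galerkin projection mentioned above can be illustrated on a toy parametrized linear problem (the matrices and parameter range below are invented for illustration; the BUQEYE material treats bound and scattering systems):

```python
import numpy as np

# Parametrized model A(theta) x = b with affine structure A = A0 + theta*A1.
n = 200
A0 = np.diag(2.0 + np.arange(n) / n)
A1 = np.diag(np.linspace(0.1, 1.0, n))
b = np.ones(n)

# Offline stage (expensive, done once): high-fidelity snapshot solves at a
# few training parameters, then a reduced basis from their SVD.
thetas_train = [0.1, 0.5, 1.0, 2.0]
snapshots = np.column_stack(
    [np.linalg.solve(A0 + t * A1, b) for t in thetas_train])
V, _, _ = np.linalg.svd(snapshots, full_matrices=False)

# Galerkin projection of each affine operator component onto the basis.
A0_r, A1_r, b_r = V.T @ A0 @ V, V.T @ A1 @ V, V.T @ b

# Online stage (cheap, done many times): tiny reduced solve at a new
# parameter, lifted back to the full space.
theta_new = 0.8
x_r = np.linalg.solve(A0_r + theta_new * A1_r, b_r)
x_emulated = V @ x_r
```

The online solve involves a 4×4 system instead of a 200×200 one, which is the mechanism behind the "fast and accurate calculations using many different model parameter sets" claim.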
-
Summary A variety of demographic statistical models exist for studying population dynamics when individuals can be tracked over time. In cases where data are missing due to imperfect detection of individuals, the associated measurement error can be accommodated under certain study designs (e.g. those that involve multiple surveys or replication). However, the interaction of the measurement error and the underlying dynamic process can complicate the implementation of statistical agent‐based models (ABMs) for population demography. In a Bayesian setting, traditional computational algorithms for fitting hierarchical demographic models can be prohibitively cumbersome to construct. Thus, we discuss a variety of approaches for fitting statistical ABMs to data and demonstrate how to use multi‐stage recursive Bayesian computing and statistical emulators to fit models in such a way that alleviates the need to have analytical knowledge of the ABM likelihood. Using two examples, a demographic model for survival and a compartment model for COVID‐19, we illustrate statistical procedures for implementing ABMs. The approaches we describe are intuitive and accessible for practitioners and can be parallelised easily for additional computational efficiency.
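The idea of fitting an ABM without an analytical likelihood can be sketched with a toy survival model: simulate the ABM on a parameter grid, emulate the simulated outcome, and sample with a pseudo-likelihood built on the emulator. Everything below (the survival ABM, the observed count, the Gaussian pseudo-likelihood) is an illustrative assumption, far simpler than the multi-stage recursive approach in the text:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy ABM: each of n_agents survives each step with probability phi.
def abm_survivors(phi, n_agents=500, n_steps=5):
    alive = np.ones(n_agents, dtype=bool)
    for _ in range(n_steps):
        alive &= rng.random(n_agents) < phi
    return alive.sum()

observed = 150  # hypothetical observed survivor count

# Emulator stage: mean simulated outcome on a parameter grid, replacing
# the (unavailable) analytical ABM likelihood.
grid = np.linspace(0.5, 0.95, 10)
sim_means = np.array(
    [np.mean([abm_survivors(p) for _ in range(20)]) for p in grid])

# Metropolis sampler with a Gaussian pseudo-likelihood around the
# emulated mean (scale 10.0 is an assumed measurement-error level).
def log_like(phi):
    mu = np.interp(phi, grid, sim_means)
    return -0.5 * ((observed - mu) / 10.0) ** 2

phi, chain = 0.7, []
for _ in range(2000):
    prop = phi + 0.02 * rng.normal()
    if 0.5 < prop < 0.95 and np.log(rng.random()) < log_like(prop) - log_like(phi):
        phi = prop
    chain.append(phi)
```

The sampler never evaluates the ABM likelihood directly; it only queries the cheap emulated mean, which is what makes this practical when the ABM itself is expensive or analytically intractable.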