

Title: Composite grid designs for adaptive computer experiments with fast inference
Summary: Experiments are often used to produce emulators of deterministic computer code. This article introduces composite grid experimental designs and a sequential method for building the designs for accurate emulation. Computational methods are developed that enable fast and exact Gaussian process inference even with large sample sizes. We demonstrate that the proposed approach can produce emulators that are orders of magnitude more accurate than current approximations at a comparable computational cost.
Award ID(s):
1953111
NSF-PAR ID:
10280207
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Biometrika
ISSN:
0006-3444
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A statistical emulator can be used as a surrogate of complex physics-based calculations to drastically reduce the computational cost. Its successful implementation hinges on an accurate representation of the nonlinear response surface with a high-dimensional input space. Conventional “space-filling” designs, including random sampling and Latin hypercube sampling, become inefficient as the dimensionality of the input variables increases, and the predictive accuracy of the emulator can degrade substantially for a test input distant from the training input set. To address this fundamental challenge, we develop a reliable emulator for predicting complex functionals by active learning with error control (ALEC). The algorithm is applicable to infinite-dimensional mapping with high-fidelity predictions and a controlled predictive error. The computational efficiency has been demonstrated by emulating the classical density functional theory (cDFT) calculations, a statistical-mechanical method widely used in modeling the equilibrium properties of complex molecular systems. We show that ALEC is much more accurate than conventional emulators based on the Gaussian processes with “space-filling” designs and alternative active learning methods. In addition, it is computationally more efficient than direct cDFT calculations. ALEC can be a reliable building block for emulating expensive functionals owing to its minimal computational cost, controllable predictive error, and fully automatic features. 
  2. Summary

    A variety of demographic statistical models exist for studying population dynamics when individuals can be tracked over time. In cases where data are missing due to imperfect detection of individuals, the associated measurement error can be accommodated under certain study designs (e.g. those that involve multiple surveys or replication). However, the interaction of the measurement error and the underlying dynamic process can complicate the implementation of statistical agent‐based models (ABMs) for population demography. In a Bayesian setting, traditional computational algorithms for fitting hierarchical demographic models can be prohibitively cumbersome to construct. Thus, we discuss a variety of approaches for fitting statistical ABMs to data and demonstrate how to use multi‐stage recursive Bayesian computing and statistical emulators to fit models in such a way that alleviates the need to have analytical knowledge of the ABM likelihood. Using two examples, a demographic model for survival and a compartment model for COVID‐19, we illustrate statistical procedures for implementing ABMs. The approaches we describe are intuitive and accessible for practitioners and can be parallelised easily for additional computational efficiency.

     
  3. Abstract

    The Antarctic ice sheet (AIS) will be a dominant contributor to global mean sea level rise in the 21st century but remains a major source of uncertainty. The Ice Sheet Model Intercomparison for CMIP6 (ISMIP6) is an ensemble of continental‐scale models for studying the evolution of the AIS and projecting its future contribution to sea level. Due to their complexity and computational cost, ISMIP6 simulations are sparse and generated infrequently. Emulators are smaller‐scale models that approximate ISMs and enable experimentation and exploration into the drivers of sea level change. We introduce a neural network (NN) emulator to approximate the ISMIP6 ensemble, using a variational Long Short‐Term Memory (LSTM) with Monte Carlo dropout to quantify single‐projection uncertainty. The proposed NN emulator is compared to a Gaussian Process (GP) emulator on four criteria: accuracy of point estimates and predictive distributions of individual model projections, approximation of the ensemble projections, and model training time. The NN predicts more accurately on single projections, with a mean absolute error of 0.46 mm Sea Level Equivalent (SLE) versus 0.73 mm SLE for the GP, and has more accurate uncertainty estimates. The NN emulator also better approximates the ensemble distribution of ISMIP6 model projections, with a Kullback‐Leibler divergence of 18.26 versus 199.14 for GP at the projection year 2100. The NN enables more accurate experimentation with a reduced runtime, offering a new tool for understanding the important role of regional precipitation, ice sheet drainage systems, and interannual and longer timescale dynamics.

     
  4. Abstract

    We construct accurate emulators for the projected and redshift space galaxy correlation functions and excess surface density as measured by galaxy–galaxy lensing, based on halo occupation distribution modeling. Using the complete Mira-Titan suite of 111 N-body simulations, our emulators vary over eight cosmological parameters and include the effects of neutrino mass and dynamical dark energy. We demonstrate that our emulators are sufficiently accurate for the analysis of the Baryon Oscillation Spectroscopic Survey DR12 CMASS galaxy sample over the range 0.5 ≤ r ≤ 50 h⁻¹ Mpc. Furthermore, we show that our emulators are capable of recovering unbiased cosmological constraints from realistic mock catalogs over the same range. Our mock catalog tests show the efficacy of combining small-scale galaxy–galaxy lensing with redshift space clustering and that we can constrain the growth rate and σ₈ to 7% and 4.5%, respectively, for a CMASS-like sample using only the measurements covered by our emulator. With the inclusion of a cosmic microwave background prior on H₀, this reduces to a 2% measurement of the growth rate.

     
  5. Abstract Climate emulators are a powerful instrument for climate modeling, especially in terms of reducing the computational load for simulating spatiotemporal processes associated with climate systems. The most important type of emulators are statistical emulators trained on the output of an ensemble of simulations from various climate models. However, such emulators oftentimes fail to capture the “physics” of a system, which can be detrimental to unveiling critical processes that lead to climate tipping points. Historically, statistical mechanics emerged as a tool to resolve the constraints on physics using statistics. We discuss how climate emulators rooted in statistical mechanics and machine learning can give rise to new climate models that are more reliable and require fewer observational and computational resources. Our goal is to stimulate discussion on how statistical climate emulators can be further improved with the help of statistical mechanics, which, in turn, may reignite the interest of the statistical community in the statistical mechanics of complex systems.