Obtaining high certainty in predictive models is crucial for making informed and trustworthy decisions in many scientific and engineering domains. However, the extensive experimentation required to build accurate models can be both costly and time-consuming. This paper presents an adaptive sampling approach designed to reduce epistemic uncertainty in predictive models. Our primary contribution is the development of a metric that estimates potential epistemic uncertainty by leveraging prediction-interval-generation neural networks. This estimation relies on the distance between the predicted upper and lower bounds and the observed data at the tested positions and their neighboring points. Our second contribution is a batch sampling strategy based on Gaussian processes (GPs). A GP is used as a surrogate model of the networks trained at each iteration of the adaptive sampling process. Using this GP, we design an acquisition function that selects a combination of sampling locations to maximize the reduction of epistemic uncertainty across the domain. We test our approach on three unidimensional synthetic problems and a multi-dimensional dataset based on an agricultural field for selecting experimental fertilizer rates. The results demonstrate that our method consistently converges to minimum epistemic-uncertainty levels faster than Normalizing Flows Ensembles, MC-Dropout, and simple GPs.
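As a rough illustration of this kind of interval-based score (not the paper's exact formula), the sketch below combines how far neighboring observations fall outside a predicted interval with the interval width, weighted by distance to the query point. The function name, the k-nearest-neighbor weighting, and all parameters are assumptions for illustration.

```python
import numpy as np

def epistemic_uncertainty_score(x_query, x_obs, y_obs, lower, upper, k=5):
    """Hypothetical interval-based score: large when nearby observations fall
    outside (or far from) the interval predicted by the network, or when the
    predicted interval itself is wide."""
    # Distances from the query location to every observed location.
    d = np.linalg.norm(x_obs - x_query, axis=1)
    idx = np.argsort(d)[:k]                         # k nearest observed points
    # How far each neighboring observation lies outside its predicted interval.
    below = np.maximum(lower[idx] - y_obs[idx], 0.0)
    above = np.maximum(y_obs[idx] - upper[idx], 0.0)
    violation = below + above
    width = upper[idx] - lower[idx]                 # wide intervals also signal uncertainty
    weights = 1.0 / (d[idx] + 1e-8)                 # closer neighbors count more
    return float(np.average(violation + width, weights=weights))
```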
Dispersion‐enhanced sequential batch sampling for adaptive contour estimation
In computer simulation and optimal design, sequential batch sampling offers an appealing way to iteratively determine optimal sampling points based on existing selections and to efficiently construct surrogate models. Nonetheless, the issue of near duplicates poses a serious challenge for sequential learning: the critical points selected in a batch cluster together, so they are individually but not collectively informative for the optimal design. Near duplicates severely diminish computational efficiency because they contribute little additional information to the surrogate update. To address this issue, we impose a dispersion criterion on the concurrent selection of sampling points, which essentially forces a sparse distribution of critical points within each batch, and we demonstrate the effectiveness of this approach in adaptive contour estimation. Specifically, we adopt a Gaussian process surrogate to emulate the simulator, use the variance reduction of the critical region from new sampling points as the dispersion criterion, and combine it with a modified expected improvement (EI) function for critical batch selection. The critical region here is the proximity of the contour of interest. The proposed approach is validated on numerical examples: a two-dimensional four-branch function, a four-dimensional function with a disjoint contour of interest, and a time-delay dynamic system.
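A minimal sketch of the batch-selection idea follows. It assumes a scikit-learn GP surrogate, uses a contour-focused EI-style score, and substitutes the paper's variance-reduction dispersion criterion with a simpler minimum-distance rule between batch members; all function and parameter names are illustrative.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def select_contour_batch(X_train, y_train, X_cand, contour_level, batch_size, min_dist):
    # Fit the GP surrogate to the current design points.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X_train, y_train)
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-12)

    # Contour-focused EI-style score: large where the GP is uncertain
    # and its mean sits near the target contour level.
    score = sigma * norm.pdf((contour_level - mu) / sigma)

    selected = []
    for i in np.argsort(-score):                    # best-scoring candidates first
        # Dispersion rule (a stand-in for the variance-reduction criterion):
        # skip candidates that cluster near points already in the batch.
        if all(np.linalg.norm(X_cand[i] - X_cand[j]) >= min_dist for j in selected):
            selected.append(i)
        if len(selected) == batch_size:
            break
    return X_cand[selected]
```

In each adaptive iteration, the selected batch would be evaluated on the simulator and appended to the training set before refitting the surrogate.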
- Award ID(s):
- 2119334
- PAR ID:
- 10548185
- Publisher / Repository:
- Wiley
- Date Published:
- Journal Name:
- Quality and Reliability Engineering International
- Volume:
- 40
- Issue:
- 1
- ISSN:
- 0748-8017
- Page Range / eLocation ID:
- 131 to 144
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Bayesian optimization (BO) has well-documented merits for optimizing black-box functions with an expensive evaluation cost. Such functions emerge in applications as diverse as hyperparameter tuning, drug discovery, and robotics. BO hinges on a Bayesian surrogate model to sequentially select query points so as to balance exploration with exploitation of the search space. Most existing works rely on a single Gaussian process (GP) based surrogate model, where the kernel function form is typically preselected using domain knowledge. To bypass such a design process, this paper leverages an ensemble (E) of GPs to adaptively select the surrogate model fit on-the-fly, yielding a GP mixture posterior with enhanced expressiveness for the sought function. Acquisition of the next evaluation input using this EGP-based function posterior is then enabled by Thompson sampling (TS) that requires no additional design parameters. To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model. The novel EGP-TS readily accommodates parallel operation. To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret for both sequential and parallel settings. Tests on synthetic functions and real-world applications showcase the merits of the proposed method.
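For intuition only, here is a hedged sketch of one acquisition step in the spirit of ensemble-GP Thompson sampling, using scikit-learn GPs: kernels are weighted by their log marginal likelihood, one model is sampled, and a posterior draw over a finite candidate set plays the role of the function sample (the paper instead uses random-feature approximations for scalability). The kernel choices and names are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

def egp_ts_step(X_train, y_train, X_cand, rng):
    # Fit an ensemble of GPs with different kernel hypotheses.
    kernels = [RBF(length_scale=1.0), Matern(nu=1.5), Matern(nu=2.5)]
    gps = [GaussianProcessRegressor(kernel=k, normalize_y=True).fit(X_train, y_train)
           for k in kernels]
    log_ev = np.array([gp.log_marginal_likelihood_value_ for gp in gps])

    # Ensemble weights from the (log) model evidence.
    w = np.exp(log_ev - log_ev.max())
    w /= w.sum()

    # Thompson sampling: pick a model, draw a function sample, take its maximizer.
    gp = gps[rng.choice(len(gps), p=w)]
    f_draw = gp.sample_y(X_cand, random_state=int(rng.integers(1 << 31))).ravel()
    return X_cand[np.argmax(f_draw)]
```

A new evaluation at the returned point would then be appended to the training data before the next acquisition step.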
-
In this study, we carry out robust optimal design for the machining operations, one key process in wafer polishing in chip manufacturing, aiming to avoid the peculiar regenerative chatter and maximize the material removal rate (MRR) considering the inherent material and process uncertainty. More specifically, we characterize the cutting tool dynamics using a delay differential equation (DDE) and enlist the temporal finite element method (TFEM) to derive its approximate solution and stability index given process settings or design variables. To further quantify the inherent uncertainty, replications of TFEM under different realizations of random uncontrollable variables are performed, which however incurs extra computational burden. To eschew the deployment of such a crude Monte Carlo (MC) approach at each design setting, we integrate the stochastic TFEM with a stochastic surrogate model, stochastic kriging, in an active learning framework to sequentially approximate the stability boundary. The numerical result suggests that the nominal stability boundary attained from this method is on par with that from the crude MC, but only demands a fraction of the computational overhead. To further ensure the robustness of process stability, we adopt another surrogate, the Gaussian process, to predict the variance of the stability index at unexplored design points and identify the robust stability boundary per the conditional value at risk (CVaR) criterion. Therefrom, an optimal design in the robust stable region that maximizes the MRR can be identified.
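As a toy illustration of the CVaR screening step (not the paper's stochastic-kriging pipeline), the snippet below treats the GP-predicted stability index at each design point as approximately normal and keeps only designs whose upper-tail CVaR stays below the stability limit; the risk level, limit, and names are assumed for illustration.

```python
import numpy as np
from scipy.stats import norm

def robust_stable_mask(mu, sigma, limit, alpha=0.95):
    """Flag design points as robustly stable when the upper-tail CVaR of the
    (approximately normal) stability index stays below the stability limit.
    mu, sigma: GP-predicted mean and standard deviation of the stability index."""
    z = norm.ppf(alpha)
    # Closed-form upper-tail CVaR of a normal random variable.
    cvar = mu + sigma * norm.pdf(z) / (1.0 - alpha)
    return cvar <= limit   # True where the design is deemed robustly stable
```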
-
The Rapid Gaussian Markov Improvement Algorithm (rGMIA) solves discrete optimization via simulation problems by using a Gaussian Markov random field and complete expected improvement as the sampling and stopping criterion. rGMIA has been created as a sequential sampling procedure run on a single processor. In this paper, we extend rGMIA to a parallel computing environment when q+1 solutions can be simulated in parallel. To this end, we introduce the q-point complete expected improvement criterion to determine a batch of q+1 solutions to simulate. This new criterion is implemented in a new object-oriented rGMIA package.
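The sketch below conveys the flavor of such a q-point batch rule: score every candidate by the expected positive gap to the current best under a joint Gaussian posterior, then simulate the incumbent plus the q best-scoring candidates in parallel. It is not the rGMIA implementation; the posterior inputs, names, and minimization convention are assumptions.

```python
import numpy as np
from scipy.stats import norm

def q_point_batch(mu, cov, best_idx, q):
    """Select q+1 solutions (incumbent plus q candidates) for parallel simulation.
    mu: posterior mean of each solution; cov: joint posterior covariance."""
    d_mean = mu[best_idx] - mu                        # expected improvement over the best (minimization)
    d_var = cov[best_idx, best_idx] + np.diag(cov) - 2.0 * cov[best_idx]
    d_std = np.sqrt(np.maximum(d_var, 1e-12))
    z = d_mean / d_std
    score = d_mean * norm.cdf(z) + d_std * norm.pdf(z)  # complete-EI-style score
    score[best_idx] = -np.inf                           # incumbent is always included
    top_q = np.argsort(-score)[:q]
    return np.concatenate(([best_idx], top_q))          # q+1 solutions to simulate in parallel
```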
-
We present a consensus-based framework that unifies phase space exploration with posterior-residual-based adaptive sampling for surrogate construction in high-dimensional energy landscapes. Unlike standard approximation tasks where sampling points can be freely queried, systems with complex energy landscapes such as molecular dynamics (MD) do not have direct access to arbitrary sampling regions due to the physical constraints and energy barriers; the surrogate construction further relies on the dynamical exploration of phase space, posing a significant numerical challenge. We formulate the problem as a minimax optimization that jointly adapts both the surrogate approximation and residual-enhanced sampling. The construction of free energy surfaces (FESs) for high-dimensional collective variables (CVs) of MD systems is used as a motivating example to illustrate the essential idea. Specifically, the maximization step establishes a stochastic interacting particle system to impose adaptive sampling through both exploitation of a Laplace approximation of the max-residual region and exploration of uncharted phase space via temperature control. The minimization step updates the FES surrogate with the new sample set. Numerical results demonstrate the effectiveness of the present approach for biomolecular systems with up to 30 CVs. While we focus on the FES construction, the developed framework is general for efficient surrogate construction for complex systems with high-dimensional energy landscapes.
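Purely as a sketch of the maximization step described above (illustrative, not the paper's scheme), the following function moves a set of particles toward a residual-weighted consensus point, so that large-residual regions are exploited, and adds temperature-scaled noise for exploration; all parameter names and scalings are assumptions.

```python
import numpy as np

def consensus_sampling_step(particles, residual_fn, temperature, drift=0.5, noise=0.1, rng=None):
    """One consensus-style update of the sampling particles.
    particles: array of shape (n, d); residual_fn returns the surrogate residual per particle."""
    rng = np.random.default_rng() if rng is None else rng
    r = residual_fn(particles)                         # surrogate residual at each particle
    w = np.exp((r - r.max()) / temperature)            # soft-max weights favor large residuals
    w /= w.sum()
    consensus = (w[:, None] * particles).sum(axis=0)   # weighted "max-residual" location
    step = drift * (consensus - particles)             # drift toward the consensus point
    diffusion = noise * np.sqrt(temperature) * rng.standard_normal(particles.shape)
    return particles + step + diffusion                # new candidate sampling points
```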