Abstract Bayesian optimization (BO) is an indispensable tool to optimize objective functions that either do not have known functional forms or are expensive to evaluate. Currently, optimal experimental design is always conducted within the workflow of BO leading to more efficient exploration of the design space compared to traditional strategies. This can have a significant impact on modern scientific discovery, in particular autonomous materials discovery, which can be viewed as an optimization problem aimed at looking for the maximum (or minimum) point for the desired materials properties. The performance of BO-based experimental design depends not only on the adopted acquisition function but also on the surrogate models that help to approximate underlying objective functions. In this paper, we propose a fully autonomous experimental design framework that uses more adaptive and flexible Bayesian surrogate models in a BO procedure, namely Bayesian multivariate adaptive regression splines and Bayesian additive regression trees. They can overcome the weaknesses of widely used Gaussian process-based methods when faced with relatively high-dimensional design space or non-smooth patterns of objective functions. Both simulation studies and real-world materials science case studies demonstrate their enhanced search efficiency and robustness.
Chemically-informed data-driven optimization (ChIDDO): leveraging physical models and Bayesian learning to accelerate chemical research
Current methods of finding optimal experimental conditions, Edisonian systematic searches, often inefficiently evaluate suboptimal design points and require fine resolution to identify near optimal conditions. For expensive experimental campaigns or those with large design spaces, the shortcomings of the status quo approaches are more significant. Here, we extend Bayesian optimization (BO) and introduce a chemically-informed data-driven optimization (ChIDDO) approach. This approach uses inexpensive and low-fidelity information obtained from physical models of chemical processes and subsequently combines it with expensive and high-fidelity experimental data to optimize a common objective function. Using common optimization benchmark objective functions, we describe scenarios in which the ChIDDO algorithm outperforms the traditional BO approach, and then implement the algorithm on a simulated electrochemical engineering optimization problem.
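The idea of combining cheap physical-model output with expensive experimental data can be illustrated with a small sketch. This is not the ChIDDO algorithm as published; it is a minimal illustration, assuming a one-dimensional design space, a Gaussian process surrogate with a fixed RBF kernel, and an upper-confidence-bound acquisition. The functions `experiment` and `physical_model`, the noise variances, and the parameter `kappa` are all hypothetical choices made for the example.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, noise_var, x_query, length=0.2):
    """Zero-mean GP posterior mean and std with per-point noise variances."""
    K = rbf(x_train, x_train, length) + np.diag(noise_var)
    Ks = rbf(x_query, x_train, length)
    mean = Ks @ np.linalg.solve(K, y_train)
    v = np.linalg.solve(K, Ks.T)
    var = 1.0 - np.sum(Ks * v.T, axis=1)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def experiment(x):
    """Hypothetical expensive, high-fidelity objective."""
    return np.sin(3.0 * x) + 0.5 * x

def physical_model(x):
    """Hypothetical cheap, biased approximation of the experiment."""
    return np.sin(3.0 * x)

def next_point(x_exp, y_exp, grid, n_model=15, kappa=2.0):
    """Pick the next experiment from a surrogate fit to both data sources."""
    x_lf = np.linspace(0.0, 1.0, n_model)
    x_all = np.concatenate([x_exp, x_lf])
    y_all = np.concatenate([y_exp, physical_model(x_lf)])
    # Assumption: trust experiments (small noise) far more than the model.
    noise = np.concatenate([np.full(len(x_exp), 1e-4),
                            np.full(n_model, 0.25)])
    mean, std = gp_posterior(x_all, y_all, noise, grid)
    return grid[np.argmax(mean + kappa * std)]

grid = np.linspace(0.0, 1.0, 101)
x_exp = np.array([0.0, 1.0])          # two initial experiments
y_exp = experiment(x_exp)
initial_best = y_exp.max()

for _ in range(8):                     # eight sequential experiments
    x_new = next_point(x_exp, y_exp, grid)
    x_exp = np.append(x_exp, x_new)
    y_exp = np.append(y_exp, experiment(x_new))
```

Giving the physical-model points a larger noise variance is one simple way to let experimental data override the model wherever the two disagree; other weighting schemes are equally plausible.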
- PAR ID: 10319789
- Date Published:
- Journal Name: Reaction Chemistry & Engineering
- ISSN: 2058-9883
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Bayesian optimization (BO) is a sequential optimization strategy that is increasingly employed in a wide range of areas including materials design. In real-world applications, acquiring high-fidelity (HF) data through physical experiments or HF simulations is the major cost component of BO. To alleviate this bottleneck, multi-fidelity (MF) methods are increasingly used to forgo the sole reliance on expensive HF data and reduce sampling costs by querying inexpensive low-fidelity (LF) sources whose data are correlated with HF samples. Existing multi-fidelity BO (MFBO) methods operate under two assumptions: (1) the correlation between HF and LF sources is global rather than local, and (2) all data sources share the same noise process. These assumptions dramatically reduce the performance of MFBO when LF sources are only locally correlated with the HF source or when the noise variance varies across the data sources. To dispense with these incorrect assumptions, we propose an MF emulation method that (1) learns a noise model for each data source, and (2) enables BO to leverage highly biased LF sources which are only locally correlated with the HF source. We illustrate the performance of our method through analytical examples and engineering problems on materials design.
-
Abstract The design of materials and identification of optimal processing parameters constitute a complex and challenging task, necessitating efficient utilization of available data. Bayesian Optimization (BO) has gained popularity in materials design due to its ability to work with minimal data. However, many BO-based frameworks predominantly rely on statistical information, in the form of input-output data, and assume black-box objective functions. In practice, designers often possess knowledge of the underlying physical laws governing a material system, rendering the objective function not entirely black-box, as some information is partially observable. In this study, we propose a physics-informed BO approach that integrates physics-infused kernels to effectively leverage both statistical and physical information in the decision-making process. We demonstrate that this method significantly improves decision-making efficiency and enables more data-efficient BO. The applicability of this approach is showcased through the design of NiTi shape memory alloys, where the optimal processing parameters are identified to maximize the transformation temperature.
-
High-dimensional Bayesian optimization (BO) tasks such as molecular design often require > 10,000 function evaluations before obtaining meaningful results. While methods like sparse variational Gaussian processes (SVGPs) reduce computational requirements in these settings, the underlying approximations result in suboptimal data acquisitions that slow the progress of optimization. In this paper we modify SVGPs to better align with the goals of BO: targeting informed data acquisition rather than global posterior fidelity. Using the framework of utility-calibrated variational inference, we unify GP approximation and data acquisition into a joint optimization problem, thereby ensuring optimal decisions under a limited computational budget. Our approach can be used with any decision-theoretic acquisition function and is compatible with trust region methods like TuRBO. We derive efficient joint objectives for the expected improvement and knowledge gradient acquisition functions in both the standard and batch BO settings. Our approach outperforms standard SVGPs on high-dimensional benchmark tasks in control and molecular design.
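For readers unfamiliar with the expected improvement acquisition named above, it has a closed form when the surrogate's prediction at a point is Gaussian. The sketch below is the standard textbook formula for maximization, not the utility-calibrated joint objective derived in the paper; the function name and the `xi` margin parameter are illustrative assumptions.

```python
import math

def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form expected improvement (maximization) at one candidate.

    mu, sigma: posterior mean and standard deviation of the surrogate.
    best: best objective value observed so far.
    xi: optional exploration margin (hypothetical default of 0).
    """
    gain = mu - best - xi
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # Phi(z)
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # phi(z)
    return gain * cdf + sigma * pdf
```

A candidate whose mean sits below the current best can still score well if its predictive uncertainty is large, which is how this acquisition trades off exploitation against exploration.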
-
Bayesian optimization (BO) is a powerful paradigm for optimizing expensive black-box functions. Traditional BO methods typically rely on separate hand-crafted acquisition functions and surrogate models for the underlying function, and often operate in a myopic manner. In this paper, we propose a novel direct regret optimization approach that jointly learns the optimal model and non-myopic acquisition by distilling from a set of candidate models and acquisitions, and explicitly targets minimizing the multi-step regret. Our framework leverages an ensemble of Gaussian Processes (GPs) with varying hyperparameters to generate simulated BO trajectories, each guided by an acquisition function chosen from a pool of conventional choices, until a Bayesian early stop criterion is met. These simulated trajectories, capturing multi-step exploration strategies, are used to train an end-to-end decision transformer that directly learns to select next query points aimed at improving the ultimate objective. We further adopt a dense training–sparse learning paradigm: The decision transformer is trained offline with abundant simulated data sampled from ensemble GPs and acquisitions, while a limited number of real evaluations refine the GPs online. Experimental results on synthetic and real-world benchmarks suggest that our method consistently outperforms BO baselines, achieving lower simple regret and demonstrating more robust exploration in high-dimensional or noisy settings.

