Title: Batched Stochastic Bayesian Optimization via Combinatorial Constraints Design
In many high-throughput experimental design settings, such as those common in biochemical engineering, batched queries are often more cost-effective than one-by-one sequential queries. Furthermore, it is often not possible to directly choose items to query. Instead, the experimenter specifies a set of constraints that generates a library of possible items, which are then selected stochastically. Motivated by these considerations, we investigate Batched Stochastic Bayesian Optimization (BSBO), a novel Bayesian optimization scheme for choosing the constraints in order to guide exploration towards items with greater utility. We focus on site saturation mutagenesis, a prototypical setting of BSBO in biochemical engineering, and propose a natural objective function for this problem. Importantly, we show that our objective function can be efficiently decomposed as a difference of submodular functions (DS), which allows us to employ DS optimization tools to greedily identify sets of constraints that increase the likelihood of finding items with high utility. Our experimental results show that our algorithm outperforms common heuristics on synthetic data and on two real protein datasets.
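The setting described in the abstract, where constraints define a library and a batch is then drawn from it at random, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two-letter alphabet, the utility table, uniform sampling with replacement, the function names `library` and `expected_batch_max`, and the use of "expected best utility in the batch" as the score. It is not the paper's objective function and does not perform DS optimization.

```python
import itertools

def library(allowed):
    """Enumerate every sequence permitted by per-site sets of allowed residues."""
    return ["".join(p) for p in itertools.product(*[sorted(s) for s in allowed])]

def expected_batch_max(allowed, utility, batch_size):
    """Exact expected best utility in a batch drawn uniformly, with
    replacement, from the library (a hypothetical scoring rule)."""
    values = sorted(utility[seq] for seq in library(allowed))
    n = len(values)
    total = 0.0
    for i, v in enumerate(values, start=1):
        # Probability that the best item drawn is the i-th smallest:
        # (i/n)^B - ((i-1)/n)^B for batch size B.
        total += v * ((i / n) ** batch_size - ((i - 1) / n) ** batch_size)
    return total

# Hypothetical utilities for all two-site sequences over the alphabet {A, V}.
toy_utility = {"AA": 0.0, "AV": 1.0, "VA": 2.0, "VV": 4.0}

full = [{"A", "V"}, {"A", "V"}]   # no restriction at either site
narrow = [{"A", "V"}, {"V"}]      # second site restricted to V

score_full = expected_batch_max(full, toy_utility, batch_size=2)
score_narrow = expected_batch_max(narrow, toy_utility, batch_size=2)
print(score_full, score_narrow)
```

In this toy the utilities are known, so a tighter library always scores at least as well. In a Bayesian optimization setting the utilities are uncertain, and such a score would presumably be averaged over a model's beliefs, which is what makes the choice of constraints a genuine trade-off.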
Award ID(s):
1645832
PAR ID:
10207052
Author(s) / Creator(s):
Editor(s):
Chaudhuri, K
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Volume:
89
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Current methods of finding optimal experimental conditions, Edisonian systematic searches, often inefficiently evaluate suboptimal design points and require fine resolution to identify near optimal conditions. For expensive experimental campaigns or those with large design spaces, the shortcomings of the status quo approaches are more significant. Here, we extend Bayesian optimization (BO) and introduce a chemically-informed data-driven optimization (ChIDDO) approach. This approach uses inexpensive and low-fidelity information obtained from physical models of chemical processes and subsequently combines it with expensive and high-fidelity experimental data to optimize a common objective function. Using common optimization benchmark objective functions, we describe scenarios in which the ChIDDO algorithm outperforms the traditional BO approach, and then implement the algorithm on a simulated electrochemical engineering optimization problem. 
  2. Optimizing objectives under constraints, where both the objectives and constraints are black box functions, is a common scenario in real-world applications such as the design of medical therapies, industrial process optimization, and hyperparameter optimization. One popular approach to handle these complex scenarios is Bayesian Optimization (BO). However, when it comes to the theoretical understanding of constrained Bayesian optimization (CBO), the existing framework often relies on heuristics, approximations, or relaxation of objectives and, therefore, lacks the same level of theoretical guarantees as in canonical BO. In this paper, we exclude the boundary candidates that could be compromised by noise perturbation and aim to find the interior optimum of the black-box-constrained objective. We rely on the insight that optimizing the objective and learning the constraints can both help identify the high-confidence regions of interest (ROI) that potentially contain the interior optimum. We propose an efficient CBO framework that intersects the ROIs identified from each aspect on a discretized search space to determine the general ROI. Then, on the ROI, we optimize the acquisition functions, balancing the learning of the constraints and the optimization of the objective. We showcase the efficiency and robustness of our proposed CBO framework through the high probability regret bounds for the algorithm and extensive empirical evidence. 
  3. Bayesian optimization is a coherent, ubiquitous approach to decision-making under uncertainty, with applications including multi-arm bandits, active learning, and black-box optimization. Bayesian optimization selects decisions (i.e. objective function queries) with maximal expected utility with respect to the posterior distribution of a Bayesian model, which quantifies reducible, epistemic uncertainty about query outcomes. In practice, subjectively implausible outcomes can occur regularly for two reasons: 1) model misspecification and 2) covariate shift. Conformal prediction is an uncertainty quantification method with coverage guarantees even for misspecified models and a simple mechanism to correct for covariate shift. We propose conformal Bayesian optimization, which directs queries towards regions of search space where the model predictions have guaranteed validity, and investigate its behavior on a suite of black-box optimization tasks and tabular ranking tasks. In many cases we find that query coverage can be significantly improved without harming sample-efficiency. 
  4. Multi-objective Bayesian optimization has been widely adopted in scientific experiment design, including drug discovery and hyperparameter optimization. In practice, regulatory or safety concerns often impose additional thresholds on certain attributes of the experimental outcomes. Previous work has primarily focused on constrained single-objective optimization tasks or active search under constraints. The existing constrained multi-objective algorithms address the issue with heuristics and approximations, posing challenges to the analysis of the sample efficiency. We propose a novel constrained multi-objective Bayesian optimization algorithm COMBOO that balances active learning of the level-set defined on multiple unknowns with multi-objective optimization within the feasible region. We provide both theoretical analysis and empirical evidence, demonstrating the efficacy of our approach on various synthetic benchmarks and real-world applications. 
  5. Multi-fidelity Bayesian optimization (MFBO) is a powerful approach that utilizes low-fidelity, cost-effective sources to expedite the exploration and exploitation of a high-fidelity objective function. Existing MFBO methods with theoretical foundations either lack justification for performance improvements over single-fidelity optimization or rely on strong assumptions about the relationships between fidelity sources to construct surrogate models and direct queries to low-fidelity sources. To mitigate the dependency on cross-fidelity assumptions while maintaining the advantages of low-fidelity queries, we introduce a random sampling and partition-based MFBO framework with deep kernel learning. This framework is robust to cross-fidelity model misspecification and explicitly illustrates the benefits of low-fidelity queries. Our results demonstrate that the proposed algorithm effectively manages complex cross-fidelity relationships and efficiently optimizes the target fidelity function. 