- Award ID(s):
- 1800435
- NSF-PAR ID:
- 10177038
- Date Published:
- Journal Name:
- Physical Chemistry Chemical Physics
- Volume:
- 22
- Issue:
- 9
- ISSN:
- 1463-9076
- Page Range / eLocation ID:
- 5211 to 5219
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
The design of machine learning systems often requires trading off different objectives, for example, prediction error and energy consumption for deep neural networks (DNNs). Typically, no single design performs well in all objectives; therefore, finding Pareto-optimal designs is of interest. The search for Pareto-optimal designs involves evaluating designs in an iterative process, and the measurements are used to evaluate an acquisition function that guides the search process. However, measuring different objectives incurs different costs. For example, the cost of measuring the prediction error of DNNs is orders of magnitude higher than that of measuring the energy consumption of a pre-trained DNN as it requires re-training the DNN. Current state-of-the-art methods do not consider this difference in objective evaluation cost, potentially incurring expensive evaluations of objective functions in the optimization process. In this paper, we develop a novel decoupled and cost-aware multi-objective optimization algorithm, which we call Flexible Multi-Objective Bayesian Optimization (FlexiBO) to address this issue. For evaluating each design, FlexiBO selects the objective with higher relative gain by weighting the improvement of the hypervolume of the Pareto region with the measurement cost of each objective. This strategy, therefore, balances the expense of collecting new information with the knowledge gained through objective evaluations, preventing FlexiBO from performing expensive measurements for little to no gain. We evaluate FlexiBO on seven state-of-the-art DNNs for image recognition, natural language processing (NLP), and speech-to-text translation. Our results indicate that, given the same total experimental budget, FlexiBO discovers designs with 4.8% to 12.4% lower hypervolume error than the best method in state-of-the-art multi-objective optimization.more » « less
-
Optimizing a black-box function that is expensive to evaluate emerges in a gamut of machine learning and artifcial intelligence applications including drug discovery, policy optimization in robotics, and hyperparameter tuning of learning models to list a few. Bayesian optimization (BO) provides a principled framework to fnd the global optimum of such functions using a limited number of function evaluations. BO relies on a statistical surrogate model to actively select new query points, that is typically captured by a Gaussian process (GP). Unlike most existing approaches that hinge on a single GP surrogate model with a pre-selected kernel function that may confne the expressiveness of the sought function especially under the limited evaluation budget, the present work puts forth a weighted ensemble of GPs as a surrogate model. Building on the advocated Gaussian mixture (GM) posterior, the EGP framework adapts to the most ftted surrogate model as data arrive on-the-fy, offering a richer function space. For the acquisition of next evaluation points, the EGP-based posterior is coupled with an adaptive expected improvement (EI) criterion to balance exploration and exploitation of the search space. Numerical tests on a set of benchmark synthetic functions and two robotic tasks, demonstrate the impressive benefts of the proposed approach.more » « less
-
The Pareto-optimal frontier for a bi-objective search problem instance consists of all solutions that are not worse than any other solution in both objectives. The size of the Pareto-optimal frontier can be exponential in the size of the input graph, and hence finding it can be hard. Some existing works leverage a user-specified approximation factor ε to compute an approximate Pareto-optimal frontier that can be significantly smaller than the Pareto-optimal frontier. In this paper, we propose an anytime approximate bi-objective search algorithm, called Anytime Bi-Objective A*-ε (A-BOA*ε). A-BOA*ε is useful when deliberation time is limited. It first finds an approximate Pareto-optimal frontier quickly, iteratively improves it while time allows, and eventually finds the Pareto-optimal frontier. It efficiently reuses the search effort from previous iterations and makes use of a novel pruning technique. Our experimental results show that A-BOA*ε substantially outperforms baseline algorithms that do not reuse previous search effort, both in terms of runtime and number of node expansions. In fact, the most advanced variant of A-BOA*ε even slightly outperforms BOA*, a state-of-the-art bi-objective search algorithm, for finding the Pareto-optimal frontier. Moreover, given only a limited amount of deliberation time, A-BOA*ε finds solutions that collectively approximate the Pareto-optimal frontier much better than the solutions found by BOA*.more » « less
-
We introduce Ordalia, a novel approach for speeding up deep learning hyperparameter optimization search through early-pruning of less promising configurations. Our method leverages empirical and theoretical results characterizing the shape of the generalization error curve for increasing training data size and number of epochs. We show that with relatively small computational resources one can estimate the dominant parameters of neural networks' learning curves to obtain consistently good evaluations of their learning process to reliably early-eliminate non-promising configurations. By iterating this process with increasing training resources Ordalia rapidly converges to a small candidate set that includes many of the most promising configurations. We compare the performance of Ordalia with Hyperband, the state-of-the-art model-free hyperparameter optimization algorithm, and show that Ordalia consistently outperforms it on a variety of deep learning tasks. Ordalia conservative use of computational resources and ability to evaluate neural networks learning progress leads to a much better exploration and coverage of the search space, which ultimately produces superior neural network configurations.more » « less
-
Abstract Methods based on Gaussian stochastic process (GSP) models and expected improvement (EI) functions have been promising for box-constrained expensive optimization problems. These include robust design problems with environmental variables having set-type constraints. However, the methods that combine GSP and EI sub-optimizations suffer from the following problem, which limits their computational performance. Efficient global optimization (EGO) methods often repeat the same or nearly the same experimental points. We present a novel EGO-type constraint-handling method that maintains a so-called tabu list to avoid past points. Our method includes two types of penalties for the key “infill” optimization, which selects the next test runs. We benchmark our tabu EGO algorithm with five alternative approaches, including DIRECT methods using nine test problems and two engineering examples. The engineering examples are based on additive manufacturing process parameter optimization informed using point-based thermal simulations and robust-type quality constraints. Our test problems span unconstrained, simply constrained, and robust constrained problems. The comparative results imply that tabu EGO offers very promising computational performance for all types of black-box optimization in terms of convergence speed and the quality of the final solution.