Title: Data‐driven decision‐focused surrogate modeling
Abstract

We introduce the concept of decision‐focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real‐time settings. The proposed data‐driven framework seeks to learn a simpler (for example, convex) surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data‐driven inverse optimization problem to which we apply a decomposition‐based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision‐focused surrogate modeling with standard data‐driven surrogate modeling methods and demonstrate that our approach is significantly more data‐efficient while producing simple surrogate models with high decision prediction accuracy.
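
The idea of training a surrogate on decision prediction error can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional toy objective, the names solve_original, solve_surrogate, decision_error, and fit_surrogate, and the coarse grid search used for the outer problem (the abstract refers to a decomposition-based algorithm, which is not reproduced here). The inner problems return optimal solutions of the original and surrogate models; the outer problem picks surrogate parameters that bring those solutions as close together as possible.

```python
import math

def solve_original(u, lo=-2.0, hi=2.0, n=2001):
    """Brute-force minimizer of a toy nonconvex objective (stand-in for an expensive solver)."""
    best_x, best_f = lo, float("inf")
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        f = (x - u) ** 2 + 0.3 * math.sin(3.0 * x)  # hypothetical objective
        if f < best_f:
            best_x, best_f = x, f
    return best_x

def solve_surrogate(u, theta, lo=-2.0, hi=2.0):
    """Closed-form minimizer of the convex surrogate 0.5*x^2 - (a*u + b)*x on [lo, hi]."""
    a, b = theta
    return min(hi, max(lo, a * u + b))

def decision_error(theta, data):
    """Mean squared gap between original and surrogate optimal solutions."""
    return sum((x_star - solve_surrogate(u, theta)) ** 2 for u, x_star in data) / len(data)

def fit_surrogate(data, steps=41):
    """Outer problem: coarse grid search over theta = (a, b) in [-2, 2] x [-2, 2]."""
    grid = [-2.0 + 4.0 * i / (steps - 1) for i in range(steps)]
    candidates = ((a, b) for a in grid for b in grid)
    return min(candidates, key=lambda th: decision_error(th, data))

# Training data: model inputs paired with optimal decisions of the original problem.
inputs = [-1.0 + 0.25 * k for k in range(9)]
data = [(u, solve_original(u)) for u in inputs]

theta = fit_surrogate(data)
baseline = decision_error((0.0, 0.0), data)  # untrained surrogate, for comparison
fitted = decision_error(theta, data)
print(f"theta={theta}, baseline error={baseline:.4f}, fitted error={fitted:.4f}")
```

The loss here compares optimal solutions and not objective values, which is what separates this setup from fitting a surrogate to input-output data of the objective function. In a realistic setting the surrogate would have no closed-form solution, and the nested structure is what makes the learning problem a bilevel program.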

 
Award ID(s):
2044077
NSF-PAR ID:
10485280
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
AIChE Journal
Volume:
70
Issue:
4
ISSN:
0001-1541
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Many real-world analytics problems involve two significant challenges: prediction and optimization. Because of the typically complex nature of each challenge, the standard paradigm is predict-then-optimize. By and large, machine learning tools are intended to minimize prediction error and do not account for how the predictions will be used in the downstream optimization problem. In contrast, we propose a new and very general framework, called Smart “Predict, then Optimize” (SPO), which directly leverages the optimization problem structure—that is, its objective and constraints—for designing better prediction models. A key component of our framework is the SPO loss function, which measures the decision error induced by a prediction. Training a prediction model with respect to the SPO loss is computationally challenging, and, thus, we derive, using duality theory, a convex surrogate loss function, which we call the SPO+ loss. Most importantly, we prove that the SPO+ loss is statistically consistent with respect to the SPO loss under mild conditions. Our SPO+ loss function can tractably handle any polyhedral, convex, or even mixed-integer optimization problem with a linear objective. Numerical experiments on shortest-path and portfolio-optimization problems show that the SPO framework can lead to significant improvement under the predict-then-optimize paradigm, in particular, when the prediction model being trained is misspecified. We find that linear models trained using SPO+ loss tend to dominate random-forest algorithms, even when the ground truth is highly nonlinear. This paper was accepted by Yinyu Ye, optimization. Supplemental Material: Data and the online appendix are available at https://doi.org/10.1287/mnsc.2020.3922 
  2. Having the ability to analyze, simulate, and optimize complex systems is becoming more important in all engineering disciplines. Decision‐making using complex systems usually leads to nonlinear optimization problems, which rely on computationally expensive simulations. Therefore, it is often challenging to detect the actual structure of the optimization problem and formulate these problems with closed‐form analytical expressions. Surrogate‐based optimization of complex systems is a promising approach that is based on the concept of adaptively fitting and optimizing approximations of the input–output data. Standard surrogate‐based optimization assumes the degrees of freedom are known a priori; however, in real applications the sparsity and the actual structure of the black‐box formulation may not be known. In this work, we propose to select the correct variables contributing to each objective function and constraints of the black‐box problem, by formulating the identification of the true sparsity of the formulation as a nonlinear feature selection problem. We compare three variable selection criteria based on Support Vector Regression and develop efficient algorithms to detect the sparsity of black‐box formulations when only a limited amount of deterministic or noisy data is available.

     
  3. This article addresses the operational optimization of industrial steam systems under device efficiency uncertainty using a data‐driven adaptive robust optimization approach. A semiempirical model of steam turbine is first developed based on process mechanism and operational data. Uncertain parameters of the proposed steam turbine model are further derived from the historical process data. A robust kernel density estimation method is then used to construct the uncertainty sets for modeling these uncertain parameters. The data‐driven uncertainty sets are incorporated into a two‐stage adaptive robust mixed‐integer linear programming (MILP) framework for operational optimization of steam systems to minimize the total operating cost. Integer variables are introduced to model the on/off decisions of the steam turbines and electrical motors, which are the major energy consumers of the steam system. By applying the affine decision rule, the proposed multilevel optimization model is transformed into its robust counterpart, which is a single‐level MILP problem. The proposed framework is applied to the steam system of a real‐world ethylene plant to demonstrate its applicability. © 2018 American Institute of Chemical Engineers. AIChE J, 65: e16500, 2019

     
  4. Many sequential decision making tasks can be viewed as combinatorial optimization problems over a large number of actions. When the cost of evaluating an action is high, even a greedy algorithm, which iteratively picks the best action given the history, is prohibitive to run. In this paper, we aim to learn a greedy heuristic for sequentially selecting actions as a surrogate for invoking the expensive oracle when evaluating an action. In particular, we focus on a class of combinatorial problems that can be solved via submodular maximization (either directly on the objective function or via submodular surrogates). We introduce a data-driven optimization framework based on the submodular-norm loss, a novel loss function that encourages the resulting objective to exhibit diminishing returns. Our framework outputs a surrogate objective that is efficient to train, approximately submodular, and can be made permutation-invariant. The latter two properties allow us to prove strong approximation guarantees for the learned greedy heuristic. Furthermore, our model is easily integrated with modern deep imitation learning pipelines for sequential prediction tasks. We demonstrate the performance of our algorithm on a variety of batched and sequential optimization tasks, including set cover, active learning, and data-driven protein engineering.
  5. This article aims to leverage the big data in shale gas industry for better decision making in optimal design and operations of shale gas supply chains under uncertainty. We propose a two‐stage distributionally robust optimization model, where uncertainties associated with both the upstream shale well estimated ultimate recovery and downstream market demand are simultaneously considered. In this model, decisions are classified into first‐stage design decisions, which are related to drilling schedule, pipeline installment, and processing plant construction, as well as second‐stage operational decisions associated with shale gas production, processing, transportation, and distribution. A data‐driven approach is applied to construct the ambiguity set based on principal component analysis and first‐order deviation functions. By taking advantage of affine decision rules, a tractable mixed‐integer linear programming formulation can be obtained. The applicability of the proposed modeling framework is demonstrated through a small‐scale illustrative example and a case study of Marcellus shale gas supply chain. Comparisons with alternative optimization models, including the deterministic and stochastic programming counterparts, are investigated as well. © 2018 American Institute of Chemical Engineers. AIChE J, 65: 947–963, 2019

     