- Award ID(s):
- 2007164
- PAR ID:
- 10337576
- Date Published:
- Journal Name:
- Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
- Volume:
- 34
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
This paper studies how to train machine-learning models that directly approximate the optimal solutions of constrained optimization problems. This is an empirical risk minimization under constraints, which is challenging as training must balance optimality and feasibility conditions. Supervised learning methods often approach this challenge by training the model on a large collection of pre-solved instances. This paper takes a different route and proposes the idea of Primal-Dual Learning (PDL), a self-supervised training method that does not require a set of pre-solved instances or an optimization solver for training and inference. Instead, PDL mimics the trajectory of an Augmented Lagrangian Method (ALM) and jointly trains primal and dual neural networks. Being a primal-dual method, PDL uses instance-specific penalties of the constraint terms in the loss function used to train the primal network. Experiments show that, on a set of nonlinear optimization benchmarks, PDL typically exhibits negligible constraint violations and minor optimality gaps, and is remarkably close to the ALM optimization. PDL also demonstrated improved or similar performance in terms of the optimality gaps, constraint violations, and training times compared to existing approaches.more » « less
-
Abstract Quantum Approximate Optimization algorithm (QAOA) aims to search for approximate solutions to discrete optimization problems with near-term quantum computers. As there are no algorithmic guarantee possible for QAOA to outperform classical computers, without a proof that bounded-error quantum polynomial time (BQP) ≠ nondeterministic polynomial time (NP), it is necessary to investigate the empirical advantages of QAOA. We identify a computational phase transition of QAOA when solving hard problems such as SAT—random instances are most difficult to train at a critical problem density. We connect the transition to the controllability and the complexity of QAOA circuits. Moreover, we find that the critical problem density in general deviates from the SAT-UNSAT phase transition, where the hardest instances for classical algorithms lies. Then, we show that the high problem density region, which limits QAOA’s performance in hard optimization problems (reachability deficits), is actually a good place to utilize QAOA: its approximation ratio has a much slower decay with the problem density, compared to classical approximate algorithms. Indeed, it is exactly in this region that quantum advantages of QAOA over classical approximate algorithms can be identified.
-
We study robust convex quadratic programs where the uncertain problem parameters can contain both continuous and integer components. Under the natural boundedness assumption on the uncertainty set, we show that the generic problems are amenable to exact copositive programming reformulations of polynomial size. These convex optimization problems are NP-hard but admit a conservative semidefinite programming (SDP) approximation that can be solved efficiently. We prove that the popular approximate S-lemma method—which is valid only in the case of continuous uncertainty—is weaker than our approximation. We also show that all results can be extended to the two-stage robust quadratic optimization setting if the problem has complete recourse. We assess the effectiveness of our proposed SDP reformulations and demonstrate their superiority over the state-of-the-art solution schemes on instances of least squares, project management, and multi-item newsvendor problems.more » « less
-
Finding Nash equilibrial policies for two-player differential games requires solving Hamilton-Jacobi-Isaacs (HJI) PDEs. Self-supervised learning has been used to approximate solutions of such PDEs while circumventing the curse of dimensionality. However, this method fails to learn discontinuous PDE solutions due to its sampling nature, leading to poor safety performance of the resulting controllers in robotics applications when player rewards are discontinuous. This paper investigates two potential solutions to this problem: a hybrid method that leverages both supervised Nash equilibria and the HJI PDE, and a value-hardening method where a sequence of HJIs are solved with a gradually hardening reward. We compare these solutions using the resulting generalization and safety performance in two vehicle interaction simulation studies with 5D and 9D state spaces, respectively. Results show that with informative supervision (e.g., collision and near-collision demonstrations) and the low cost of self-supervised learning, the hybrid method achieves better safety performance than the supervised, self-supervised, and value hardening approaches on equal computational budget. Value hardening fails to generalize in the higher-dimensional case without informative supervision. Lastly, we show that the neural activation function needs to be continuously differentiable for learning PDEs and its choice can be case dependent.more » « less
-
Abstract We present a non-anticipative learning- and scenario-based prediction-optimization (ScenPredOpt) framework that combines deep learning, heuristics, and mathematical solvers for solving combinatorial problems under uncertainty. Specifically, we transform neural machine translation frameworks to predict the optimal solutions of scenario-based multi-stage stochastic programs. The learning models are trained efficiently using the input and solution data of the multi-stage single-scenario deterministic problems. Then our ScenPredOpt framework creates a mapping from the inputs used in training into an output of predictions that are close to optimal solutions. We present a Non-anticipative Encoder-Decoder with Attention (NEDA) approach, which ensures the non-anticipativity property of multi-stage stochastic programs and, thus, time consistency by calibrating the learned information based on the problem’s scenario tree and adjusting the hidden states of the neural network. In our ScenPredOpt framework, the percent predicted variables used for the solution are iteratively reduced through a relaxation of the problem to eliminate infeasibility. Then, a linear relaxation-based heuristic is performed to further reduce the solution time. Finally, a mathematical solver is used to generate the complete solution. We present the results on two NP-Hard sequential optimization problems under uncertainty: stochastic multi-item capacitated lot-sizing and stochastic multistage multidimensional knapsack. The results show that the solution time can be reduced by a factor of 599 with an optimality gap of only 0.08%. We compare the results of the ScenPredOpt framework with cutting-edge exact and heuristic solution algorithms for the problems studied and find that our framework is more effective. Additionally, the computational results demonstrate that ScenPredOpt can solve instances with a larger number of items and scenarios than the trained ones. Our non-anticipative learning-optimization approach can be beneficial for stochastic programming problems involving binary variables that are solved repeatedly with various types of dimensions and similar decisions at each period.