Optimization problems are ubiquitous in our societies and are present in almost every segment of the economy. Most of these optimization problems are NP-hard and computationally demanding, often requiring approximate solutions for large-scale instances. Machine learning frameworks that learn to approximate solutions to such hard optimization problems are a potentially promising avenue to address these difficulties, particularly when many closely related problem instances must be solved repeatedly. Supervised learning frameworks can train a model using the outputs of pre-solved instances. However, when the outputs are themselves approximations, when the optimization problem has symmetric solutions, and/or when the solver uses randomization, solutions to closely related instances may exhibit large differences and the learning task can become inherently more difficult. This paper demonstrates this critical challenge, connects the volatility of the training data to the ability of a model to approximate it, and proposes a method for producing (exact or approximate) solutions to optimization problems that are more amenable to supervised learning tasks. The effectiveness of the method is tested on hard non-linear nonconvex and discrete combinatorial problems.
more »
« less
Self-Supervised Primal-Dual Learning for Constrained Optimization
This paper studies how to train machine-learning models that directly approximate the optimal solutions of constrained optimization problems. This is an empirical risk minimization under constraints, which is challenging as training must balance optimality and feasibility conditions. Supervised learning methods often approach this challenge by training the model on a large collection of pre-solved instances. This paper takes a different route and proposes the idea of Primal-Dual Learning (PDL), a self-supervised training method that does not require a set of pre-solved instances or an optimization solver for training and inference. Instead, PDL mimics the trajectory of an Augmented Lagrangian Method (ALM) and jointly trains primal and dual neural networks. Being a primal-dual method, PDL uses instance-specific penalties of the constraint terms in the loss function used to train the primal network. Experiments show that, on a set of nonlinear optimization benchmarks, PDL typically exhibits negligible constraint violations and minor optimality gaps, and is remarkably close to the ALM optimization. PDL also demonstrated improved or similar performance in terms of the optimality gaps, constraint violations, and training times compared to existing approaches.
more »
« less
- Award ID(s):
- 2007095
- PAR ID:
- 10463205
- Date Published:
- Journal Name:
- Proceedings of the AAAI Conference on Artificial Intelligence
- Volume:
- 37
- Issue:
- 4
- ISSN:
- 2159-5399
- Page Range / eLocation ID:
- 4052 to 4060
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
afe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying them to safety-critical applications. Previous primal-dual style approaches suffer from instability issues and lack optimality guarantees. This paper overcomes the issues from the perspective of probabilistic inference. We introduce a novel Expectation-Maximization approach to naturally incorporate constraints during the policy learning: 1) a provable optimal non-parametric variational distribution could be computed in closed form after a convex optimization (E-step); 2) the policy parameter is improved within the trust region based on the optimal variational distribution (M-step). The proposed algorithm decomposes the safe RL problem into a convex optimization phase and a supervised learning phase, which yields a more stable training performance. A wide range of experiments on continuous robotic tasks shows that the proposed method achieves significantly better constraint satisfaction performance and better sample efficiency than baselines. The code is available at https://github.com/liuzuxin/cvpo-safe-rl.more » « less
-
In this work, we consider a distributed online convex optimization problem, with time-varying (potentially adversarial) constraints. A set of nodes, jointly aim to minimize a global objective function, which is the sum of local convex functions. The objective and constraint functions are revealed locally to the nodes, at each time, after taking an action. Naturally, the constraints cannot be instantaneously satisfied. Therefore, we reformulate the problem to satisfy these constraints in the long term. To this end, we propose a distributed primal-dual mirror descent-based algorithm, in which the primal and dual updates are carried out locally at all the nodes. This is followed by sharing and mixing of the primal variables by the local nodes via communication with the immediate neighbors. To quantify the performance of the proposed algorithm, we utilize the challenging, but more realistic metrics of dynamic regret and fit. Dynamic regret measures the cumulative loss incurred by the algorithm compared to the best dynamic strategy, while fit measures the long term cumulative constraint violations. Without assuming the restrictive Slater’s conditions, we show that the proposed algorithm achieves sublinear regret and fit under mild, commonly used assumptions.more » « less
-
We propose a sequential algorithm for learning sparse radial basis approximations for streaming data. The initial phase of the algorithm formulates the RBF training as a convex optimization problem with an objective function on the expansion weights while the data fitting problem imposed only as an ℓ∞-norm constraint. Each new data point observed is tested for feasibility, i.e., whether the data fitting constraint is satisfied. If so, that point is discarded and no model update is required. If it is infeasible, a new basic variable is added to the linear program. The result is a primal infeasible-dual feasible solution. The dual simplex algorithm is applied to determine a new optimal solution. A large fraction of the streaming data points does not require updates to the RBF model since they are similar enough to previously observed data and satisfy the data fitting constraints. The structure of the simplex algorithm makes the update to the solution particularly efficient given the inverse of the new basis matrix is easily computed from the old inverse. The second phase of the algorithm involves a non-convex refinement of the convex problem. Given the sparse nature of the LP solution, the computational expense of the non-convex algorithm is greatly reduced. We have also found that a small subset of the training data that includes the novel data identified by the algorithm can be used to train the non-convex optimization problem with substantial computation savings and comparable errors on the test data. We illustrate the method on the Mackey-Glass chaotic time-series, the monthly sunspot data, and a Fort Collins, Colorado weather data set. In each case we compare the results to artificial neural networks (ANN) and standard skew-RBFs.more » « less
-
Multi-instance learning (MIL) handles data that is organized into sets of instances known as bags. Traditionally, MIL is used in the supervised-learning setting for classifying bags which contain any number of instances. However, many traditional MIL algorithms do not scale efficiently to large datasets. In this paper, we present a novel primal–dual multi-instance support vector machine that can operate efficiently on large-scale data. Our method relies on an algorithm derived using a multi-block variation of the alternating direction method of multipliers. The approach presented in this work is able to scale to large-scale data since it avoids iteratively solving quadratic programming problems which are broadly used to optimize MIL algorithms based on SVMs. In addition, we improve our derivation to include an additional optimization designed to avoid solving a least-squares problem in our algorithm, which increases the utility of our approach to handle a large number of features as well as bags. Finally, we derive a kernel extension of our approach to learn nonlinear decision boundaries for enhanced classification capabilities. We apply our approach to both synthetic and real-world multi-instance datasets to illustrate the scalability, promising predictive performance, and interpretability of our proposed method.more » « less