


Title: Optimization with Neural Network Feasibility Surrogates: Formulations and Application to Security-Constrained Optimal Power Flow

In many areas of constrained optimization, representing all of the constraints that define an accurate feasible region can be difficult and computationally prohibitive for online use. Satisfying feasibility constraints becomes even more challenging in the high-dimensional, non-convex regimes common in engineering applications. A prominent example explored in the manuscript is the security-constrained optimal power flow (SCOPF) problem, which minimizes power generation costs while enforcing system feasibility under contingency failures in the transmission network. In its full form, this problem has been modeled as a nonlinear two-stage stochastic programming problem. We propose a hybrid structure that incorporates and takes advantage of both a high-fidelity physical model and fast machine learning surrogates. Neural network (NN) classifiers can represent highly non-linear functions and can be trained offline, but they require large training sets. We present how model-guided sampling can efficiently create datasets that are highly informative to a NN classifier for non-convex functions, and we show how the resulting NN surrogates can be integrated into a non-linear program as smooth, continuous functions to simultaneously optimize the objective and enforce feasibility using existing non-linear solvers. Overall, this allows us to solve instances of the SCOPF problem with an order-of-magnitude improvement in CPU time over existing methods.
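As a rough illustration of the surrogate idea only (not the paper's implementation), the sketch below embeds a small, smooth neural-network feasibility classifier as a constraint in a nonlinear program solved with SciPy's SLSQP. The weights, cost vector, and toy demand-balance constraint are placeholders standing in for an offline-trained surrogate and a real SCOPF model.

```python
# Minimal sketch, assuming a pre-trained feasibility classifier: the smooth
# NN output is passed to a nonlinear solver as an ordinary inequality
# constraint alongside the generation-cost objective.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_x, n_h = 4, 8                      # decision dimension, hidden width
W1, b1 = rng.normal(size=(n_h, n_x)), rng.normal(size=n_h)   # placeholder weights
W2, b2 = rng.normal(size=n_h), 0.0                            # single logit output

def nn_feasibility(x):
    """Smooth surrogate: sigmoid(logit) ~ probability that x is feasible."""
    h = np.tanh(W1 @ x + b1)         # smooth activation keeps gradients usable
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))

cost = np.array([3.0, 2.5, 4.0, 1.0])           # toy generation cost coefficients

def objective(x):
    return cost @ x                              # minimize generation cost

constraints = [
    {"type": "ineq", "fun": lambda x: nn_feasibility(x) - 0.5},  # surrogate says "feasible"
    {"type": "eq",   "fun": lambda x: x.sum() - 10.0},           # toy demand balance
]
bounds = [(0.0, 5.0)] * n_x

res = minimize(objective, x0=np.full(n_x, 2.5), method="SLSQP",
               bounds=bounds, constraints=constraints)
print(res.x, res.fun)
```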

 
Award ID(s): 1944678
NSF-PAR ID: 10489335
Publisher / Repository: MDPI
Journal Name: Energies
Volume: 16
Issue: 16
ISSN: 1996-1073
Page Range / eLocation ID: 5913
Sponsoring Org: National Science Foundation
More Like this
  1. A novel machine learning model is presented in this work to obtain the complex high-dimensional deformation of Multi-Walled Carbon Nanotubes (MWCNTs) containing millions of atoms. For such high-dimensional systems, existing atomistic, continuum, or atomistic-continuum models are accurate and reliable but computationally prohibitive, and this cost slows the exploration of the physics of these materials. To alleviate this problem, we developed a machine learning model that combines (a) a novel dimensionality reduction technique with (b) deep neural network based learning in the reduced dimension. The proposed non-linear dimensionality reduction technique extends functional principal component analysis so that the geometric constraints of deformation are satisfied exactly; we therefore term it constrained functional principal component analysis. The novelty of this technique is its ability to design a function space in which all functions satisfy the constraints exactly, not approximately. The efficient dimensionality reduction, together with exact constraint satisfaction, allows the deep neural network to achieve remarkable accuracy. The proposed model predicts the deformation of MWCNTs very accurately when compared with the deformation obtained from an atomistic-physics-based model. Simulating this complex high-dimensional deformation with atomistic-physics-based models takes weeks on a high-performance computing facility, whereas the proposed machine learning model predicts the deformation in seconds. The technique also extracts the universally dominant patterns of deformation in an unsupervised manner; these patterns are comprehensible and provide a better explanation of how the model works, which yields the interpretability of the model.
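As an illustration of the reduce-then-learn structure only, the sketch below uses plain PCA (via SVD) in place of the paper's constrained functional principal component analysis, and a small scikit-learn MLP to map loading parameters to reduced coordinates; the deformation data is synthetic.

```python
# Illustrative sketch under stated assumptions: project deformation snapshots
# onto leading principal modes, learn loads -> reduced coordinates with a
# neural network, then reconstruct the full field from the prediction.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
n_samples, n_dof, n_modes = 200, 5000, 10                    # snapshots, DOFs, reduced dim
loads = rng.uniform(0.0, 1.0, size=(n_samples, 3))           # loading parameters
U_full = np.sin(loads @ rng.normal(size=(3, n_dof)))         # synthetic deformation fields

# Reduce: leading right singular vectors of the centered snapshot matrix.
mean = U_full.mean(axis=0)
_, _, Vt = np.linalg.svd(U_full - mean, full_matrices=False)
modes = Vt[:n_modes]                                          # (n_modes, n_dof)
coeffs = (U_full - mean) @ modes.T                            # reduced coordinates

# Learn loads -> reduced coordinates, then reconstruct a full deformation field.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(loads, coeffs)
u_pred = net.predict(loads[:1]) @ modes + mean
print(u_pred.shape)                                           # (1, n_dof)
```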

     
  2. Many of the infrastructure sectors considered crucial by the Department of Homeland Security include networked systems (physical and temporal) that move some commodity, such as electricity, people, or communication, from one location of importance to another. The costs associated with these flows make up the price of the network's normal functionality. These networks have limited capacities, which cause the marginal cost of a unit of flow across an edge to increase as congestion builds. To limit the expense of meeting a network's normal demand, we aim to increase the resilience of the system, specifically the resilience of the arc capacities. Divisions of critical infrastructure have faced difficulties in recent years as inadequate resources have been available for needed upgrades and repairs. Without being able to determine the future factors that cause minor or extreme damage to the networks, officials must decide how best to allocate limited funds now so that these essential systems can withstand the heavy weight of society's reliance. We model these resource allocation decisions using a two-stage stochastic program (SP) for network protection. Starting with a general form for a basic two-stage SP, we enforce assumptions that specify characteristics key to this type of decision model. The second-stage objective, which represents the price of the network's routine functionality, is nonlinear, as it reflects the increasing marginal cost per unit of additional flow across an arc. After the model has been designed to reflect the network protection problem, we are left with a nonconvex, nonlinear, nonseparable risk-neutral program. This research focuses on key reformulation techniques that transform the problematic model into one that is convex, separable, and much more solvable. Our approach uses perspective functions to convexify the feasible set of the second stage and second-order conic constraints to represent the nonlinear constraints in a form better suited to computational solvers. Once these methods have been applied to the risk-neutral model, we introduce a risk measure into the first stage that allows us to balance an efficient, solvable model against the need to hedge against extreme events. Using Benders cuts that exploit linear separability, we give a decomposition and solution algorithm for the general network model. The innovations in this formulation are then implemented on a transportation network with a given flow demand.
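As a small, hedged illustration of the perspective/second-order-cone idea (not the paper's full two-stage model), the sketch below models a quadratic-over-capacity congestion term f^2/(cap + x) with cvxpy's quad_over_lin atom, which is the perspective of the quadratic cost and compiles to a second-order cone constraint; the capacities, flows, and budget are made-up numbers.

```python
# Sketch: allocate a protection budget across arcs so that the convexified
# (perspective) congestion cost f^2/(cap + x) is minimized. Requires cvxpy.
import cvxpy as cp
import numpy as np

n_arcs = 4
cap = np.array([2.0, 1.5, 3.0, 1.0])          # existing arc capacities
demand_flow = np.array([1.8, 1.2, 2.5, 0.9])  # flows required in the second stage
budget = 2.0                                  # first-stage protection budget

x = cp.Variable(n_arcs, nonneg=True)          # capacity added to each arc
t = cp.Variable(n_arcs, nonneg=True)          # epigraph of congestion cost per arc

constraints = [cp.sum(x) <= budget]
for a in range(n_arcs):
    # quad_over_lin(f, cap + x) is the perspective of f^2 and is SOC-representable:
    # it encodes t_a >= f_a^2 / (cap_a + x_a) as a convex constraint.
    constraints.append(cp.quad_over_lin(demand_flow[a], cap[a] + x[a]) <= t[a])

prob = cp.Problem(cp.Minimize(cp.sum(t)), constraints)
prob.solve()
print(x.value, prob.value)
```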
    more » « less
  3. Under the linear regression framework, we study the variable selection problem when the underlying model is assumed to have a small number of nonzero coefficients. Non-convex penalties in specific forms are well studied in the literature for sparse estimation. A recent work, Ahn, Pang, and Xin (2017), pointed out that nearly all existing non-convex penalties can be represented as difference-of-convex (DC) functions, that is, the difference of two convex functions, even though the penalty itself may not be convex. There is a large existing literature on optimization problems whose objectives and/or constraints involve DC functions, and efficient numerical solutions have been proposed. Under the DC framework, directional-stationary (d-stationary) solutions are considered, and they are usually not unique. In this paper, we show that under some mild conditions, a certain subset of d-stationary solutions of an optimization problem with a DC objective has ideal statistical properties: asymptotic estimation consistency, asymptotic model selection consistency, and asymptotic efficiency. Our assumptions are either weaker than or comparable with the conditions adopted in other existing works. This work shows that DC is a convenient framework that offers a unified treatment of existing works involving non-convex penalties, and it bridges the communities of optimization and statistics.
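A rough sketch of the DC idea follows (it is not the estimator or algorithm studied in the paper): the MCP penalty is written as lambda*|b| minus a convex function g, and a DC algorithm repeatedly linearizes g and solves the resulting Lasso-type subproblem by proximal gradient; the regression data is synthetic.

```python
# Sketch, assuming the MCP penalty with DC split MCP(b) = lam*|b| - g(b):
# each outer iteration linearizes g at the current point and runs ISTA on
# the tilted Lasso subproblem 0.5||y - Xb||^2 - v^T b + lam*||b||_1.
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def grad_g_mcp(b, lam, gamma):
    """Gradient of the convex part g in the DC decomposition of MCP."""
    return np.where(np.abs(b) <= gamma * lam, b / gamma, lam * np.sign(b))

rng = np.random.default_rng(2)
n, p, k = 100, 50, 5
X = rng.normal(size=(n, p))
beta_true = np.zeros(p); beta_true[:k] = 2.0
y = X @ beta_true + 0.1 * rng.normal(size=n)

lam, gamma = 0.3, 3.0
L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the least-squares gradient
beta = np.zeros(p)

for _ in range(20):                    # outer DC iterations
    v = grad_g_mcp(beta, lam, gamma)   # linearize the concave (-g) part at beta^k
    b = beta.copy()
    for _ in range(100):               # ISTA on the convex subproblem
        grad = X.T @ (X @ b - y) - v
        b = soft_threshold(b - grad / L, lam / L)
    beta = b

print(np.flatnonzero(np.abs(beta) > 1e-6))   # indices of selected variables
```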
  4. We consider an online optimization problem in which the reward functions are DR-submodular and, in addition to maximizing the total reward, the sequence of decisions must satisfy some convex constraints on average. Specifically, at each round t, upon committing to an action x_t, a DR-submodular utility function f_t and a convex constraint function g_t are revealed, and the goal is to maximize the overall utility while ensuring the average of the constraint functions is non-positive (so the constraints are satisfied on average). Such cumulative constraints arise naturally in applications where average resource consumption must remain below a specified threshold. We study this problem under both an adversarial model and a stochastic model for the convex constraints, in which the functions g_t can vary arbitrarily or according to an i.i.d. process over time. We propose a single algorithm that achieves sub-linear regret (with respect to the time horizon T) as well as sub-linear constraint violation bounds in both settings, without prior knowledge of the regime. Prior works have studied this problem in the special case of linear constraint functions. Our results not only improve upon the existing bounds under linear cumulative constraints but also give the first sub-linear bounds for general convex long-term constraints.
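The sketch below is a generic online primal-dual template under stated assumptions, not the algorithm proposed in the paper: each round ascends on the current DR-submodular utility penalized by a dual variable times the convex constraint, projects onto the box, and updates the dual with the observed violation; the functions f_t and g_t are synthetic.

```python
# Sketch: online primal-dual updates for a monotone DR-submodular utility
# f_t(x) = sum(log(1 + a*x)) with a long-term linear constraint c@x <= budget.
import numpy as np

rng = np.random.default_rng(3)
d, T, eta = 5, 500, 0.05
budget = 1.0
x = np.full(d, 0.5)       # current action in [0, 1]^d
lam = 0.0                 # dual variable for the long-term constraint

total_util, total_violation = 0.0, 0.0
for t in range(T):
    a = rng.uniform(0.5, 1.5, size=d)          # parameters of f_t (revealed this round)
    c = rng.uniform(0.0, 1.0, size=d)          # parameters of g_t(x) = c@x - budget
    util, viol = np.sum(np.log1p(a * x)), c @ x - budget
    total_util += util
    total_violation += viol

    grad = a / (1.0 + a * x) - lam * c         # ascend on f_t - lam * g_t
    x = np.clip(x + eta * grad, 0.0, 1.0)      # project back onto [0, 1]^d
    lam = max(0.0, lam + eta * viol)           # dual update on the observed violation

print(total_util / T, total_violation / T)
```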
  5. Many real-world analytics problems involve two significant challenges: prediction and optimization. Because of the typically complex nature of each challenge, the standard paradigm is predict-then-optimize. By and large, machine learning tools are intended to minimize prediction error and do not account for how the predictions will be used in the downstream optimization problem. In contrast, we propose a new and very general framework, called Smart “Predict, then Optimize” (SPO), which directly leverages the optimization problem structure—that is, its objective and constraints—for designing better prediction models. A key component of our framework is the SPO loss function, which measures the decision error induced by a prediction. Training a prediction model with respect to the SPO loss is computationally challenging, and, thus, we derive, using duality theory, a convex surrogate loss function, which we call the SPO+ loss. Most importantly, we prove that the SPO+ loss is statistically consistent with respect to the SPO loss under mild conditions. Our SPO+ loss function can tractably handle any polyhedral, convex, or even mixed-integer optimization problem with a linear objective. Numerical experiments on shortest-path and portfolio-optimization problems show that the SPO framework can lead to significant improvement under the predict-then-optimize paradigm, in particular, when the prediction model being trained is misspecified. We find that linear models trained using SPO+ loss tend to dominate random-forest algorithms, even when the ground truth is highly nonlinear. This paper was accepted by Yinyu Ye, optimization. Supplemental Material: Data and the online appendix are available at https://doi.org/10.1287/mnsc.2020.3922 
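As a hedged illustration of the SPO+ idea described above, the sketch below trains a linear cost-prediction model for a toy "shortest path" problem whose oracle simply enumerates two candidate paths. The subgradient 2*(w*(c) - w*(2*c_hat - c)) is the standard SPO+ subgradient for a linear minimization oracle; the features, costs, and decision set here are synthetic.

```python
# Sketch, assuming a finite decision set S and a linear model c_hat = B z:
# stochastic subgradient descent on the SPO+ loss using the oracle twice
# per sample, once at the true costs and once at 2*c_hat - c.
import numpy as np

# Decision set S: two paths over 4 edges (indicator vectors of edges used).
S = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)

def oracle(c):
    """Linear optimization oracle: cheapest decision in S for cost vector c."""
    return S[np.argmin(S @ c)]

rng = np.random.default_rng(4)
n, p, d = 300, 3, 4
B_true = rng.normal(size=(d, p))
Z = rng.normal(size=(n, p))                                  # features
C = Z @ B_true.T + 0.1 * rng.normal(size=(n, d)) + 3.0       # true edge costs

B = np.zeros((d, p))                                         # prediction model c_hat = B z
step = 0.01
for epoch in range(50):
    for i in range(n):
        z, c = Z[i], C[i]
        c_hat = B @ z
        grad_c = 2.0 * (oracle(c) - oracle(2.0 * c_hat - c)) # SPO+ subgradient in c_hat
        B -= step * np.outer(grad_c, z)                      # chain rule through c_hat = B z
print(B)
```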