This content will become publicly available on July 10, 2025
 Award ID(s):
 2303158
 NSFPAR ID:
 10537255
 Publisher / Repository:
 IEEE
 Date Published:
 Format(s):
 Medium: X
 Location:
 Toronto, Canada
 Sponsoring Org:
 National Science Foundation
More Like this

In this work, we consider twostage quadratic optimization problems under ellipsoidal uncertainty. In the first stage, one needs to decide upon the values of a subset of optimization variables (control variables). In the second stage, the uncertainty is revealed, and the rest of the optimization variables (state variables) are set up as a solution to a known system of possibly nonlinear equations. This type of problem occurs, for instance, in optimization for dynamical systems, such as electric power systems as well as gas and water networks. We propose a convergent iterative algorithm to build a sequence of approximately robustly feasible solutions with an improving objective value. At each iteration, the algorithm optimizes over a subset of the feasible set and uses affine approximations of the secondstage equations while preserving the nonlinearity of other constraints. We implement our approach and demonstrate its performance on Matpower instances of AC optimal power flow. Although this paper focuses on quadratic problems, the approach is suitable for more general setups.more » « less

Ruiz, Francisco ; Dy, Jennifer ; van de Meent, JanWillem (Ed.)The softmax function is a ubiquitous component at the output of neural networks and increasingly in intermediate layers as well. This paper provides convex lower bounds and concave upper bounds on the softmax function, which are compatible with convex optimization formulations for characterizing neural networks and other ML models. We derive bounds using both a natural exponentialreciprocal decomposition of the softmax as well as an alternative decomposition in terms of the logsumexp function. The new bounds are provably and/or numerically tighter than linear bounds obtained in previous work on robustness verification of transformers. As illustrations of the utility of the bounds, we apply them to verification of transformers as well as of the robustness of predictive uncertainty estimates of deep ensembles.more » « less

A selfexcited system is a nonlinear system with the property that a constant input yields a bounded, nonconvergent response. Nonlinear identification of selfexcited systems is considered using a Lur'e model structure, where a linear model is connected in feedback with a nonlinear feedback function. To facilitate identification, the nonlinear feedback function is assumed to be continuous and piecewise affine (CPA). The present paper uses leastsquares optimization to estimate the coefficients of the linear dynamics and the slope vector of the CPA nonlinearity, as well as mixedinteger optimization to estimate the order of the linear dynamics and the breakpoints of the CPA function. The proposed identification technique requires only output data, and thus no measurement of the constant input is required. This technique is illustrated on a diverse collection of lowdimensional numerical examples as well as data from a gasturbine combustor.more » « less

Our goal is to learn control policies for robots that provably generalize well to novel environments given a dataset of example environments. The key technical idea behind our approach is to leverage tools from generalization theory in machine learning by exploiting a precise analogy (which we present in the form of a reduction) between generalization of control policies to novel environments and generalization of hypotheses in the supervised learning setting. In particular, we utilize the probably approximately correct (PAC)Bayes framework, which allows us to obtain upper bounds that hold with high probability on the expected cost of (stochastic) control policies across novel environments. We propose policy learning algorithms that explicitly seek to minimize this upper bound. The corresponding optimization problem can be solved using convex optimization (relative entropy programming in particular) in the setting where we are optimizing over a finite policy space. In the more general setting of continuously parameterized policies (e.g., neural network policies), we minimize this upper bound using stochastic gradient descent. We present simulated results of our approach applied to learning (1) reactive obstacle avoidance policies and (2) neural networkbased grasping policies. We also present hardware results for the Parrot Swing drone navigating through different obstacle environments. Our examples demonstrate the potential of our approach to provide strong generalization guarantees for robotic systems with continuous state and action spaces, complicated (e.g., nonlinear) dynamics, rich sensory inputs (e.g., depth images), and neural networkbased policies.

We prove two new results about the inability of lowdegree polynomials to uniformly approximate constantdepth circuits, even to slightlybetterthantrivial error. First, we prove a tight Omega~(n^{1/2}) lower bound on the threshold degree of the SURJECTIVITY function on n variables. This matches the best known threshold degree bound for any AC^0 function, previously exhibited by a much more complicated circuit of larger depth (Sherstov, FOCS 2015). Our result also extends to a 2^{Omega~(n^{1/2})} lower bound on the signrank of an AC^0 function, improving on the previous best bound of 2^{Omega(n^{2/5})} (Bun and Thaler, ICALP 2016). Second, for any delta>0, we exhibit a function f : {1,1}^n > {1,1} that is computed by a circuit of depth O(1/delta) and is hard to approximate by polynomials in the following sense: f cannot be uniformly approximated to error epsilon=12^{Omega(n^{1delta})}, even by polynomials of degree n^{1delta}. Our recent prior work (Bun and Thaler, FOCS 2017) proved a similar lower bound, but which held only for error epsilon=1/3. Our result implies 2^{Omega(n^{1delta})} lower bounds on the complexity of AC^0 under a variety of basic measures such as discrepancy, margin complexity, and threshold weight. This nearly matches the trivial upper bound of 2^{O(n)} that holds for every function. The previous best lower bound on AC^0 for these measures was 2^{Omega(n^{1/2})} (Sherstov, FOCS 2015). Additional applications in learning theory, communication complexity, and cryptography are described.more » « less