Many convex optimization problems can be represented through conic extended formulations (EFs) using only the small number of standard cones recognized by advanced conic solvers such as MOSEK 9. However, EFs are often significantly larger and more complex than equivalent conic natural formulations (NFs) represented using the much broader class of exotic cones. We define an exotic cone as a proper cone for which we can implement easily computable logarithmically homogeneous self-concordant barrier oracles for either the cone or its dual cone. Our goal is to establish whether a generic conic interior point solver supporting NFs can outperform an advanced conic solver specialized for EFs across a variety of applied problems. We introduce Hypatia, a highly configurable open-source conic primal-dual interior point solver written in Julia and accessible through JuMP. Hypatia has a generic interface for exotic cones, some of which we define here. For seven applied problems, we introduce NFs using these cones and construct EFs that are necessarily larger and more complex. Our computational experiments demonstrate the advantages, especially in terms of solve time and memory usage, of solving the NFs with Hypatia compared with solving the EFs with either Hypatia or MOSEK 9.
more »
« less
This content will become publicly available on December 15, 2025
Inexact Augmented Lagrangian Methods for Conic Optimization: Quadratic Growth and Linear Convergence
Augmented Lagrangian Methods (ALMs) are widely employed in solving constrained optimizations, and some efficient solvers are developed based on this framework. Under the quadratic growth assumption, it is known that the dual iterates and the Karush–Kuhn–Tucker (KKT) residuals of ALMs applied to conic programs converge linearly. In contrast, the convergence rate of the primal iterates has remained elusive. In this paper, we resolve this challenge by establishing new quadratic growth and error bound properties for primal and dual conic programs under the standard strict complementarity condition. Our main results reveal that both primal and dual iterates of the ALMs converge linearly contingent solely upon the assumption of strict complementarity and a bounded solution set. This finding provides a positive answer to an open question regarding the asymptotically linear convergence of the primal iterates of ALMs applied to conic optimization.
more »
« less
- PAR ID:
- 10631538
- Editor(s):
- Globerson, A; Mackey, L; Belgrave, D; Fan, A; Paquet, U; Tomczak, J; Zhang, C
- Publisher / Repository:
- Curran Associates, Inc.
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
In machine learning, stochastic gradient descent (SGD) is widely deployed to train models using highly non-convex objectives with equally complex noise models. Unfortunately, SGD theory often makes restrictive assumptions that fail to capture the non-convexity of real problems, and almost entirely ignore the complex noise models that exist in practice. In this work, we demonstrate the restrictiveness of these assumptions using three canonical models in machine learning. Then, we develop novel theory to address this shortcoming in two ways. First, we establish that SGD’s iterates will either globally converge to a stationary point or diverge under nearly arbitrary nonconvexity and noise models. Under a slightly more restrictive assumption on the joint behavior of the non-convexity and noise model that generalizes current assumptions in the literature, we show that the objective function cannot diverge, even if the iterates diverge. As a consequence of our results, SGD can be applied to a greater range of stochastic optimization problems with confidence about its global convergence behavior and stability.more » « less
-
Recent research has observed that in machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS) [Cohen, et al., 2021], where the stepsizes are set to be large, resulting in non-monotonic losses induced by the GD iterates. This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime. Despite the presence of local oscillations, we prove that the logistic loss can be minimized by GD with \emph{any} constant stepsize over a long time scale. Furthermore, we prove that with \emph{any} constant stepsize, the GD iterates tend to infinity when projected to a max-margin direction (the hard-margin SVM direction) and converge to a fixed vector that minimizes a strongly convex potential when projected to the orthogonal complement of the max-margin direction. In contrast, we also show that in the EoS regime, GD iterates may diverge catastrophically under the exponential loss, highlighting the superiority of the logistic loss. These theoretical findings are in line with numerical simulations and complement existing theories on the convergence and implicit bias of GD for logistic regressiomore » « less
-
We propose a new primal-dual homotopy smoothing algorithm for a linearly constrained convex program, where neither the primal nor the dual function has to be smooth or strongly convex. The best known iteration complexity solving such a non-smooth problem is O(ε−1). In this paper, we show that by leveraging a local error bound condition on the dual function, the proposed algorithm can achieve a better primal convergence time of O ε−2/(2+β) log2(ε−1), where β ∈ (0, 1] is a local error bound parameter. As an example application of the general algorithm, we show that the distributed geometric median problem, which can be formulated as a constrained convex program, has its dual function non-smooth but satisfying the aforementioned local error bound condition with β = 1/2, therefore enjoying a convergence time of O ε−4/5 log2(ε−1). This result improves upon the O(ε−1) convergence time bound achieved by existing distributed optimization algorithms. Simulation experiments also demonstrate the performance of our proposed algorithm.more » « less
-
null (Ed.)In this paper we study the smooth convex-concave saddle point problem. Specifically, we analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well known that the ergodic (averaged) iterates of EG converge at a rate of O(1/T) (Nemirovski, 2004). In this paper, we show that the last iterate of EG converges at a rate of O(1/T‾‾√). To the best of our knowledge, this is the first paper to provide a convergence rate guarantee for the last iterate of EG for the smooth convex-concave saddle point problem. Moreover, we show that this rate is tight by proving a lower bound of Ω(1/T‾‾√) for the last iterate. This lower bound therefore shows a quadratic separation of the convergence rates of ergodic and last iterates in smooth convex-concave saddle point problems.more » « less
An official website of the United States government
