skip to main content


Title: Strong Metric (Sub)regularity of Karush–Kuhn–Tucker Mappings for Piecewise Linear-Quadratic Convex-Composite Optimization and the Quadratic Convergence of Newton’s Method
This work concerns the local convergence theory of Newton and quasi-Newton methods for convex-composite optimization: where one minimizes an objective that can be written as the composition of a convex function with one that is continuiously differentiable. We focus on the case in which the convex function is a potentially infinite-valued piecewise linear-quadratic function. Such problems include nonlinear programming, mini-max optimization, and estimation of nonlinear dynamics with non-Gaussian noise as well as many modern approaches to large-scale data analysis and machine learning. Our approach embeds the optimality conditions for convex-composite optimization problems into a generalized equation. We establish conditions for strong metric subregularity and strong metric regularity of the corresponding set-valued mappings. This allows us to extend classical convergence of Newton and quasi-Newton methods to the broader class of nonfinite valued piecewise linear-quadratic convex-composite optimization problems. In particular, we establish local quadratic convergence of the Newton method under conditions that parallel those in nonlinear programming.  more » « less
Award ID(s):
1514559
NSF-PAR ID:
10187912
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Mathematics of Operations Research
Volume:
45
Issue:
3
ISSN:
0364-765X
Page Range / eLocation ID:
1164 to 1192
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function. Our method is based on performing an approximate Newton step using a random projection of the Hessian. Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effective dimension of the Hessian matrix. Leveraging this novel fundamental result, we design an algorithm with a sketch size proportional to the effective dimension and which exhibits a quadratic rate of convergence. This result dramatically improves on the classical linear-quadratic convergence rates of state-of-theart sub-sampled Newton methods. However, in most practical cases, the effective dimension is not known beforehand, and this raises the question of how to pick a sketch size as small as the effective dimension while preserving a quadratic convergence rate. Our second and main contribution is thus to propose an adaptive sketch size algorithm with quadratic convergence rate and which does not require prior knowledge or estimation of the effective dimension: at each iteration, it starts with a small sketch size, and increases it until quadratic progress is achieved. Importantly, we show that the embedding dimension remains proportional to the effective dimension throughout the entire path and that our method achieves state-of-the-art computational complexity for solving convex optimization programs with a strongly convex component. We discuss and illustrate applications to linear and quadratic programming, as well as logistic regression and other generalized linear models. 
    more » « less
  2. Abstract

    This paper concerns the formulation and analysis of a new interior-point method for constrained optimization that combines a shifted primal-dual interior-point method with a projected-search method for bound-constrained optimization. The method involves the computation of an approximate Newton direction for a primal-dual penalty-barrier function that incorporates shifts on both the primal and dual variables. Shifts on the dual variables allow the method to be safely “warm started” from a good approximate solution and avoids the possibility of very large solutions of the associated path-following equations. The approximate Newton direction is used in conjunction with a new projected-search line-search algorithm that employs a flexible non-monotone quasi-Armijo line search for the minimization of each penalty-barrier function. Numerical results are presented for a large set of constrained optimization problems. For comparison purposes, results are also given for two primal-dual interior-point methods that do not use projection. The first is a method that shifts both the primal and dual variables. The second is a method that involves shifts on the primal variables only. The results show that the use of both primal and dual shifts in conjunction with projection gives a method that is more robust and requires significantly fewer iterations. In particular, the number of times that the search direction must be computed is substantially reduced. Results from a set of quadratic programming test problems indicate that the method is particularly well-suited to solving the quadratic programming subproblem in a sequential quadratic programming method for nonlinear optimization.

     
    more » « less
  3. null (Ed.)
    We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(Aij) with real entries and an origin-symmetric convex body K⊂ ℝn that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑i=1n∑j=1n Aij xixj=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion and more. While the literature studied these special cases using case-specific reasoning, here we develop a general methodology for treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that polytime constant-approximability necessitates that K has type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher dimensional) auxiliary set that is not convex, but is approximately convex when K has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that that for convex sets with bounded type-2 constant, constant factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization. 
    more » « less
  4. This paper proposes and develops a new Newton-type algorithm to solve subdifferential inclusions defined by subgradients of extended real-valued prox-regular functions. The proposed algorithm is formulated in terms of the second order subdifferential of such functions that enjoy extensive calculus rules and can be efficiently computed for broad classes of extended real-valued functions. Based on this and on the metric regularity and subregularity properties of subgradient mappings, we establish verifiable conditions ensuring the well-posedness of the proposed algorithm and its local superlinear convergence. The obtained results are also new for the class of equations defined by continuously differentiable functions with Lipschitzian gradients ([Formula: see text] functions), which is the underlying case of our consideration. The developed algorithms for prox-regular functions and their extension to a structured class of composite functions are formulated in terms of proximal mappings and forward–backward envelopes. Besides numerous illustrative examples and comparison with known algorithms for [Formula: see text] functions and generalized equations, the paper presents applications of the proposed algorithms to regularized least square problems arising in statistics, machine learning, and related disciplines. Funding: Research of P. D. Khanh is funded by Ho Chi Minh City University of Education Foundation for Science and Technology [Grant CS.2022.19.20TD]. Research of B. Mordukhovich and V. T. Phat was partly supported by the U.S. National Science Foundation [Grants DMS-1808978 and DMS-2204519]. The research of B. Mordukhovich was also supported by the Air Force Office of Scientific Research [Grant 15RT0462] and the Australian Research Council under Discovery Project DP-190100555. This work was supported by the Air Force Office of Scientific Research [Grant 15RT0462]. 
    more » « less
  5. Abstract This paper examines a number of extrapolation and acceleration methods and introduces a few modifications of the standard Shanks transformation that deal with general sequences. One of the goals of the paper is to lay out a general framework that encompasses most of the known acceleration strategies. The paper also considers the Anderson Acceleration (AA) method under a new light and exploits a connection with quasi-Newton methods in order to establish local linear convergence results of a stabilized version of the AA method. The methods are tested on a number of problems, including a few that arise from nonlinear partial differential equations. 
    more » « less