This paper studies a new class of nonsmooth optimization problems in which the objective is represented as a difference of two generally nonconvex functions. We propose and develop a novel Newton-type algorithm for solving such problems, which is based on the coderivative-generated second-order subdifferential (generalized Hessian) and employs advanced tools of variational analysis. Well-posedness properties of the proposed algorithm are derived under fairly general requirements, while constructive convergence rates are established under additional assumptions, including the Kurdyka–Łojasiewicz condition. We apply the main algorithm to a general class of nonsmooth nonconvex problems of structured optimization that encompasses, in particular, optimization problems with explicit constraints. Finally, applications and numerical experiments are given for practical problems that arise in biochemical models, supervised learning, constrained quadratic programming, etc., where the advantages of our algorithms are demonstrated in comparison with some known techniques and results.
Ghost Penalties in Nonconvex Constrained Optimization: Diminishing Stepsizes and Iteration Complexity
We consider nonconvex constrained optimization problems and propose a new approach to the convergence analysis based on penalty functions. We use classical penalty functions in an unconventional way: they enter only in the theoretical analysis of convergence, while the algorithm itself is penalty free. Based on this idea, we establish several new results, including the first general analysis of diminishing stepsize methods in nonconvex constrained optimization, showing convergence to generalized stationary points, and a complexity study for sequential quadratic programming–type algorithms.
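The notion of a diminishing stepsize can be illustrated with a minimal sketch. This is not the paper's algorithm (which is of sequential quadratic programming type); it is a plain projected gradient loop on a hypothetical one-dimensional nonconvex problem, with the assumed stepsize rule gamma_k = gamma0 / sqrt(k + 1), which tends to zero while its sum diverges. The function names, the test objective, and the interval constraint are all illustrative assumptions.

```python
def projected_gradient_diminishing(grad, project, x0, gamma0=0.1, iters=2000):
    """Projected gradient with a diminishing stepsize (illustrative sketch)."""
    x = x0
    for k in range(iters):
        # Assumed rule: gamma_k -> 0 and sum_k gamma_k = infinity.
        gamma = gamma0 / (k + 1) ** 0.5
        x = project(x - gamma * grad(x))
    return x


def grad_f(x):
    # Gradient of the nonconvex test objective f(x) = x**4 / 4 - x**2 / 2.
    return x ** 3 - x


def project_interval(x, lo=0.5, hi=2.0):
    # Euclidean projection onto the hypothetical feasible set [lo, hi].
    return min(max(x, lo), hi)


x_final = projected_gradient_diminishing(grad_f, project_interval, x0=2.0)
print(x_final)
```

On this toy problem the iterates approach x = 1, the only stationary point in the feasible interval.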
- PAR ID: 10309251
- Date Published:
- Journal Name: Mathematics of Operations Research
- Volume: 46
- Issue: 2
- ISSN: 0364-765X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
We present a new feasible proximal gradient method for constrained optimization where both the objective and constraint functions are given by the sum of a smooth, possibly nonconvex function and a simple convex function. The algorithm converts the original problem into a sequence of convex subproblems. Formulating those subproblems requires at most one gradient and function-value evaluation of the original objective and constraint functions. Either exact or approximate subproblem solutions can be computed efficiently in many cases. An important feature of the algorithm is the constraint level parameter. By carefully increasing this level for each subproblem, we provide a simple way to overcome the challenge of bounding the Lagrange multipliers and show that the algorithm follows a strictly feasible solution path until convergence to a stationary point. We develop a simple, proximal gradient descent type analysis, showing that the complexity bound of this new algorithm is comparable to that of gradient descent in the unconstrained setting, which is new in the literature. Exploiting this new design and analysis technique, we extend our algorithms to some more challenging constrained optimization problems where (1) the objective is a stochastic or finite-sum function, and (2) structured nonsmooth functions replace smooth components of both objective and constraint functions. Complexity results for these problems also seem to be new in the literature. Finally, our method can also be applied to convex function constrained problems, where we show complexities similar to those of the proximal gradient method.
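For readers unfamiliar with the building block named in this abstract, the following is a minimal sketch of an ordinary (unconstrained) proximal gradient iteration, not the feasible, level-constrained method the abstract describes. It assumes the "simple" convex term is lam times the l1 norm, whose proximal map is soft-thresholding, and uses a hypothetical quadratic smooth term; all names and data are illustrative.

```python
def soft_threshold(v, t):
    """Proximal map of t * |.| applied to the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(grad_smooth, x0, step, lam, iters=100):
    """Iterate x <- prox_{step * lam * ||.||_1}(x - step * grad_smooth(x))."""
    x = list(x0)
    for _ in range(iters):
        g = grad_smooth(x)
        x = [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, g)]
    return x


# Hypothetical smooth term: 0.5 * ||x - b||^2, with gradient x - b.
b = [3.0, 0.5, -2.0]


def grad_smooth(x):
    return [xi - bi for xi, bi in zip(x, b)]


x_star = proximal_gradient(grad_smooth, [0.0, 0.0, 0.0], step=0.5, lam=1.0)
print(x_star)
```

For this separable toy problem the minimizer is the soft-thresholding of b at level lam, so the iterates approach [2, 0, -1].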
-
We consider the problem of subspace clustering with data that is potentially corrupted by both dense noise and sparse gross errors. In particular, we study a recently proposed low-rank subspace clustering approach based on a nonconvex modeling formulation. This formulation includes a nonconvex spectral function in the objective function that makes the optimization task challenging; e.g., it is unknown whether the alternating direction method of multipliers (ADMM) framework proposed to solve the nonconvex model formulation is provably convergent. In this paper, we establish that the spectral function is differentiable and give a formula for computing the derivative. Moreover, we show that the derivative of the spectral function is Lipschitz continuous and provide an explicit value for the Lipschitz constant. These facts are then used to provide a lower bound on how the penalty parameter in the ADMM method should be chosen. As long as the penalty parameter is chosen according to this bound, we show that the ADMM algorithm computes iterates that have a limit point satisfying first-order optimality conditions. We also present a second strategy for solving the nonconvex problem that is based on proximal gradient calculations. The convergence and performance of the algorithms are verified through experiments on real data from face and digit clustering and motion segmentation.
-
We study stochastic projection-free methods for constrained optimization of smooth functions on Riemannian manifolds, i.e., with additional constraints beyond the parameter domain being a manifold. Specifically, we introduce stochastic Riemannian Frank–Wolfe (Fw) methods for nonconvex and geodesically convex problems. We present algorithms for both purely stochastic optimization and finite-sum problems. For the latter, we develop variance-reduced methods, including a Riemannian adaptation of the recently proposed Spider technique. For all settings, we recover convergence rates that are comparable to the best-known rates for their Euclidean counterparts. Finally, we discuss applications to two classic tasks: the computation of the Karcher mean of positive definite matrices and Wasserstein barycenters for multivariate normal distributions. For both tasks, stochastic Fw methods yield state-of-the-art empirical performance.
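As background on what "projection-free" means here, the sketch below shows the classical deterministic Frank–Wolfe iteration in the Euclidean setting, over the probability simplex. It is not the stochastic Riemannian method of the abstract; the objective, the target vector, and the stepsize rule 2 / (k + 2) are illustrative assumptions. Each step calls a linear minimization oracle (here, picking the vertex with the smallest gradient entry) instead of a projection.

```python
def frank_wolfe_simplex(grad, x0, iters=500):
    """Classical Frank-Wolfe over the probability simplex (Euclidean sketch)."""
    x = list(x0)
    for k in range(iters):
        g = grad(x)
        # Linear minimization oracle over the simplex: the best vertex e_i
        # is the coordinate with the smallest gradient entry.
        i = min(range(len(x)), key=lambda j: g[j])
        step = 2.0 / (k + 2)
        # Convex combination x <- (1 - step) * x + step * e_i stays feasible.
        x = [(1.0 - step) * xj for xj in x]
        x[i] += step
    return x


# Hypothetical objective: 0.5 * ||x - target||^2 with target inside the simplex.
target = [0.2, 0.5, 0.3]


def grad_quad(x):
    return [xj - tj for xj, tj in zip(x, target)]


x_fw = frank_wolfe_simplex(grad_quad, [1.0, 0.0, 0.0])
print(x_fw)
```

Because every iterate is a convex combination of simplex vertices, feasibility holds without any projection step.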
-
In this work, we study first-order algorithms for solving Bilevel Optimization (BO) where the objective functions are smooth but possibly nonconvex in both levels and the variables are restricted to closed convex sets. As a first step, we study the landscape of BO through the lens of penalty methods, in which the upper- and lower-level objectives are combined in a weighted sum with penalty parameter . In particular, we establish a strong connection between the penalty function and the hyper-objective by explicitly characterizing the conditions under which the values and derivatives of the two must be -close. A by-product of our analysis is the explicit formula for the gradient of hyper-objective when the lower-level problem has multiple solutions under minimal conditions, which could be of independent interest. Next, viewing the penalty formulation as -approximation of the original BO, we propose first-order algorithms that find an -stationary solution by optimizing the penalty formulation with . When the perturbed lower-level problem uniformly satisfies the small-error proximal error-bound (EB) condition, we propose a first-order algorithm that converges to an -stationary point of the penalty function using in total accesses to first-order stochastic gradient oracles. Under an additional assumption on stochastic oracles, we show that the algorithm can be implemented in a fully single-loop manner, i.e., with samples per iteration, and achieves the improved oracle-complexity of .

