We consider a class of Riemannian optimization problems where the objective is the sum of a smooth function and a nonsmooth function considered in the ambient space. This class of problems finds important applications in machine learning and statistics, such as sparse principal component analysis, sparse spectral clustering, and orthogonal dictionary learning. We propose a Riemannian alternating direction method of multipliers (ADMM) to solve this class of problems. Our algorithm adopts easily computable steps in each iteration. The iteration complexity of the proposed algorithm for obtaining an ϵ-stationary point is analyzed under mild assumptions. Existing ADMMs for solving nonconvex problems either do not allow a nonconvex constraint set or do not allow a nonsmooth objective function. Our algorithm is the first ADMM-type algorithm that minimizes a nonsmooth objective over a manifold, a particular nonconvex set. Numerical experiments are conducted to demonstrate the advantage of the proposed method. Funding: The research of S. Ma was supported in part by the Office of Naval Research [Grant N00014-24-1-2705]; the National Science Foundation [Grants DMS-2243650, CCF-2308597, CCF-2311275, and ECCS-2326591]; the University of California, Davis Center for Data Science and Artificial Intelligence Research Innovative Data Science Seed Funding Program; and Rice University start-up fund.
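To make the smooth-plus-nonsmooth splitting concrete, here is a minimal sketch assuming a sparse-PCA-type instance on the Stiefel manifold with an l1 regularizer. The polar retraction, step sizes, and penalty parameter are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def polar_retraction(X):
    # Map a matrix back onto the Stiefel manifold St(n, p) via the polar
    # decomposition (one common retraction choice; others work too).
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

def soft_threshold(Z, tau):
    # Proximal operator of tau * ||.||_1 (elementwise soft-thresholding).
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def riemannian_admm_sketch(A, p, lam=0.1, rho=1.0, eta=0.01, iters=500):
    # ADMM-style splitting for a sparse-PCA-type instance:
    #   min  -0.5 * tr(X^T A X) + lam * ||Y||_1   s.t.  X = Y,  X in St(n, p).
    # Step sizes and iteration count are illustrative placeholders.
    n = A.shape[0]
    X = polar_retraction(np.random.randn(n, p))
    Y, Lam = X.copy(), np.zeros((n, p))
    for _ in range(iters):
        # X-step: gradient step on the smooth augmented terms, then retract.
        G = -A @ X + Lam + rho * (X - Y)
        X = polar_retraction(X - eta * G)
        # Y-step: proximal update for the nonsmooth l1 term.
        Y = soft_threshold(X + Lam / rho, lam / rho)
        # Dual ascent on the consensus constraint X = Y.
        Lam = Lam + rho * (X - Y)
    return X, Y
```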
Coderivative-based semi-Newton method in nonsmooth difference programming
This paper addresses the study of a new class of nonsmooth optimization problems, where the objective is represented as a difference of two generally nonconvex functions. We propose and develop a novel Newton-type algorithm for solving such problems, which is based on the coderivative-generated second-order subdifferential (generalized Hessian) and employs advanced tools of variational analysis. Well-posedness properties of the proposed algorithm are derived under fairly general requirements, while constructive convergence rates are established under additional assumptions including the Kurdyka–Łojasiewicz condition. We provide applications of the main algorithm to a general class of nonsmooth nonconvex problems of structured optimization that encompasses, in particular, optimization problems with explicit constraints. Finally, applications and numerical experiments are given for practical problems arising in biochemical models, supervised learning, constrained quadratic programming, etc., where the advantages of our algorithms are demonstrated in comparison with some known techniques and results.
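At points where both functions happen to be twice differentiable, a coderivative-based step of this kind reduces to a regularized damped Newton step on g − h; the sketch below covers only that smooth special case. The regularization parameter, Armijo constants, and descent safeguard are illustrative assumptions, not the paper's scheme.

```python
import numpy as np

def semi_newton_dc_sketch(phi, grad_g, grad_h, hess_g, x0,
                          reg=1e-6, beta=0.5, sigma=1e-4, iters=100):
    # Damped Newton-type iteration for phi = g - h at points where g and h
    # are twice differentiable; there the coderivative-based generalized
    # Hessian of g reduces to the classical Hessian. All constants are
    # illustrative, not the paper's.
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        grad = grad_g(x) - grad_h(x)            # gradient of the DC objective
        if np.linalg.norm(grad) < 1e-8:
            break
        H = hess_g(x) + reg * np.eye(x.size)    # regularized second-order term
        d = np.linalg.solve(H, -grad)           # semi-Newton direction
        if grad @ d >= 0:                       # safeguard: fall back to descent
            d = -grad
        t = 1.0                                 # Armijo backtracking line search
        while phi(x + t * d) > phi(x) + sigma * t * (grad @ d):
            t *= beta
        x = x + t * d
    return x
```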
- Award ID(s): 2204519
- PAR ID: 10635678
- Publisher / Repository: Springer
- Date Published:
- Journal Name: Mathematical Programming
- Volume: 213
- Issue: 1-2
- ISSN: 0025-5610
- Page Range / eLocation ID: 385 to 432
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods for nonconvex and nonsmooth optimization problems with performance guarantees remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but knowledge of PCD methods in the nonconvex setting is very limited. On the other hand, asynchronous proximal coordinate descent (APCD) has recently received much attention as a way to solve large-scale problems. However, accelerated variants of APCD algorithms are rarely studied. In this project, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfies the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point chosen using a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that for both bounded and unbounded delays every limit point is a critical point. By leveraging the Kurdyka-Łojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical efficiency of our algorithm in terms of speed.
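A minimal synchronous sketch of the acceleration-with-safeguard idea follows: each coordinate step is computed both from the current point and from a linearly extrapolated point, and the candidate with the lower objective is kept. The asynchronous, delay-aware machinery of AAPCD is deliberately omitted, and the l1-regularized least-squares instance and momentum schedule are assumptions for illustration.

```python
import numpy as np

def accel_pcd_sketch(A, b, lam=0.1, iters=200):
    # Synchronous sketch of accelerated proximal coordinate descent for
    #   min 0.5 * ||A x - b||^2 + lam * ||x||_1.
    # The safeguard mirrors AAPCD's comparison between the proximal point
    # and an extrapolated point; the delay-aware momentum is omitted and
    # the k/(k+3) schedule below is a placeholder.
    n = A.shape[1]
    L = (A ** 2).sum(axis=0) + 1e-12    # coordinate-wise Lipschitz constants
    x, x_prev = np.zeros(n), np.zeros(n)

    def obj(v):
        r = A @ v - b
        return 0.5 * (r @ r) + lam * np.abs(v).sum()

    def prox_coord(v, i):
        # One proximal coordinate step at index i starting from point v.
        g = A[:, i] @ (A @ v - b)
        w = v.copy()
        t = v[i] - g / L[i]
        w[i] = np.sign(t) * max(abs(t) - lam / L[i], 0.0)
        return w

    for k in range(iters):
        i = np.random.randint(n)
        z = x + (k / (k + 3.0)) * (x - x_prev)    # extrapolated point
        cand_x, cand_z = prox_coord(x, i), prox_coord(z, i)
        x_prev = x
        # Keep whichever candidate achieves the lower objective value.
        x = cand_x if obj(cand_x) <= obj(cand_z) else cand_z
    return x
```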
We develop a unified level-bundle method, called the accelerated constrained level-bundle (ACLB) algorithm, for solving constrained convex optimization problems where the objective and constraint functions can be nonsmooth, weakly smooth, and/or smooth. ACLB employs Nesterov's accelerated gradient technique and hence retains the same iteration complexity as existing bundle-type methods if the objective or one of the constraint functions is nonsmooth. More importantly, ACLB can significantly reduce the iteration complexity when the objective and all constraints are (weakly) smooth. In addition, if the objective contains a nonsmooth component that can be written as a specific form of maximum, we show that the iteration complexity for this component can be much lower than that for a general nonsmooth objective function. Numerical results demonstrate the effectiveness of the proposed algorithm.
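For intuition, here is a minimal sketch of the classical level-bundle step that ACLB builds on: accumulate linearizations into a cutting-plane model, compute a lower bound by minimizing the model, set a level between the bounds, and project the current iterate onto the model's level set. Nesterov-style acceleration and ACLB's constraint handling are not reproduced; the box domain, radius R, and parameter alpha are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog, minimize

def level_bundle_sketch(f, subgrad, x0, alpha=0.5, iters=50, R=10.0):
    # Basic level-bundle iteration for min f(x) over the box ||x||_inf <= R,
    # for a convex f with oracle `subgrad`. Not the accelerated ACLB scheme.
    x = np.asarray(x0, dtype=float)
    n = x.size
    cuts = []                      # linearizations (x_j, f(x_j), g_j)
    ub = f(x)
    for _ in range(iters):
        cuts.append((x.copy(), f(x), subgrad(x)))
        ub = min(ub, f(x))
        # Lower bound: minimize the piecewise-linear model over the box,
        # i.e. min t  s.t.  t >= f_j + g_j^T (y - x_j) for all cuts.
        c = np.zeros(n + 1); c[-1] = 1.0
        A_ub = [np.append(gj, -1.0) for (_, _, gj) in cuts]
        b_ub = [gj @ xj - fj for (xj, fj, gj) in cuts]
        bounds = [(-R, R)] * n + [(None, None)]
        lb = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                     bounds=bounds).fun
        level = lb + alpha * (ub - lb)
        # Project x onto {y : model(y) <= level} within the box.
        cons = [{'type': 'ineq',
                 'fun': lambda y, xj=xj, fj=fj, gj=gj:
                        level - (fj + gj @ (y - xj))}
                for (xj, fj, gj) in cuts]
        x = minimize(lambda y: 0.5 * np.sum((y - x) ** 2), x,
                     constraints=cons, bounds=[(-R, R)] * n).x
    return x, ub
```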
Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is a variant of stochastic gradients with momentum where a controlled and properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates toward a global minimum. Many works report its empirical success in practice for solving stochastic nonconvex optimization problems; in particular, it has been observed to outperform overdamped Langevin Monte Carlo–based methods, such as stochastic gradient Langevin dynamics (SGLD), in many applications. Although the asymptotic global convergence properties of SGHMC are well known, its finite-time performance is not well understood. In this work, we study two variants of SGHMC based on two alternative discretizations of the underdamped Langevin diffusion. We provide finite-time performance bounds for the global convergence of both SGHMC variants for solving stochastic nonconvex optimization problems with explicit constants. Our results lead to nonasymptotic guarantees for both population and empirical risk minimization problems. For a fixed target accuracy level on a class of nonconvex problems, we obtain complexity bounds for SGHMC that can be tighter than those available for SGLD.
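The basic discretization such an analysis starts from can be sketched in a few lines: a velocity variable with friction is driven by a stochastic gradient plus Gaussian noise whose scale is tied to the friction, step size, and inverse temperature. Here `stoch_grad` is a hypothetical callable returning an unbiased stochastic gradient, and all constants are placeholders rather than tuned values from the paper.

```python
import numpy as np

def sghmc_sketch(stoch_grad, x0, eta=1e-3, gamma=1.0, beta=1.0, iters=1000):
    # Euler-type discretization of the underdamped Langevin diffusion behind
    # SGHMC: momentum with friction gamma plus injected Gaussian noise scaled
    # by the inverse temperature beta. Constants are illustrative placeholders.
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(iters):
        noise = np.sqrt(2.0 * gamma * eta / beta) * np.random.randn(*x.shape)
        # Friction + stochastic gradient + properly scaled Gaussian noise.
        v = v - eta * (gamma * v + stoch_grad(x)) + noise
        x = x + eta * v
    return x
```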