This content will become publicly available on December 20, 2025

Title: A Riemannian Alternating Direction Method of Multipliers
We consider a class of Riemannian optimization problems where the objective is the sum of a smooth function and a nonsmooth function considered in the ambient space. This class of problems finds important applications in machine learning and statistics, such as sparse principal component analysis, sparse spectral clustering, and orthogonal dictionary learning. We propose a Riemannian alternating direction method of multipliers (ADMM) to solve this class of problems. Our algorithm adopts easily computable steps in each iteration. The iteration complexity of the proposed algorithm for obtaining an ϵ-stationary point is analyzed under mild assumptions. Existing ADMMs for solving nonconvex problems either do not allow a nonconvex constraint set or do not allow a nonsmooth objective function. Our algorithm is the first ADMM-type algorithm that minimizes a nonsmooth objective over a manifold, a particular nonconvex set. Numerical experiments are conducted to demonstrate the advantage of the proposed method. Funding: The research of S. Ma was supported in part by the Office of Naval Research [Grant N00014-24-1-2705]; the National Science Foundation [Grants DMS-2243650, CCF-2308597, CCF-2311275, and ECCS-2326591]; the University of California, Davis Center for Data Science and Artificial Intelligence Research Innovative Data Science Seed Funding Program; and a start-up fund from Rice University.
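For orientation only, the following is a minimal LaTeX sketch of the standard variable-splitting reformulation and augmented Lagrangian that ADMM-type methods of this kind build on; the precise reformulation, penalty handling, and update rules used in the paper may differ.

```latex
% Problem class: smooth f on a manifold M, nonsmooth h defined in the ambient space
\min_{x \in \mathcal{M}} \; f(x) + h(x)
% Variable splitting so the nonsmooth term gets its own block
\min_{x \in \mathcal{M},\, y \in \mathbb{R}^n} \; f(x) + h(y)
  \quad \text{s.t.} \quad x = y
% Augmented Lagrangian with multiplier \lambda and penalty \rho > 0,
% minimized alternately in x (on the manifold) and y (a proximal step),
% followed by a dual update on \lambda
L_\rho(x, y, \lambda) = f(x) + h(y) + \langle \lambda, x - y \rangle
  + \tfrac{\rho}{2}\|x - y\|^2
```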
Award ID(s):
2311275 2326591 2308597 2243650
PAR ID:
10592198
Author(s) / Creator(s):
Publisher / Repository:
INFORMS
Date Published:
Journal Name:
Mathematics of Operations Research
ISSN:
0364-765X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we propose AdaBB, an adaptive gradient method based on the Barzilai-Borwein stepsize. The algorithm is line-search-free and parameter-free, and it essentially provides a convergent variant of the Barzilai-Borwein method for general convex optimization problems. We analyze the ergodic convergence of the objective function value and the convergence of the iterates for solving general convex optimization problems. Compared with existing works along this line of research, our algorithm gives the best lower bounds on the stepsize and the average of the stepsizes. Furthermore, we present extensions of the proposed algorithm for solving locally strongly convex and composite convex optimization problems where the objective function is the sum of a smooth function and a nonsmooth function. In the case of local strong convexity, we achieve linear convergence. Our numerical results also demonstrate the promising potential of the proposed algorithms on some representative examples. Funding: S. Ma is supported by the National Science Foundation [Grants DMS-2243650, CCF-2308597, CCF-2311275, and ECCS-2326591] and a startup fund from Rice University. J. Yang is supported by the National Natural Science Foundation of China [Grants 12431011 and 12371301] and the Natural Science Foundation for Distinguished Young Scholars of Gansu Province [Grant 22JR5RA223].
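The Barzilai-Borwein mechanism that AdaBB builds on is easy to illustrate. The sketch below is a generic, line-search-free gradient loop with the classical (long) BB stepsize on a smooth convex quadratic; it is not AdaBB itself, whose adaptive stepsize bounds and safeguards give the guarantees described above, and the fallback stepsize and test problem are illustrative assumptions.

```python
import numpy as np

def bb_gradient_descent(grad, x0, alpha0=1e-3, max_iter=500, tol=1e-8):
    """Line-search-free gradient descent with the classical Barzilai-Borwein
    ("long") stepsize. A generic illustration only, not the AdaBB algorithm."""
    x_prev, g_prev = x0, grad(x0)
    x = x_prev - alpha0 * g_prev            # first step uses a fixed stepsize
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        s, y = x - x_prev, g - g_prev       # iterate and gradient differences
        sy = s @ y
        # BB1 stepsize <s, s> / <s, y>; fall back to alpha0 if the curvature
        # estimate is not positive
        alpha = (s @ s) / sy if sy > 0 else alpha0
        x_prev, g_prev = x, g
        x = x - alpha * g
    return x

# Example: minimize the strongly convex quadratic 0.5 * x^T A x - b^T x
A = np.diag([1.0, 10.0, 100.0])
b = np.ones(3)
x_star = bb_gradient_descent(lambda x: A @ x - b, x0=np.zeros(3))
```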
  2. A stochastic algorithm is proposed, analyzed, and tested experimentally for solving continuous optimization problems with nonlinear equality constraints. It is assumed that constraint function and derivative values can be computed but that only stochastic approximations are available for the objective function and its derivatives. The algorithm is of the sequential quadratic optimization variety. Distinguishing features of the algorithm are that it only employs stochastic objective gradient estimates that satisfy a relatively weak set of assumptions (while using neither objective function values nor estimates of them) and that it allows inexact subproblem solutions to be employed, the latter of which is particularly useful in large-scale settings when the matrices defining the subproblems are too large to form and/or factorize. Conditions are imposed on the inexact subproblem solutions that account for the fact that only stochastic objective gradient estimates are employed. Convergence results are established for the method. Numerical experiments show that the proposed method vastly outperforms a stochastic subgradient method and can outperform an alternative sequential quadratic programming algorithm that employs highly accurate subproblem solutions in every iteration. Funding: This material is based upon work supported by the National Science Foundation [Awards CCF-1740796 and CCF-2139735] and the Office of Naval Research [Award N00014-21-1-2532]. 
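For reference, a single SQP step for an equality-constrained problem solves a quadratic subproblem whose optimality conditions form a linear KKT system. The sketch below computes one such step exactly and with a deterministic gradient; in the setting above, the gradient g would only be a stochastic estimate and the linear system could be solved inexactly (e.g., by an iterative solver), which is what the conditions on inexact subproblem solutions account for. The identity Hessian substitute and the toy problem are illustrative assumptions.

```python
import numpy as np

def sqp_step(g, c_val, J, H):
    """One SQP step for  min f(x)  s.t.  c(x) = 0:
    solve the KKT system  [H J^T; J 0] [d; lam] = [-g; -c_val]
    for the primal step d and the multiplier estimate lam."""
    n, m = H.shape[0], J.shape[0]
    K = np.block([[H, J.T], [J, np.zeros((m, m))]])
    rhs = -np.concatenate([g, c_val])
    sol = np.linalg.solve(K, rhs)       # an inexact iterative solve would go here
    return sol[:n], sol[n:]

# One step at x for f(x) = 0.5*||x||^2 and c(x) = x[0] + x[1] - 1;
# g is the exact gradient here, but only a stochastic estimate in the paper.
x = np.array([2.0, 0.0])
d, lam = sqp_step(g=x,
                  c_val=np.array([x[0] + x[1] - 1.0]),
                  J=np.array([[1.0, 1.0]]),
                  H=np.eye(2))
```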
  3. Riemannian optimization has drawn a lot of attention due to its wide applications in practice. Riemannian stochastic first-order algorithms have been studied in the literature to solve large-scale machine learning problems over Riemannian manifolds. However, most of the existing Riemannian stochastic algorithms require the objective function to be differentiable, and they do not apply to the case where the objective function is nonsmooth. In this paper, we present two Riemannian stochastic proximal gradient methods for minimizing a nonsmooth function over the Stiefel manifold. The two methods, named R-ProxSGD and R-ProxSPB, are generalizations of proximal SGD and proximal SpiderBoost from the Euclidean setting to the Riemannian setting. An analysis of the incremental first-order oracle (IFO) complexity of the proposed algorithms is provided. Specifically, the R-ProxSPB algorithm finds an ϵ-stationary point with O(ϵ⁻³) IFOs in the online case and O(n + √n ϵ⁻²) IFOs in the finite-sum case, with n being the number of summands in the objective. Experimental results on online sparse PCA and robust low-rank matrix completion show that our proposed methods significantly outperform existing methods that use Riemannian subgradient information.
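The building blocks such methods combine are easy to write down; the sketch below shows a tangent-space projection, a polar retraction onto the Stiefel manifold, the soft-thresholding proximal map for an ℓ1 term, and one plain Riemannian gradient step for the smooth part. It is not R-ProxSGD or R-ProxSPB themselves, which couple these pieces through a tangent-space proximal subproblem and (for R-ProxSPB) variance-reduced gradient estimates; the test objective is an illustrative assumption.

```python
import numpy as np

def proj_tangent_stiefel(X, G):
    """Project an ambient gradient G onto the tangent space of the
    Stiefel manifold {X : X^T X = I} at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

def polar_retraction(X, V):
    """Map X + V (V a tangent vector) back onto the Stiefel manifold
    via the polar decomposition."""
    U, _, Wt = np.linalg.svd(X + V, full_matrices=False)
    return U @ Wt

def soft_threshold(X, tau):
    """Proximal map of tau * ||X||_1 (entrywise soft-thresholding); this is the
    nonsmooth piece that the tangent-space proximal subproblem has to handle."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

# One Riemannian gradient step for the smooth part f(X) = -tr(X^T A X):
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20)); A = A + A.T
X = np.linalg.qr(rng.standard_normal((20, 5)))[0]    # feasible starting point
rgrad = proj_tangent_stiefel(X, -2.0 * A @ X)        # Riemannian gradient
X_next = polar_retraction(X, -1e-2 * rgrad)          # step and retract
```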
  4. Spectral clustering is one of the fundamental unsupervised learning methods and is widely used in data analysis. Sparse spectral clustering (SSC) imposes sparsity on spectral clustering, which improves the interpretability of the model. One widely adopted model for SSC in the literature is an optimization problem over the Stiefel manifold with a nonsmooth and nonconvex objective. Such an optimization problem is very challenging to solve. Existing methods usually solve its convex relaxation or need to smooth its nonsmooth objective using certain smoothing techniques; therefore, they do not target the original formulation of SSC. In this paper, we propose a manifold proximal linear method (ManPL) that solves the original SSC formulation without twisting the model. We also extend the algorithm to solve multiple-kernel SSC problems, for which an alternating ManPL algorithm is proposed. Convergence and iteration complexity results of the proposed methods are established. We demonstrate the advantage of our proposed methods over existing methods via clustering of several data sets, including University of California Irvine and single-cell RNA sequencing data sets.
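The idea behind a proximal linear step is to linearize the smooth inner map inside the nonsmooth outer function. The display below is a generic (Euclidean) version of such a subproblem for a composite term h(c(X)); it is only a sketch of the mechanism, since the ManPL subproblem additionally restricts the step to the tangent space of the Stiefel manifold and is followed by a retraction, and the exact SSC objective is not reproduced here.

```latex
% Generic proximal linear subproblem at the iterate X_k with stepsize t > 0:
d_k \in \arg\min_{d} \; h\bigl(c(X_k) + \nabla c(X_k)[d]\bigr)
      + \frac{1}{2t}\,\|d\|_F^2, \qquad X_{k+1} = X_k + d_k .
% ManPL constrains d to the tangent space T_{X_k}\mathcal{M} of the Stiefel
% manifold and maps X_k + d_k back to \mathcal{M} with a retraction.
```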
  5. In this paper, we study the generalized subdifferentials and the Riemannian gradient subconsistency that are the basis for non-Lipschitz optimization on embedded submanifolds of ℝⁿ. We then propose a Riemannian smoothing steepest descent method for non-Lipschitz optimization on complete embedded submanifolds of ℝⁿ. We prove that any accumulation point of the sequence generated by the Riemannian smoothing steepest descent method is a stationary point associated with the smoothing function employed in the method, which is necessary for the local optimality of the original non-Lipschitz problem. We also prove that any accumulation point of the sequence generated by our method that satisfies the Riemannian gradient subconsistency is a limiting stationary point of the original non-Lipschitz problem. Numerical experiments are conducted to demonstrate the advantages of Riemannian ℓ_p (0 < p < 1) optimization over Riemannian ℓ_1 optimization for finding sparse solutions and the effectiveness of the proposed method. Funding: C. Zhang was supported in part by the National Natural Science Foundation of China [Grant 12171027] and the Natural Science Foundation of Beijing [Grant 1202021]. X. Chen was supported in part by the Hong Kong Research Council [Grant PolyU15300219]. S. Ma was supported in part by the National Science Foundation [Grants DMS-2243650 and CCF-2308597], the UC Davis Center for Data Science and Artificial Intelligence Research Innovative Data Science Seed Funding Program, and a startup fund from Rice University.
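The core mechanism (smooth the non-Lipschitz term with a parameter μ, take Riemannian steepest descent steps, and drive μ toward zero) can be illustrated on the unit sphere. Everything below, including the smoothing function, stepsize, and μ-update rule, is an illustrative assumption rather than the schedule analyzed in the paper.

```python
import numpy as np

def grad_smoothed_lp(x, mu, p=0.5):
    """Gradient of the smooth approximation sum_i (x_i^2 + mu^2)^(p/2)
    of the non-Lipschitz term sum_i |x_i|^p; well defined for mu > 0."""
    return p * x * (x**2 + mu**2) ** (p / 2.0 - 1.0)

def sphere_step(x, g, step):
    """Riemannian steepest descent step on the unit sphere: project the gradient
    onto the tangent space, take the step, and retract by normalization."""
    rg = g - (x @ g) * x                     # Riemannian gradient
    y = x - step * rg
    return y / np.linalg.norm(y)

# Minimize 0.5*||x - b||^2 + lam * sum_i |x_i|^p over the unit sphere (illustration).
rng = np.random.default_rng(0)
b = rng.standard_normal(50)
x, mu, lam, step = b / np.linalg.norm(b), 1.0, 0.1, 1e-2
for _ in range(500):
    g = (x - b) + lam * grad_smoothed_lp(x, mu)
    x = sphere_step(x, g, step)
    mu = max(0.9 * mu, 1e-6)                 # shrink the smoothing parameter
```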