skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms
We analyze inexact Riemannian gradient descent (RGD) where Riemannian gradients and retractions are inexactly (and cheaply) computed. Our focus is on understanding when inexact RGD converges and what is the complexity in the general nonconvex and constrained setting. We answer these questions in a general framework of tangential Block Majorization-Minimization (tBMM). We establish that tBMM converges to an πœ–-stationary point within 𝑂(πœ–βˆ’2) iterations. Under a mild assumption, the results still hold when the subproblem is solved inexactly in each iteration provided the total optimality gap is bounded. Our general analysis applies to a wide range of classical algorithms with Riemannian constraints including inexact RGD and proximal gradient method on Stiefel manifolds. We numerically validate that tBMM shows improved performance over existing methods when applied to various problems, including nonnegative tensor decomposition with Riemannian constraints, regularized nonnegative matrix factorization, and low-rank matrix recovery problems.  more » « less
Award ID(s):
2206296
PAR ID:
10534718
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Proceedings of Machine Learning Research
Date Published:
Volume:
235
Page Range / eLocation ID:
27376--27398
Format(s):
Medium: X Other: Medium: X
Location:
Vienna, Austria
Sponsoring Org:
National Science Foundation
More Like this
  1. We analyze inexact Riemannian gradient descent (RGD) where Riemannian gradients and retractions are inexactly (and cheaply) computed. Our focus is on understanding when inexact RGD converges and what is the complexity in the general nonconvex and constrained setting. We answer these questions in a general framework of tangential Block Majorization-Minimization (tBMM). We establish that tBMM converges to an $$\epsilon$$-stationary point within $$O(\epsilon^{-2})$$ iterations. Under a mild assumption, the results still hold when the subproblem is solved inexactly in each iteration provided the total optimality gap is bounded. Our general analysis applies to a wide range of classical algorithms with Riemannian constraints including inexact RGD and proximal gradient method on Stiefel manifolds. We numerically validate that tBMM shows improved performance over existing methods when applied to various problems, including nonnegative tensor decomposition with Riemannian constraints, regularized nonnegative matrix factorization, and low-rank matrix recovery problems. 
    more » « less
  2. Numerous applications in machine learning and data analytics can be formulated as equilibrium computation over Riemannian manifolds. Despite the extensive investigation of their Euclidean counterparts, the performance of Riemannian gradient-based algorithms remain opaque and poorly understood. We revisit the original scheme of Riemannian gradient descent (RGD) and analyze it under a geodesic monotonicity assumption, which includes the well-studied geodesically convex-concave min-max optimization problem as a special case. Our main contribution is to show that, despite the phenomenon of distance distortion, the RGD scheme, with a step size that is agnostic to the manifold's curvature, achieves a curvature-independent and linear last-iterate convergence rate in the geodesically strongly monotone setting. To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence in the Riemannian setting has not been considered before. 
    more » « less
  3. Numerous applications in machine learning and data analytics can be formulated as equilibrium computation over Riemannian manifolds. Despite the extensive investigation of their Euclidean counterparts, the performance of Riemannian gradient-based algorithms remain opaque and poorly understood. We revisit the original scheme of Riemannian gradient descent (RGD) and analyze it under a geodesic monotonicity assumption, which includes the well-studied geodesically convex-concave min-max optimization problem as a special case. Our main contribution is to show that, despite the phenomenon of distance distortion, the RGD scheme, with a step size that is agnostic to the manifold’s curvature, achieves a curvature-independent and linear last-iterate convergence rate in the geodesically strongly monotone setting. To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence in the Riemannian setting has not been considered before. 
    more » « less
  4. We present quantum algorithms for sampling from possibly non-logconcave probability distributions expressed as πœ‹(π‘₯)∝exp(βˆ’π›½π‘“(π‘₯)) as well as quantum algorithms for estimating the partition function for such distributions. We also incorporate a stochastic gradient oracle that implements the quantum walk operators inexactly by only using mini-batch gradients when 𝑓 can be written as a finite sum. One challenge of quantizing the resulting Markov chains is that they do not satisfy the detailed balance condition in general. Consequently, the mixing time of the algorithm cannot be expressed in terms of the spectral gap of the transition density matrix, making the quantum algorithms nontrivial to analyze. We overcame these challenges by first building a reference reversible Markov chain that converges to the target distribution, then controlling the discrepancy between our algorithm’s output and the target distribution by using the reference Markov chain as a bridge to establish the total complexity. Our quantum algorithms exhibit polynomial speedups in terms of dimension or precision dependencies when compared to best-known classical algorithms under similar assumptions. 
    more » « less
  5. The paper proposes and develops a novel inexact gradient method (IGD) for minimizing smooth functions with Lipschitzian gradients. We show that the sequence of gradients generated by IGD converges to zero. The convergence of iterates to stationary points is guaranteed under the Kurdyka- Lojasiewicz property of the objective function with convergence rates depending on the KL exponent. The newly developed IGD is applied to designing two novel gradient-based methods of nonsmooth convex optimization such as the inexact proximal point methods (GIPPM) and the inexact augmented Lagrangian method (GIALM) for convex programs with linear equality constraints. These two methods inherit global convergence properties from IGD and are confirmed by numerical experiments to have practical advantages over some well-known algorithms of nonsmooth convex optimization 
    more » « less