Title: Accelerated Optimization on Riemannian Manifolds via Discrete Constrained Variational Integrators
Abstract

A variational formulation for accelerated optimization on normed vector spaces was recently introduced in Wibisono et al. (PNAS 113:E7351–E7358, 2016), and later generalized to the Riemannian manifold setting in Duruisseaux and Leok (SJMDS, 2022a). This variational framework was exploited on normed vector spaces in Duruisseaux et al. (SJSC 43:A2949–A2980, 2021), where time-adaptive geometric integrators were used to design efficient explicit algorithms for symplectic accelerated optimization. There, it was observed that geometric discretizations which respect the time-rescaling invariance and symplecticity of the Lagrangian and Hamiltonian flows were substantially less prone to stability issues, and were therefore more robust, reliable, and computationally efficient. As such, it is natural to develop time-adaptive Hamiltonian variational integrators for accelerated optimization on Riemannian manifolds. In this paper, we consider the case of Riemannian manifolds embedded in a Euclidean space that can be characterized as the level set of a submersion. We explore how holonomic constraints can be incorporated into discrete variational integrators to constrain the numerical discretization of the Riemannian Hamiltonian system to the Riemannian manifold, and we test the performance of the resulting algorithms by solving eigenvalue and Procrustes problems formulated as optimization problems on the unit sphere and Stiefel manifold.

 
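To make the constrained approach concrete, here is a minimal sketch (our own illustration, not the algorithm derived in the paper) of a RATTLE-type constrained symplectic step on the unit sphere, applied to the eigenvalue problem of minimizing the Rayleigh quotient q^T A q over q^T q = 1. The paper discretizes a time-adaptive Bregman Hamiltonian; this sketch uses a plain Hamiltonian with an ad hoc momentum damping `gamma` as a crude stand-in for the Bregman dissipation, and all parameter values are assumed.

```python
# Minimal sketch: RATTLE-style constrained symplectic step on the sphere,
# for min f(q) = q^T A q subject to g(q) = q^T q - 1 = 0.  Illustrative only.
import numpy as np

def rattle_step(q, p, grad_f, h):
    """One step for H = |p|^2/2 + f(q) with the holonomic constraint g(q) = 0."""
    a = q + h * p - 0.5 * h**2 * grad_f(q)   # unconstrained position update
    b = -h**2 * q                            # constraint-force direction (grad g = 2q)
    # Choose the multiplier lam so that |a + lam*b| = 1 (scalar quadratic equation)
    A2, B2, C2 = b @ b, 2.0 * (a @ b), a @ a - 1.0
    disc = np.sqrt(B2**2 - 4.0 * A2 * C2)
    roots = np.array([-B2 - disc, -B2 + disc]) / (2.0 * A2)
    lam = roots[np.argmin(np.abs(roots))]    # take the smallest constraint force
    q_new = a + lam * b
    p_half = (q_new - q) / h
    # Second half-kick, then enforce the hidden constraint q^T p = 0
    p_new = p_half - 0.5 * h * grad_f(q_new)
    p_new -= (q_new @ p_new) * q_new         # tangent-space projection
    return q_new, p_new

rng = np.random.default_rng(0)
n = 20
M = rng.standard_normal((n, n))
A = 0.5 * (M + M.T)                          # symmetric test matrix
grad_f = lambda q: 2.0 * A @ q

q = rng.standard_normal(n); q /= np.linalg.norm(q)
p = np.zeros(n)
h, gamma = 0.05, 0.95                        # step size and damping (assumed values)
for _ in range(3000):
    q, p = rattle_step(q, p, grad_f, h)
    p *= gamma                               # dissipation, so the flow settles at a minimizer
print("Rayleigh quotient:", q @ A @ q, "  min eigenvalue:", np.linalg.eigvalsh(A)[0])
```

The Lagrange multiplier `lam` acts as the discrete holonomic constraint force that keeps the iterates exactly on the manifold, which is the mechanism the abstract refers to.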
Award ID(s): 1813635
NSF-PAR ID: 10368289
Author(s) / Creator(s): ;
Publisher / Repository: Springer Science + Business Media
Date Published:
Journal Name: Journal of Nonlinear Science
Volume: 32
Issue: 4
ISSN: 0938-8974
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. A variational framework for accelerated optimization was recently introduced on normed vector spaces and Riemannian manifolds in [1] and [2]. It was observed that a careful combination of time-adaptivity and symplecticity in the numerical integration can result in a significant gain in computational efficiency. It is, however, well known that symplectic integrators lose their near-energy-preservation properties when variable time-steps are used. The most common approach to circumventing this problem involves the Poincaré transformation on the Hamiltonian side (see the equation sketched below), and it was used in [3] to construct efficient explicit algorithms for symplectic accelerated optimization. However, the current formulations of Hamiltonian variational integrators do not make intrinsic sense on more general spaces such as Riemannian manifolds and Lie groups. In contrast, Lagrangian variational integrators are well-defined on manifolds, so we develop here a framework for time-adaptivity in Lagrangian variational integrators and use the resulting geometric integrators to solve optimization problems on vector spaces and Lie groups.
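    For context, the Poincaré transformation referred to above takes the following standard form (this is the textbook construction; the specific monitor function used in [3] may differ). Given a Hamiltonian H(q, p, t) and a desired time rescaling dt/dτ = g(q, t), one works in the extended phase space with

    \[ \bar{H}(\bar{q}, \bar{p}) \;=\; g(q, t)\bigl(H(q, p, t) + p_t\bigr), \qquad \bar{q} = (q, t), \quad \bar{p} = (p, p_t), \]

    where the conjugate momentum p_t is initialized as p_t(0) = -H(q(0), p(0), 0), so that \bar{H} vanishes along the flow. Integrating \bar{H} with fixed steps in the fictive time τ then produces variable steps in the physical time t while remaining symplectic on the extended space.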
  2. Adjoint systems are widely used to inform control, optimization, and design in systems described by ordinary differential equations or differential-algebraic equations. In this paper, we explore the geometric properties of such adjoint systems and develop methods for them. In particular, we utilize symplectic and presymplectic geometry to investigate the properties of adjoint systems associated with ordinary differential equations and differential-algebraic equations, respectively. We show that the adjoint variational quadratic conservation laws, which are key to adjoint sensitivity analysis, arise from the (pre)symplecticity of such adjoint systems. We discuss various additional geometric properties of adjoint systems, such as symmetries and variational characterizations. For adjoint systems associated with a differential-algebraic equation, we relate the index of the differential-algebraic equation to the presymplectic constraint algorithm of Gotay et al. (J Math Phys 19(11):2388–2399, 1978). As an application of this geometric framework, we discuss how the adjoint variational quadratic conservation laws can be used to compute sensitivities of terminal or running cost functions. Furthermore, we develop structure-preserving numerical methods for such systems using Galerkin Hamiltonian variational integrators (Leok and Zhang, IMA J Numer Anal 31(4):1497–1532, 2011) which admit discrete analogues of these quadratic conservation laws (a minimal numerical illustration of such a conservation law is sketched below). We additionally show that such methods are natural, in the sense that reduction, forming the adjoint system, and discretization all commute, for suitable choices of these processes. We utilize this naturality to derive a variational error analysis result for the presymplectic variational integrator that we use to discretize the adjoint DAE system. Finally, we discuss the application of adjoint systems in the context of optimal control problems, where we prove a similar naturality result.

     
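    The quadratic conservation law at the heart of this framework is easy to state in coordinates and to test numerically. The following is a minimal sketch (an assumed toy example, not the paper's Galerkin construction): for x' = f(x), the tangent (variational) system v' = Df(x)v and the adjoint system λ' = -Df(x)^T λ conserve the pairing ⟨λ, v⟩, and the implicit midpoint rule, which preserves quadratic invariants, conserves it discretely as well.

```python
# Minimal sketch: discrete conservation of the adjoint-variational pairing.
# Combined system on z = (x, v, lam):
#   x' = f(x),   v' = Df(x) v,   lam' = -Df(x)^T lam,
# for which <lam, v> is a quadratic invariant, preserved by implicit midpoint.
import numpy as np

def f(x):                      # pendulum vector field (illustrative choice)
    return np.array([x[1], -np.sin(x[0])])

def Df(x):                     # its Jacobian
    return np.array([[0.0, 1.0], [-np.cos(x[0]), 0.0]])

def F(z):                      # combined vector field
    x, v, lam = z[:2], z[2:4], z[4:]
    J = Df(x)
    return np.concatenate([f(x), J @ v, -J.T @ lam])

def midpoint_step(z, h, iters=50):
    znew = z + h * F(z)        # explicit Euler predictor
    for _ in range(iters):     # fixed-point iteration for the implicit stage
        znew = z + h * F(0.5 * (z + znew))
    return znew

z = np.array([1.0, 0.0, 0.3, -0.2, 0.5, 0.7])   # arbitrary initial data
pairing0 = z[4:] @ z[2:4]
for _ in range(1000):
    z = midpoint_step(z, 0.01)
print("drift in <lam, v>:", z[4:] @ z[2:4] - pairing0)   # ~ solver tolerance
```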
  3. Geometric numerical integration has recently been exploited to design symplectic accelerated optimization algorithms by simulating the Bregman Lagrangian and Hamiltonian systems from the variational framework introduced by Wibisono et al. In this paper, we discuss practical considerations which can significantly boost the computational performance of these optimization algorithms and considerably simplify the tuning process. In particular, we investigate how momentum restarting schemes improve computational efficiency and robustness by reducing the undesirable effect of oscillations, and ease the tuning process by making time-adaptivity superfluous (the basic restarting idea is sketched below). We also discuss how temporal looping helps avoid instability issues caused by numerical precision, without harming the computational efficiency of the algorithms. Finally, we compare the efficiency and robustness of different geometric integration techniques, and study the effects of the different parameters in the algorithms to inform and simplify tuning in practice. From this paper emerge symplectic accelerated optimization algorithms whose computational efficiency, stability, and robustness have been improved, and which are now much simpler to use and tune for practical applications.
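    As an illustration of the restarting idea, here is a minimal sketch of the classical gradient-based restart of O'Donoghue and Candès applied to Nesterov's method; the paper applies restarts to its symplectic Bregman integrators, which we do not reproduce here, and the test problem is assumed.

```python
# Minimal sketch: gradient-based momentum restarting on Nesterov's method.
import numpy as np

def nag_restart(grad_f, x0, step, iters):
    x, x_prev = x0.copy(), x0.copy()
    k = 0                                   # momentum counter, reset on restart
    for _ in range(iters):
        y = x + k / (k + 3) * (x - x_prev)  # momentum (look-ahead) point
        g = grad_f(y)
        x_prev, x = x, y - step * g
        # Restart: momentum is carrying us uphill when <grad, velocity> > 0
        if g @ (x - x_prev) > 0:
            k = 0
        else:
            k += 1
    return x

A = np.diag(np.linspace(1.0, 100.0, 20))    # ill-conditioned quadratic (assumed test)
grad_f = lambda x: A @ x
x = nag_restart(grad_f, np.ones(20), step=1.0 / 100.0, iters=500)
print("f(x) =", 0.5 * x @ A @ x)            # near 0, without the usual oscillations
```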
  4. From optimal transport to robust dimensionality reduction, a plethora of machine learning applications can be cast as min-max optimization problems over Riemannian manifolds. Though many min-max algorithms have been analyzed in the Euclidean setting, it has proved elusive to translate these results to the Riemannian case. Zhang et al. [2022] have recently shown that geodesically convex-concave Riemannian problems always admit saddle-point solutions. Inspired by this result, we study whether a performance gap between Riemannian algorithms and optimal Euclidean convex-concave algorithms is necessary. We answer this question in the negative: we prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate in the geodesically strongly-convex-concave case, matching the Euclidean result (the underlying extragradient template is sketched below). Our results also extend to the stochastic and non-smooth cases, where RCEG and Riemannian gradient descent ascent (RGDA) achieve near-optimal convergence rates, up to factors depending on the curvature of the manifold.
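    For reference, here is the flat (Euclidean) extragradient template that RCEG adapts: a minimal sketch on an assumed strongly convex-concave test problem, with the Riemannian correction and the exponential/logarithm maps omitted.

```python
# Minimal sketch: Euclidean extragradient for min_x max_y f(x, y),
# with f(x, y) = (mu/2)|x|^2 + x^T B y - (mu/2)|y|^2 (saddle point at the origin).
import numpy as np

rng = np.random.default_rng(1)
n, mu = 10, 1.0
B = rng.standard_normal((n, n))
Fx = lambda x, y: mu * x + B @ y            # grad_x f  (descend in x)
Fy = lambda x, y: B.T @ x - mu * y          # grad_y f  (ascend in y)

x, y = np.ones(n), np.ones(n)
eta = 0.5 / (mu + np.linalg.norm(B, 2))     # conservative 1/L-type step (assumed)
for _ in range(300):
    # Half step to a mid-point, then a full step using the mid-point gradient
    xh, yh = x - eta * Fx(x, y), y + eta * Fy(x, y)
    x,  y  = x - eta * Fx(xh, yh), y + eta * Fy(xh, yh)
print("distance to saddle (0,0):", np.hypot(np.linalg.norm(x), np.linalg.norm(y)))
```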
  5. We propose a structured prediction approach for robot imitation learning from demonstrations. Among the various tools for robot imitation learning, supervised learning has been observed to play a prominent role. Structured prediction is a form of supervised learning that enables learning models to operate on output spaces with complex structures. Through the lens of structured prediction, we show how robots can learn to imitate trajectories belonging not only to Euclidean spaces but also to Riemannian manifolds. Exploiting ideas from information theory, we propose a class of loss functions based on the f-divergence to measure the information loss between the demonstrated and reproduced probabilistic trajectories (an illustrative instance is sketched below). Different choices of f-divergence result in different policies, which we call imitation modes. Furthermore, our approach enables the incorporation of spatial and temporal trajectory modulation, which is necessary for robots to adapt to changing working conditions. We benchmark our algorithm against state-of-the-art methods in terms of trajectory reproduction and adaptation. The quantitative evaluation shows that our approach outperforms other algorithms in both accuracy and efficiency. We also report real-world experimental results on learning manifold trajectories in a polishing task with a KUKA LWR robot arm, illustrating the effectiveness of our algorithmic framework.

     
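    To illustrate the loss construction (a minimal sketch under assumed Gaussian trajectory models; the paper's estimator and training loop are not reproduced): with f(t) = t log t the f-divergence reduces to the KL divergence, which is closed-form between Gaussians, and other choices of f yield the other imitation modes mentioned above.

```python
# Minimal sketch: a KL-instance of an f-divergence trajectory loss between a
# demonstrated and a reproduced probabilistic trajectory (diagonal Gaussians).
import numpy as np

def kl_gaussian(mu0, var0, mu1, var1):
    """KL( N(mu0, var0) || N(mu1, var1) ) for diagonal covariances."""
    return 0.5 * np.sum(np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def trajectory_loss(demo_mu, demo_var, repro_mu, repro_var):
    """Sum the per-timestep divergence along the two probabilistic trajectories."""
    return sum(kl_gaussian(m0, v0, m1, v1)
               for m0, v0, m1, v1 in zip(demo_mu, demo_var, repro_mu, repro_var))

# Toy example: a demonstrated 1-D trajectory and a slightly shifted reproduction
T = 100
t = np.linspace(0.0, 1.0, T)
demo_mu,  demo_var  = np.sin(2 * np.pi * t)[:, None], np.full((T, 1), 0.01)
repro_mu, repro_var = (np.sin(2 * np.pi * t) + 0.1)[:, None], np.full((T, 1), 0.02)
print("trajectory KL loss:", trajectory_loss(demo_mu, demo_var, repro_mu, repro_var))
```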