On the fast convergence of minibatch heavy ball momentum

Bollapragada, Raghu; Chen, Tyler; Ward, Rachel

doi:10.1093/imanum/drae033

Abstract: Simple stochastic momentum methods are widely used in machine learning optimization, but their good practical performance is at odds with an absence of theoretical guarantees of acceleration in the literature. In this work, we aim to close the gap between theory and practice by showing that stochastic heavy ball momentum retains the fast linear rate of (deterministic) heavy ball momentum on quadratic optimization problems, at least when minibatching with a sufficiently large batch size. The algorithm we study can be interpreted as an accelerated randomized Kaczmarz algorithm with minibatching and heavy ball momentum. The analysis relies on carefully decomposing the momentum transition matrix, and using new spectral norm concentration bounds for products of independent random matrices. We provide numerical illustrations demonstrating that our bounds are reasonably sharp.
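The kind of iteration the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a consistent least-squares problem min ||Ax - b||^2 / 2, rows sampled with replacement with probability proportional to their squared norms (the usual randomized Kaczmarz choice), and the classical deterministic heavy ball step size and momentum computed from the extreme eigenvalues of A^T A. The function name `minibatch_heavy_ball`, the problem sizes, and the batch size are hypothetical choices made for the example.

```python
import numpy as np


def minibatch_heavy_ball(A, b, alpha, beta, batch_size, iters, seed=0):
    """Heavy ball momentum with minibatch gradients for min 0.5*||Ax - b||^2.

    Assumption (not from the record): rows are drawn with replacement with
    probability proportional to their squared norms, and each sampled row is
    reweighted so the minibatch gradient is unbiased for A.T @ (A @ x - b).
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    row_sq = np.sum(A**2, axis=1)
    p = row_sq / row_sq.sum()          # sampling probabilities
    x_prev = np.zeros(d)
    x = np.zeros(d)
    for _ in range(iters):
        idx = rng.choice(n, size=batch_size, p=p)
        A_S = A[idx]
        r_S = A_S @ x - b[idx]         # residuals on the sampled rows
        w = 1.0 / (batch_size * p[idx])  # importance weights
        g = A_S.T @ (w * r_S)          # unbiased minibatch gradient
        # Heavy ball update: gradient step plus momentum term.
        x, x_prev = x - alpha * g + beta * (x - x_prev), x
    return x


# Small consistent test problem (hypothetical sizes).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x_true = rng.standard_normal(10)
b = A @ x_true

# Classical heavy ball parameters for a quadratic, from the extreme
# eigenvalues of A.T @ A (an assumption; the paper's choices may differ).
eigs = np.linalg.eigvalsh(A.T @ A)
mu, L = eigs[0], eigs[-1]
alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x_hat = minibatch_heavy_ball(A, b, alpha, beta, batch_size=64, iters=200)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative error: {rel_err:.2e}")
```

Because the system is consistent, the minibatch gradient noise vanishes at the solution, which is what makes a linear rate plausible in this setting. With a much smaller batch size the same parameters may converge more slowly or fail to converge, in line with the abstract's "sufficiently large batch size" condition.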

Award ID(s): 2324643

PAR ID: 10532135

Author(s) / Creator(s): Bollapragada, Raghu; Chen, Tyler; Ward, Rachel

Publisher / Repository: Oxford University Press

Date Published: 2024-08-08

Journal Name: IMA Journal of Numerical Analysis

Volume: 45

Issue: 3

ISSN: 0272-4979

Format(s): Medium: X; Size: p. 1397-1424

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1093/imanum/drae033
