Lean and Mean Adaptive Optimization via Subset-Norm and Subspace-Momentum with Convergence Guarantees

Nguyen, Thien H; Nguyen, Huy L

Citation Details

This content will become publicly available on July 13, 2026

Lean and Mean Adaptive Optimization via Subset-Norm and Subspace-Momentum with Convergence Guarantees

We introduce two complementary techniques for efficient optimization that reduce memory requirements while accelerating training of large-scale neural networks. The first technique, Subset-Norm step size, generalizes AdaGrad-Norm and AdaGrad(-Coordinate) through step-size sharing. Subset-Norm (SN) reduces AdaGrad’s memory footprint from O(d) to O(sqrt(d)), where d is the model size. For non-convex smooth objectives under coordinate-wise sub-gaussian noise, we show a noise-adapted high-probability convergence guarantee with improved dimensional dependence of SN over existing methods. Our second technique, Subspace-Momentum, reduces the momentum state’s memory footprint by restricting momentum to a low-dimensional subspace while performing SGD in the orthogonal complement. We prove a high-probability convergence result for Subspace-Momentum under standard assumptions. Empirical evaluation on pre-training and fine-tuning LLMs demonstrates the effectiveness of our methods. For instance, combining Subset-Norm with Subspace-Momentum achieves Adam’s validation perplexity for LLaMA 1B in approximately half the training tokens (6.8B vs 13.1B) while reducing Adam’s optimizer-states memory footprint by more than 80% with minimal additional hyperparameter tuning. more »

Award ID(s):: 2311649

PAR ID:: 10599719

Author(s) / Creator(s):: Nguyen, Thien H; Nguyen, Huy L

Publisher / Repository:: Proceedings of Machine Learning Research

Date Published:: 2025-07-13

Journal Name:: Proceedings of Machine Learning Research

ISSN:: 2640-3498

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on July 13, 2026
Conference Paper:
The DOI is not currently available.

More Like this