Invariant Low-Dimensional Subspaces in Gradient Descent for Learning Deep Matrix Factorizations

Yaras, Can; Wang, Peng; Hu, Wei; Zhu, Zhihui; Balzano, Laura; Qu, Qing


An extensively studied phenomenon of the past few years in training deep networks is the implicit bias of gradient descent towards parsimonious solutions. In this work, we further investigate this phenomenon by narrowing our focus to deep matrix factorization, where we reveal surprising low-dimensional structures in the learning dynamics when the target matrix is low-rank. Specifically, we show that the evolution of gradient descent starting from arbitrary orthogonal initialization only affects a minimal portion of singular vector spaces across all weight matrices. In other words, the learning process happens only within a small invariant subspace of each weight matrix, despite the fact that all parameters are updated throughout training. From this, we provide rigorous justification for low-rank training in a specific, yet practical setting. In particular, we demonstrate that we can construct compressed factorizations that are equivalent to full-width, deep factorizations throughout training for solving low-rank matrix completion problems efficiently.
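
The invariant-subspace claim in the abstract can be probed numerically. The following is a minimal sketch, not the authors' code. It assumes square d x d layers, a fully observed rank-r target, squared-error loss, plain gradient descent, and every layer initialized as `eps` times a random orthogonal matrix. All names (`train_deep_factorization`, `count_shared_singular_values`, `eps`, `lr`) are hypothetical. The count d - 2r used in the check is an assumption of this sketch and is not stated in the abstract. If learning is confined to a small subspace, most singular values of each weight matrix should remain equal to one another after training.

```python
import numpy as np

def random_orthogonal(d, rng):
    # The Q factor of a Gaussian matrix's QR decomposition is orthogonal.
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def chain(mats, d):
    # Product W_k ... W_1 for mats = [W_1, ..., W_k]; identity if mats is empty.
    out = np.eye(d)
    for w in mats:
        out = w @ out
    return out

def train_deep_factorization(d=20, r=2, depth=3, eps=0.3, lr=0.02,
                             steps=500, seed=0):
    # All hyperparameters here are illustrative assumptions.
    rng = np.random.default_rng(seed)
    # Rank-r target with singular values spaced between 2 and 1.
    u = random_orthogonal(d, rng)[:, :r]
    v = random_orthogonal(d, rng)[:, :r]
    target = u @ np.diag(np.linspace(2.0, 1.0, r)) @ v.T
    # Scaled orthogonal initialization; weights[0] is the first layer W_1.
    weights = [eps * random_orthogonal(d, rng) for _ in range(depth)]
    losses = []
    for _ in range(steps):
        err = chain(weights, d) - target
        losses.append(0.5 * np.sum(err ** 2))
        grads = []
        for l in range(depth):
            left = chain(weights[l + 1:], d)   # layers applied after layer l
            right = chain(weights[:l], d)      # layers applied before layer l
            grads.append(left.T @ err @ right.T)
        # Simultaneous gradient step on every layer.
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights, losses

def count_shared_singular_values(w, tol=1e-8):
    # Number of singular values that coincide with the median singular value.
    s = np.linalg.svd(w, compute_uv=False)
    return int(np.sum(np.abs(s - np.median(s)) < tol))

weights, losses = train_deep_factorization()
shared = [count_shared_singular_values(w) for w in weights]
print("shared singular values per layer:", shared)
```

Under these assumptions, each entry of `shared` should be at least d - 2r (16 for the default arguments), which suggests that gradient descent only reshapes a subspace of dimension at most 2r in each layer while the remaining directions are rescaled uniformly.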

Award ID(s): 1845076

PAR ID: 10502117

Author(s) / Creator(s): Yaras, Can; Wang, Peng; Hu, Wei; Zhu, Zhihui; Balzano, Laura; Qu, Qing

Publisher / Repository: NeurIPS 2023 Workshop M3L

Date Published: 2023-11-06

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Workshop Report:
The DOI is not currently available.
