Flat minima generalize for low-rank matrix recovery

Ding, Lijun; Drusvyatskiy, Dmitriy; Fazel, Maryam; Harchaoui, Zaid

doi:10.1093/imaiai/iaae009

Abstract: Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance. Flat minima, those around which the loss grows slowly, appear to generalize well. This work takes a step towards understanding this phenomenon by focusing on the simplest class of overparameterized nonlinear models: those arising in low-rank matrix recovery. We analyse overparameterized matrix and bilinear sensing, robust principal component analysis, covariance matrix estimation and single hidden layer neural networks with quadratic activation functions. In all cases, we show that flat minima, measured by the trace of the Hessian, exactly recover the ground truth under standard statistical assumptions. For matrix completion, we establish weak recovery, although empirical evidence suggests exact recovery holds here as well. We complete the paper with synthetic experiments that illustrate our findings.
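To make the flatness measure named in the abstract concrete, the following is a minimal NumPy sketch for the matrix sensing case. It is not the authors' code. The loss normalization f(L, R) = (1/2m) Σ_i (⟨A_i, L Rᵀ⟩ − b_i)², the Gaussian measurement matrices, the problem sizes, and the names `hessian_trace`, `L_bal`, `R_bal`, `L_unb` and `R_unb` are all assumptions made for this illustration.

```python
import numpy as np

def hessian_trace(L, R, A):
    """Trace of the Hessian of f(L, R) = (1/2m) * sum_i (<A_i, L R^T> - b_i)^2.

    Each residual is bilinear in (L, R), so its own Hessian has zero trace.
    The trace of the Hessian of f therefore reduces to
    (1/m) * sum_i (||A_i R||_F^2 + ||A_i^T L||_F^2), which does not depend on b.
    """
    AR = np.einsum("mij,jk->mik", A, R)    # A_i R: gradient of residual i w.r.t. L
    AtL = np.einsum("mij,ik->mjk", A, L)   # A_i^T L: gradient of residual i w.r.t. R
    return (np.sum(AR**2) + np.sum(AtL**2)) / A.shape[0]

# Hypothetical problem sizes: d x d target of rank r, factor width k > r.
rng = np.random.default_rng(0)
d, r, k, m = 8, 2, 4, 200
M_true = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
A = rng.standard_normal((m, d, d))         # assumed Gaussian measurement matrices

# Balanced overparameterized factorization of M_true built from its SVD.
U, s, Vt = np.linalg.svd(M_true)
L_bal = U[:, :k] * np.sqrt(s[:k])
R_bal = Vt[:k].T * np.sqrt(s[:k])

# Rescaled factorization: same product L R^T, hence the same (zero) loss.
c = 3.0
L_unb, R_unb = c * L_bal, R_bal / c

trace_bal = hessian_trace(L_bal, R_bal, A)
trace_unb = hessian_trace(L_unb, R_unb, A)
print("balanced:", trace_bal, "rescaled:", trace_unb)
```

Both factorizations reproduce `M_true` and so fit every measurement equally well, yet the rescaled pair has a larger Hessian trace. This only illustrates that the trace distinguishes between solutions with identical loss; it does not reproduce the recovery guarantees described in the abstract.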

Award ID(s): 2306322; 2023166

PAR ID: 10508071

Author(s) / Creator(s): Ding, Lijun; Drusvyatskiy, Dmitriy; Fazel, Maryam; Harchaoui, Zaid

Publisher / Repository: Oxford University Press

Date Published: 2024-04-01

Journal Name: Information and Inference: A Journal of the IMA

Volume: 13

Issue: 2

ISSN: 2049-8772

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1093/imaiai/iaae009
