Preconditioned Gradient Descent for Overparameterized Nonconvex Burer--Monteiro Factorization with Global Optimality Certification

Gavin Zhang, Salar Fattahi

Citation Details

We consider using gradient descent to minimize the nonconvex function $$f(X)=\phi(XX^{T})$$ over an $$n\times r$$ factor matrix $$X$$, in which $$\phi$$ is an underlying smooth convex cost function defined over $$n\times n$$ matrices. While only a second-order stationary point $$X$$ can be provably found in reasonable time, if $$X$$ is additionally \emph{rank deficient}, then its rank deficiency certifies it as being globally optimal. This way of certifying global optimality necessarily requires the search rank $$r$$ of the current iterate $$X$$ to be \emph{overparameterized} with respect to the rank $$r^{\star}$ of the global minimizer $$X^{\star}$$. Unfortunately, overparameterization significantly slows down the convergence of gradient descent, from a linear rate with $$r=r^{\star}$$ to a sublinear rate when $$r>r^{\star}$$, even when $$\phi$$ is strongly convex. In this paper, we propose an inexpensive preconditioner that restores the convergence rate of gradient descent back to linear in the overparameterized case, while also making it agnostic to possible ill-conditioning in the global minimizer $$X^{\star}$$. more »

Award ID(s):: 2047462

PAR ID:: 10488545

Author(s) / Creator(s):: Gavin Zhang, Salar Fattahi

Publisher / Repository:: Journal of Machine Learning Research

Date Published:: 2023-04-01

Journal Name:: Journal of machine learning research

ISSN:: 1533-7928

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Journal Article:
The DOI is not currently available.

More Like this