Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models

Xu, Weihang; Fazel, Maryam; Du, Simon S

We study the gradient Expectation-Maximization (EM) algorithm for Gaussian Mixture Models (GMMs) in the over-parameterized setting, where a general GMM with n > 1 components learns from data generated by a single ground-truth Gaussian distribution. While results for the special case of 2-Gaussian mixtures are well known, a general global convergence analysis for arbitrary n remains unresolved and faces several new technical barriers, since the convergence becomes sublinear and non-monotonic. To address these challenges, we construct a novel likelihood-based convergence analysis framework and rigorously prove that gradient EM converges globally with a sublinear rate O(1/\sqrt{t}). This is the first global convergence result for Gaussian mixtures with more than 2 components. The sublinear convergence rate is due to the algorithmic nature of learning over-parameterized GMMs with gradient EM. We also identify a new technical challenge for learning general over-parameterized GMMs: the existence of bad local regions that can trap gradient EM for an exponential number of steps.
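To make the setting in the abstract concrete, the following is a minimal sketch of sample-based gradient EM for an over-parameterized GMM. It is not the authors' code. It assumes fixed equal mixing weights, identity covariances, ground-truth data drawn from N(0, I), and finite samples in place of population expectations; the function names, step size, and initialization scale are all illustrative choices.

```python
import numpy as np


def responsibilities(x, mu, weights):
    """Posterior probability of each component for each sample (E-step).

    x: (m, d) samples, mu: (n, d) component means, weights: (n,) mixing weights.
    Assumes identity covariance for every component (an assumption of this sketch).
    """
    sq_dist = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (m, n)
    logits = np.log(weights)[None, :] - 0.5 * sq_dist
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    psi = np.exp(logits)
    return psi / psi.sum(axis=1, keepdims=True)


def gradient_em_step(x, mu, weights, step_size):
    """One gradient EM update: a gradient ascent step on the EM surrogate Q.

    The gradient with respect to mu_i is the sample mean of psi_i(x) * (x - mu_i).
    """
    psi = responsibilities(x, mu, weights)  # (m, n)
    grad = (psi.T @ x) / x.shape[0] - psi.mean(axis=0)[:, None] * mu
    return mu + step_size * grad


def avg_log_likelihood(x, mu, weights):
    """Average log-likelihood of the samples under the current mixture."""
    d = x.shape[1]
    sq_dist = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    logits = np.log(weights)[None, :] - 0.5 * sq_dist - 0.5 * d * np.log(2 * np.pi)
    top = logits.max(axis=1, keepdims=True)
    return float((top[:, 0] + np.log(np.exp(logits - top).sum(axis=1))).mean())


def run_gradient_em(num_components=3, dim=2, num_samples=5000,
                    steps=200, step_size=0.5, seed=0):
    """Fit an n-component GMM to data from a single Gaussian N(0, I)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((num_samples, dim))             # single ground-truth Gaussian
    mu = 2.0 * rng.standard_normal((num_components, dim))   # hypothetical initialization
    weights = np.full(num_components, 1.0 / num_components)
    history = []
    for _ in range(steps):
        history.append(avg_log_likelihood(x, mu, weights))
        mu = gradient_em_step(x, mu, weights, step_size)
    return mu, history


if __name__ == "__main__":
    mu, history = run_gradient_em()
    print("final component means:\n", mu)
    print("log-likelihood: first", history[0], "last", history[-1])
```

With a step size in (0, 1], each update moves every mean part of the way toward its full EM M-step target, so the sample log-likelihood in `history` should not decrease. The distance of the individual means to the ground-truth mean need not shrink monotonically, which is one way to observe the slow drift of the means that the sublinear rate describes.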

Award ID(s): 2023166

PAR ID: 10632234

Author(s) / Creator(s): Xu, Weihang; Fazel, Maryam; Du, Simon S

Publisher / Repository: Advances in Neural Information Processing Systems

Date Published: 2024-09-25

Volume: 37

ISBN: 9798331314385

Location: Vancouver, Canada

Sponsoring Org: National Science Foundation
