Stochastic Gradient and Langevin Processes

Cheng, Xiang; Yin, Dong; Bartlett, Peter L.; Jordan, Michael

Citation Details

We prove quantitative convergence rates at which discrete Langevin-like processes converge to the invariant distribution of a related stochastic differential equation. We study the setup where the additive noise can be non-Gaussian and state-dependent and the potential function can be non-convex. We show that the key properties of these processes depend on the potential function and the second moment of the additive noise. We apply our theoretical findings to studying the convergence of Stochastic Gradient Descent (SGD) for non-convex problems and corroborate them with experiments using SGD to train deep neural networks on the CIFAR-10 dataset.

Award ID(s): 1909365

PAR ID: 10250954

Author(s) / Creator(s): Cheng, Xiang; Yin, Dong; Bartlett, Peter L.; Jordan, Michael

Editor(s): Daumé III, Hal; Singh, Aarti

Date Published: 2020-01-01

Journal Name: Proceedings of the 37th International Conference on Machine Learning

Volume: 119

Page Range / eLocation ID: 1810-1819

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
