SGD and Hogwild! Convergence Without the Bounded Gradients Assumption

Nguyen, Lam M.; Nguyen, Phuong Ha; Dijk, Marten van; Richtárik, Peter; Scheinberg, Katya; Takáč, Martin


Stochastic gradient descent (SGD) is the optimization algorithm of choice in many machine learning applications such as regularized empirical risk minimization and training deep neural networks. The classical analysis of convergence of SGD is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. While this might hold for some loss functions, it is always violated for cases where the objective function is strongly convex. In (Bottou et al., 2016) a new analysis of convergence of SGD is performed under the assumption that stochastic gradients are bounded with respect to the true gradient norm. Here we show that for stochastic problems arising in machine learning such a bound always holds. Moreover, we propose an alternative convergence analysis of SGD in the diminishing learning rate regime, which results in more relaxed conditions than those in (Bottou et al., 2016). We then move on to the asynchronous parallel setting, and prove convergence of the Hogwild! algorithm in the same regime, obtaining the first convergence results for this method in the case of a diminishing learning rate.
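As a rough illustration of the setting the abstract describes, the following is a minimal sketch of SGD with a diminishing learning rate on a small strongly convex problem (1-D ridge regression). Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the function names (`make_data`, `stochastic_grad`, `ridge_minimizer`, `sgd_diminishing`), and the schedule `eta0 / (1 + t / t0)`, which is one common diminishing schedule and not necessarily the one analyzed by the authors. It also shows why a uniform bound on the stochastic gradient cannot hold here: the regularization term makes the gradient grow with the size of the iterate.

```python
import random


def make_data(n, seed=0):
    """Synthetic 1-D regression data, y = 2*x + noise (made up for illustration)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def stochastic_grad(w, x, y, lam):
    """Gradient in w of the single-sample loss 0.5*(x*w - y)**2 + 0.5*lam*w**2."""
    return (x * w - y) * x + lam * w


def ridge_minimizer(xs, ys, lam):
    """Closed-form minimizer of the average of the single-sample losses."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + n * lam)


def sgd_diminishing(xs, ys, lam, steps, eta0=0.5, t0=100.0, seed=1):
    """Plain SGD with the (assumed) diminishing step size eta0 / (1 + t / t0)."""
    rng = random.Random(seed)
    w = 0.0
    for t in range(steps):
        i = rng.randrange(len(xs))      # sample one data point uniformly
        eta = eta0 / (1.0 + t / t0)     # step size shrinks roughly like 1/t
        w -= eta * stochastic_grad(w, xs[i], ys[i], lam)
    return w


if __name__ == "__main__":
    xs, ys = make_data(200)
    lam = 0.1
    print("closed-form minimizer:", ridge_minimizer(xs, ys, lam))
    print("SGD estimate:         ", sgd_diminishing(xs, ys, lam, steps=20000))
    # The stochastic gradient is not uniformly bounded: it grows with |w|.
    print("gradient at w = 1e6:  ", stochastic_grad(1e6, xs[0], ys[0], lam))
```

With a constant step size the iterate would typically keep fluctuating around the minimizer; shrinking the step over time is what lets the estimate settle, which is the regime the abstract refers to.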

Award ID(s): 1618717; 1740796

PAR ID: 10110877

Author(s) / Creator(s): Nguyen, Lam M.; Nguyen, Phuong Ha; Dijk, Marten van; Richtárik, Peter; Scheinberg, Katya; Takáč, Martin

Date Published: 2018-01-01

Journal Name: Proceedings of Machine Learning Research

Volume: 80

ISSN: 2640-3498

Page Range / eLocation ID: 3750-3758

Format(s): Medium: X

Sponsoring Org: National Science Foundation

