We investigate the uniform reshuffling model for money exchanges: two agents picked uniformly at random redistribute their dollars between them. This stochastic dynamics is of mean-field type and eventually leads to an exponential distribution of wealth. To better understand this dynamics, we investigate its limit as the number of agents goes to infinity. We rigorously prove the so-called propagation of chaos, which links the stochastic dynamics to a (limiting) nonlinear partial differential equation (PDE). This deterministic description, which is well-known in the literature, has the flavor of the classical Boltzmann equation arising from the statistical mechanics of dilute gases. We prove its convergence toward its exponential equilibrium distribution in the sense of relative entropy.
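The exchange rule described above is easy to simulate directly. The following sketch (a minimal illustration, not the paper's proof apparatus; all parameter values are arbitrary choices) runs the uniform reshuffling dynamics: each exchange conserves the total wealth by construction, while the empirical wealth distribution relaxes toward an exponential law.

```python
import numpy as np

def uniform_reshuffling(n_agents=1000, n_steps=100_000, seed=0):
    """Uniform reshuffling dynamics: at each step, two distinct agents are
    picked uniformly at random and their combined wealth is redistributed
    uniformly at random between them."""
    rng = np.random.default_rng(seed)
    wealth = np.full(n_agents, 10.0)          # everyone starts with $10
    for _ in range(n_steps):
        i, j = rng.choice(n_agents, size=2, replace=False)
        total = wealth[i] + wealth[j]
        share = rng.uniform(0.0, total)       # uniform split of the pooled money
        wealth[i], wealth[j] = share, total - share
    return wealth

wealth = uniform_reshuffling()
# Total money is conserved by every exchange; for an exponential law the
# standard deviation equals the mean, which the sample roughly exhibits.
print(wealth.sum(), wealth.mean(), wealth.std())
```

The simulation only illustrates the finite-agent stochastic process; the paper's contribution is the rigorous link between this process and its mean-field PDE limit.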
Global convergence of neuron birth-death dynamics
Neural networks with a large number of units admit a mean-field description, which has recently served as a theoretical explanation for the favorable training properties of “overparameterized” models. In this regime, gradient descent obeys a deterministic partial differential equation (PDE) that converges to a globally optimal solution for networks with a single hidden layer under appropriate assumptions. In this work, we propose a nonlocal mass transport dynamics that leads to a modified PDE with the same minimizer. We implement this nonlocal dynamics as a stochastic neuronal birth-death process and we prove that it accelerates the rate of convergence in the mean-field limit. We subsequently realize this PDE with two classes of numerical schemes that converge to the mean-field equation, each of which can easily be implemented for neural networks with finite numbers of units. We illustrate our algorithms with two models to provide intuition for the mechanism through which convergence is accelerated.
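A toy version of the birth-death idea can be sketched as follows. This is a hypothetical illustration, not the paper's scheme: a small mean-field ReLU network is trained by gradient descent on a 1-D regression task, and every few steps the neurons whose first-variation potential is highest (i.e., those hurting the fit most) are killed and replaced by perturbed clones of the lowest-potential neurons, keeping the particle count fixed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy task: fit y = sin(2x) on [-1, 1] with a mean-field two-layer net
# f(x) = (1/n) * sum_i c_i * relu(a_i * x + b_i).
X = np.linspace(-1.0, 1.0, 256)
Y = np.sin(2.0 * X)

n = 200                                       # number of neurons (particles)
a, b, c = (rng.normal(size=n) for _ in range(3))

def forward():
    pre = np.outer(a, X) + b[:, None]         # (n, m) pre-activations
    act = np.maximum(pre, 0.0)                # ReLU
    return act, pre > 0, (c @ act) / n        # activations, mask, prediction

def loss(pred):
    return 0.5 * np.mean((pred - Y) ** 2)

lr, birth_every, clones = 20.0, 50, 5
_, _, pred = forward()
loss0 = loss(pred)
for step in range(2000):
    act, mask, pred = forward()
    res = (pred - Y) / len(X)                 # dL/dpred
    cm = c[:, None] * mask * res
    a -= lr * (cm @ X) / n
    b -= lr * cm.sum(axis=1) / n
    c -= lr * (act @ res) / n
    if step % birth_every == birth_every - 1:
        # Birth-death step: V_i = E_x[(f(x) - y) * c_i * relu(a_i x + b_i)]
        # is the first variation of the loss at neuron i.  High-V neurons
        # die; they are replaced by slightly perturbed clones of low-V ones.
        act, _, pred = forward()
        V = (c[:, None] * act) @ ((pred - Y) / len(X))
        order = np.argsort(V)
        worst, best = order[-clones:], order[:clones]
        a[worst] = a[best] + 0.01 * rng.normal(size=clones)
        b[worst] = b[best] + 0.01 * rng.normal(size=clones)
        c[worst] = c[best]

_, _, pred = forward()
final_loss = loss(pred)
```

The small perturbation on the clones keeps duplicated particles from evolving identically; the paper's actual process and its convergence guarantees are of course more delicate than this sketch.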
Award ID(s): 1845360
NSF-PAR ID: 10159675
Date Published:
Journal Name: International Conference on Machine Learning
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this


Physics-informed neural networks (PINNs) have been widely utilized to numerically approximate PDE problems. While PINNs have achieved good results in producing solutions for many partial differential equations, studies have shown that they do not perform well on phase field models. In this paper, we partially address this issue by introducing a modified physics-informed neural network. In particular, it is used to numerically approximate the Allen-Cahn-Ohta-Kawasaki (ACOK) equation with a volume constraint. Technically, the inverse of the Laplacian in the ACOK model presents many challenges to the baseline PINNs. To take the zero-mean condition of the inverse of the Laplacian, as well as the volume constraint, into consideration, we also introduce a separate neural network, which takes the second set of sampling points in the approximation process. Numerical results are shown to demonstrate the effectiveness of the modified PINNs. An additional benefit of this research is that the modified PINNs can also be applied to learn more general nonlocal phase-field models, even with an unknown nonlocal kernel.
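The zero-mean condition on the inverse Laplacian mentioned above can be made concrete outside the neural-network setting. The following sketch (a spectral solver on a periodic 1-D domain, purely illustrative and unrelated to the paper's PINN construction) computes the inverse of the negative Laplacian with the k = 0 Fourier mode removed, which is exactly the zero-mean normalization.

```python
import numpy as np

def inv_neg_laplacian_periodic(f, L=2 * np.pi):
    """Solve -u'' = f - mean(f) on a periodic 1-D domain of length L,
    normalized so that mean(u) = 0.  In Fourier space -u'' = f becomes
    k^2 * u_hat = f_hat, so u_hat = f_hat / k^2 for k != 0; setting the
    k = 0 mode to zero enforces the zero-mean condition."""
    n = len(f)
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)   # angular wavenumbers
    f_hat = np.fft.fft(f)
    u_hat = np.zeros_like(f_hat)
    nonzero = k != 0
    u_hat[nonzero] = f_hat[nonzero] / (k[nonzero] ** 2)
    return np.real(np.fft.ifft(u_hat))

# Sanity check against an explicit solution: -u'' = cos(x)  =>  u = cos(x).
x = np.linspace(0, 2 * np.pi, 128, endpoint=False)
u = inv_neg_laplacian_periodic(np.cos(x))
```

The same zero-mode trick extends verbatim to 2-D and 3-D periodic grids via `np.fft.fftn`.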

In this paper we prove that Local (S)GD (or FedAvg) can optimize deep neural networks with the Rectified Linear Unit (ReLU) activation function in polynomial time. Despite the established convergence theory of Local SGD for optimizing general smooth functions in communication-efficient distributed optimization, its convergence on nonsmooth ReLU networks still eludes full theoretical understanding. The key property used in many Local SGD analyses of smooth functions is gradient Lipschitzness, which ensures that the gradients on local models do not drift far from those on the averaged model. However, this desirable property does not hold in networks with the nonsmooth ReLU activation function. We show that, even though ReLU networks do not admit the gradient Lipschitzness property, the difference between the gradients on local models and on the averaged model does not change too much under the dynamics of Local SGD. We validate our theoretical results via extensive experiments. This work is the first to show the convergence of Local SGD on nonsmooth functions, and sheds light on the optimization theory of federated training of deep neural networks.
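The Local SGD / FedAvg protocol itself is simple to state in code. The sketch below (a toy linear-regression instance rather than the ReLU networks analyzed in the paper; worker count, local-step count, and learning rate are arbitrary choices) shows the local-steps-then-average loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Local SGD (FedAvg): K workers each run H local SGD steps on their own
# shard of the data, then the server averages the local models.
d, n, K, H, rounds, lr = 5, 1000, 4, 10, 50, 0.05
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.01 * rng.normal(size=n)
shards = np.array_split(np.arange(n), K)   # one data shard per worker

def mse(w):
    return np.mean((X @ w - y) ** 2)

w = np.zeros(d)                            # global model
for _ in range(rounds):
    local_models = []
    for idx in shards:
        wk = w.copy()                      # worker starts from the global model
        for _ in range(H):
            batch = rng.choice(idx, size=32)
            g = 2 * X[batch].T @ (X[batch] @ wk - y[batch]) / len(batch)
            wk -= lr * g
        local_models.append(wk)
    w = np.mean(local_models, axis=0)      # server averages the local models
```

The quantity studied in the paper is precisely the drift between the `wk` iterates and their average `w` during the `H` local steps; on this smooth quadratic the drift is controlled by gradient Lipschitzness, which is the property that fails for ReLU networks.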

In this paper, we investigate how the self-synchronization property of a swarm of Kuramoto oscillators can be controlled and exploited to achieve target densities and target phase coherence. In the limit of an infinite number of oscillators, the collective dynamics of the agents’ density is described by a mean-field model in the form of a nonlocal PDE, where the nonlocality arises from the synchronization mechanism. In this mean-field setting, we introduce two space-time-dependent control inputs to affect the density of the oscillators: an angular velocity field that corresponds to a state feedback law for individual agents, and a control parameter that modulates the strength of agent interactions over space and time, i.e., a multiplicative control with respect to the integral nonlocal term. We frame the density tracking problem as a PDE-constrained optimization problem. The controlled synchronization and phase-locking are measured with classical polar order metrics. After establishing the mass conservation property of the mean-field model and bounds on its nonlocal term, a system of first-order necessary conditions for optimality is recovered using a Lagrangian method. The optimality system, comprising a nonlocal PDE for the state dynamics equation, the respective nonlocal adjoint dynamics, and the Euler equation, is solved iteratively following a standard Optimize-then-Discretize approach and an efficient numerical solver based on spectral methods. We demonstrate our approach for each of the two control inputs in simulation.
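In the finite-N setting, the synchronization mechanism and the polar order metric can be illustrated directly. The sketch below (an uncontrolled Kuramoto simulation with a constant coupling strength standing in for the multiplicative control input; all parameter values are illustrative) shows how the order parameter r responds to the coupling.

```python
import numpy as np

def simulate_kuramoto(kappa, N=500, T=2000, dt=0.01, seed=0):
    """Kuramoto swarm: N oscillators with phases theta_i and natural
    frequencies omega_i, coupled through the mean field.  The polar order
    parameter r = |(1/N) sum_j exp(i*theta_j)| measures phase coherence:
    r ~ 0 is incoherent, r ~ 1 is fully synchronized."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, N)
    omega = rng.normal(0.0, 0.1, N)        # narrow frequency spread
    for _ in range(T):
        z = np.exp(1j * theta).mean()      # complex order parameter r*e^{i psi}
        r, psi = np.abs(z), np.angle(z)
        # mean-field form of the Kuramoto coupling term
        theta += dt * (omega + kappa * r * np.sin(psi - theta))
    return np.abs(np.exp(1j * theta).mean())

r_weak = simulate_kuramoto(kappa=0.05)     # subcritical coupling: incoherent
r_strong = simulate_kuramoto(kappa=2.0)    # strong coupling: phase-locked
```

In the paper's setting, kappa becomes a space-time-dependent field chosen by the PDE-constrained optimization; here it is held constant merely to show the two regimes of the order metric.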

It is currently known how to characterize functions that neural networks can learn with SGD for two extremal parametrizations: neural networks in the linear regime, and neural networks with no structural constraints. However, for the main parametrization of interest —nonlinear but regular networks— no tight characterization has yet been achieved, despite significant developments. We take a step in this direction by considering depth-2 neural networks trained by SGD in the mean-field regime. We consider functions on binary inputs that depend on a latent low-dimensional subspace (i.e., small number of coordinates). This regime is of interest since it is poorly understood how neural networks routinely tackle high-dimensional datasets and adapt to latent low-dimensional structure without suffering from the curse of dimensionality. Accordingly, we study SGD-learnability with O(d) sample complexity in a large ambient dimension d. Our main results characterize a hierarchical property —the merged-staircase property— that is both necessary and nearly sufficient for learning in this setting. We further show that nonlinear training is necessary: for this class of functions, linear methods on any feature map (e.g., the NTK) are not capable of learning efficiently. The key tools are a new “dimension-free” dynamics approximation result that applies to functions defined on a latent space of low dimension, a proof of global convergence based on polynomial identity testing, and an improvement of lower bounds against linear methods for non-almost-orthogonal functions.