Neural networks with a large number of units admit a mean-field description, which has recently served as a theoretical explanation for the favorable training properties of “overparameterized” models. In this regime, gradient descent obeys a deterministic partial differential equation (PDE) that converges to a globally optimal solution for networks with a single hidden layer under appropriate assumptions. In this work, we propose a non-local mass transport dynamics that leads to a modified PDE with the same minimizer. We implement this non-local dynamics as a stochastic neuronal birth-death process and we prove that it accelerates the rate of convergence in the mean-field limit. We subsequently realize this PDE with two classes of numerical schemes that converge to the mean-field equation, each of which can easily be implemented for neural networks with finite numbers of units. We illustrate our algorithms with two models to provide intuition for the mechanism through which convergence is accelerated.
Cross-scale excitability in networks of quadratic integrate-and-fire neurons
From the action potentials of neurons and cardiac cells to the amplification of calcium signals in oocytes, excitability is a hallmark of many biological signalling processes. In recent years, excitability in single cells has been related to multiple-timescale dynamics through canards, special solutions which determine the effective thresholds of the all-or-none responses. However, the emergence of excitability in large populations remains an open problem. Here, we show that the mechanism of excitability in large networks and mean-field descriptions of coupled quadratic integrate-and-fire (QIF) cells mirrors that of the individual components. We first exploit the Ott-Antonsen ansatz to derive low-dimensional dynamics for the coupled network and use it to describe the structure of canards via slow periodic forcing. We demonstrate that the thresholds for onset and offset of population firing can be found in the same way as those of the single cell. We combine theoretical analysis and numerical computations to develop a novel and comprehensive framework for excitability in large populations, applicable not only to models amenable to Ott-Antonsen reduction, but also to networks without a closed-form mean-field limit, in particular sparse networks.
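As a rough illustration of the kind of low-dimensional description the abstract refers to, the sketch below integrates a two-variable mean-field model for an all-to-all QIF network under slow periodic forcing. The abstract does not state the equations. The code assumes the commonly used reduced firing-rate equations for QIF populations with Lorentzian-distributed excitabilities, dr/dt = Δ/π + 2rv and dv/dt = v² + η + Jr + I(t) − (πr)². The parameter values, the function names, and the sinusoidal form of the forcing are hypothetical choices for illustration, not the authors' setup.

```python
import math


def forcing(t, amp=3.0, period=40.0):
    """Slow sinusoidal drive I(t); amplitude and period are illustrative."""
    return amp * math.sin(2.0 * math.pi * t / period)


def mean_field_rhs(r, v, t, delta=1.0, eta=-5.0, J=15.0):
    """Right-hand side of the assumed reduced QIF firing-rate equations.

    r: population firing rate, v: mean membrane potential.
    delta: half-width of the Lorentzian excitability distribution.
    eta: centre of that distribution, J: synaptic coupling strength.
    """
    dr = delta / math.pi + 2.0 * r * v
    dv = v * v + eta + J * r + forcing(t) - (math.pi * r) ** 2
    return dr, dv


def simulate(T=80.0, dt=1e-3, r0=0.1, v0=-1.0):
    """Forward-Euler integration; returns lists of r and v samples."""
    n_steps = int(round(T / dt))
    rs, vs = [r0], [v0]
    r, v = r0, v0
    for k in range(n_steps):
        dr, dv = mean_field_rhs(r, v, k * dt)
        r, v = r + dt * dr, v + dt * dv
        rs.append(r)
        vs.append(v)
    return rs, vs


if __name__ == "__main__":
    rs, vs = simulate()
    print("min rate:", min(rs), "max rate:", max(rs))
```

Plotting `rs` against `forcing(t)` is one way to look for distinct onset and offset thresholds of population firing. Forward Euler is used only for brevity, and an adaptive solver would be a safer choice for careful threshold estimates.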
- Award ID(s): 1951099
- PAR ID: 10420009
- Editor(s): Blackwell, Kim T.
- Date Published:
- Journal Name: PLOS Computational Biology
- Volume: 18
- Issue: 10
- ISSN: 1553-7358
- Page Range / eLocation ID: e1010569
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
We analyze the dynamics of finite width effects in wide but finite feature learning neural networks. Starting from a dynamical mean field theory description of infinite width deep neural network kernel and prediction dynamics, we provide a characterization of the fluctuations of the dynamical mean field theory order parameters over random initializations of the network weights. Our results, while perturbative in width, unlike prior analyses, are non-perturbative in the strength of feature learning. We find that once the mean field/µP parameterization is adopted, the leading finite size effect on the dynamics is to introduce initialization variance in the predictions and feature kernels of the networks. In the lazy limit of network training, all kernels are random but static in time and the prediction variance has a universal form. However, in the rich, feature learning regime, the fluctuations of the kernels and predictions are dynamically coupled with a variance that can be computed self-consistently. In two layer networks, we show how feature learning can dynamically reduce the variance of the final tangent kernel and final network predictions. We also show how initialization variance can slow down online learning in wide but finite networks. In deeper networks, kernel variance can dramatically accumulate through subsequent layers at large feature learning strengths, but feature learning continues to improve the signal-to-noise ratio of the feature kernels. In discrete time, we demonstrate that large learning rate phenomena such as edge of stability effects can be well captured by infinite width dynamics and that initialization variance can decrease dynamically. For convolutional neural networks trained on CIFAR-10, we empirically find significant corrections to both the bias and variance of network dynamics due to finite width.
-
The relationship between complex brain oscillations and the dynamics of individual neurons is poorly understood. Here we utilize maximum caliber, a dynamical inference principle, to build a minimal yet general model of the collective (mean field) dynamics of large populations of neurons. In agreement with previous experimental observations, we describe a simple, testable mechanism, involving only a single type of neuron, by which many of these complex oscillatory patterns may emerge. Our model predicts that the refractory period of neurons, which has often been neglected, is essential for these behaviors.
-
The framework of mean-field games (MFGs) is used for modeling the collective dynamics of large populations of non-cooperative decision-making agents. We formulate and analyze a kinetic MFG model for an interacting system of non-cooperative motile agents with inertial dynamics and finite-range interactions, where each agent is minimizing a biologically inspired cost function. By analyzing the associated coupled forward–backward-in-time system of nonlinear Fokker–Planck and Hamilton–Jacobi–Bellman equations, we obtain conditions for closed-loop linear stability of the spatially homogeneous MFG equilibrium that corresponds to an ordered state with non-zero mean speed. Using a combination of analysis and numerical simulations, we show that when the energetic cost of control is reduced below a critical value, this equilibrium loses stability, and the system transitions to a traveling wave solution. Our work provides a game-theoretic perspective on the problem of collective motion in non-equilibrium biological and bio-inspired systems.
-
We study a system of coupled phase oscillators near a saddle-node on invariant circle bifurcation and driven by random intrinsic frequencies. Under the variation of control parameters, the system undergoes a phase transition changing the qualitative properties of collective dynamics. Using Ott–Antonsen reduction and geometric techniques for ordinary differential equations, we identify a heteroclinic bifurcation in a family of vector fields on a cylinder, which explains the change in collective dynamics. Specifically, we show that the heteroclinic bifurcation separates two topologically distinct families of limit cycles: contractible limit cycles before the bifurcation from noncontractible ones after it. Both families are stable for the model at hand.