
Title: Depth-Separation with Multilayer Mean-Field Networks
The mean-field limit has been successfully applied to neural networks, leading to many results on optimizing overparameterized networks. However, existing works often focus on two-layer networks and/or require a large number of neurons. We give a new framework for extending the mean-field limit to multilayer networks, and show that a polynomial-size three-layer network in our framework can learn the function constructed by Safran et al. (2019), which is known to be not approximable by any two-layer network.
Journal Name: International Conference on Learning Representations
Sponsoring Org: National Science Foundation
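As a minimal illustration of the mean-field viewpoint (a sketch under i.i.d. Gaussian initialization, not the paper's multilayer framework): in mean-field scaling a two-layer network averages over its units, so by the law of large numbers its output concentrates around an integral over the parameter distribution as the width N grows.

```python
import numpy as np

def two_layer_mean_field(x, N, rng):
    # Mean-field scaling: the output is the *average* (1/N) sum_i a_i * relu(w_i . x),
    # so as N -> infinity it converges to an integral over the distribution of (a, w).
    a = rng.standard_normal(N)
    W = rng.standard_normal((N, x.shape[0]))
    return np.mean(a * np.maximum(W @ x, 0.0))

rng = np.random.default_rng(0)
x = np.ones(5)
outs = [two_layer_mean_field(x, N, rng) for N in (100, 10_000, 1_000_000)]
```

Here the population average concentrates around E[a * relu(w . x)] = 0, because the outer weights are centered and independent of the inner ones; the fluctuations shrink at rate 1/sqrt(N).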
More Like this
  1. Neural networks with a large number of units admit a mean-field description, which has recently served as a theoretical explanation for the favorable training properties of "overparameterized" models. In this regime, gradient descent obeys a deterministic partial differential equation (PDE) that converges to a globally optimal solution for networks with a single hidden layer under appropriate assumptions. In this work, we propose a non-local mass transport dynamics that leads to a modified PDE with the same minimizer. We implement this non-local dynamics as a stochastic neuronal birth-death process and we prove that it accelerates the rate of convergence in the mean-field limit. We subsequently realize this PDE with two classes of numerical schemes that converge to the mean-field equation, each of which can easily be implemented for neural networks with finite numbers of units. We illustrate our algorithms with two models to provide intuition for the mechanism through which convergence is accelerated.
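The neuronal birth-death mechanism can be caricatured in a few lines (a schematic sketch, not the paper's exact process; the death rates and the uniform cloning rule here are hypothetical placeholders): neurons die at a state-dependent rate and are replaced by copies of survivors, which transports mass non-locally in parameter space, unlike the local transport of plain gradient descent.

```python
import numpy as np

def birth_death_step(params, rates, rng, dt=0.1):
    # Schematic birth-death resampling: neuron i dies with probability
    # ~ dt * rates[i] and, to keep the population size fixed, is replaced
    # by a copy of a neuron chosen uniformly at random. Cloning teleports
    # mass non-locally, unlike local gradient transport.
    n = len(params)
    new_params = params.copy()
    for i in range(n):
        if rng.random() < dt * max(rates[i], 0.0):
            j = rng.integers(n)       # clone a random neuron
            new_params[i] = params[j]
    return new_params

rng = np.random.default_rng(1)
params = np.arange(10.0)               # toy 1-D neuron parameters
rates = np.linspace(0.0, 5.0, 10)      # toy death rates (e.g. per-neuron loss)
updated = birth_death_step(params, rates, rng)
```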
  2.
    The multilayer network framework has served to describe and uncover a number of novel and unforeseen physical behaviors and regimes in interacting complex systems. However, the majority of existing studies are built on undirected multilayer networks, while most complex systems in nature exhibit directed interactions. Here, we propose a framework to analyze diffusive dynamics on multilayer networks consisting of at least one directed layer. We rigorously demonstrate that directionality in multilayer networks can fundamentally change the behavior of diffusive dynamics: from monotonic (in undirected systems) to non-monotonic diffusion with respect to the interlayer coupling strength. Moreover, for certain multilayer network configurations, the directionality can induce a unique superdiffusion regime for intermediate values of the interlayer coupling, wherein the diffusion is even faster than that corresponding to the theoretical limit for undirected systems, i.e. the diffusion in the integrated network obtained from the aggregation of each layer. We theoretically and numerically show that the existence of superdiffusion is fully determined by the directionality of each layer and the topological overlap between layers. We further provide a formulation of multilayer networks displaying superdiffusion. Our results highlight the significance of incorporating the directionality of interactions in multilevel networked systems and provide a framework to analyze dynamical processes on interconnected complex systems with directionality.
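The quantity behind these statements can be probed numerically (a toy sketch, assuming the standard supra-Laplacian formulation of two-layer multiplex diffusion; the example layers are made up): the asymptotic diffusion time scale is governed by the smallest nonzero real part of the supra-Laplacian spectrum, which for directed layers need not vary monotonically with the interlayer coupling p.

```python
import numpy as np

def supra_laplacian(L1, L2, p):
    # Two-layer multiplex: intralayer Laplacians on the block diagonal,
    # interlayer coupling of strength p between replicas of the same node.
    n = L1.shape[0]
    I = np.eye(n)
    return np.block([[L1 + p * I, -p * I],
                     [-p * I,     L2 + p * I]])

def diffusion_rate(S):
    # Diffusion rate ~ second-smallest real part of the spectrum
    # (the supra-Laplacian is non-symmetric when a layer is directed).
    ev = np.sort(np.linalg.eigvals(S).real)
    return ev[1]

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

# Toy layers: a directed 3-cycle coupled to an undirected path.
A1 = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)  # directed
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # undirected
rates = {p: diffusion_rate(supra_laplacian(laplacian(A1), laplacian(A2), p))
         for p in (0.1, 1.0, 10.0)}
```

Sweeping p and comparing against the aggregated network's rate is the diagnostic for the superdiffusion regime the abstract describes.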

  3. Beck, Jeff (Ed.)
    Characterizing metastable neural dynamics in finite-size spiking networks remains a daunting challenge. We propose to address this challenge in the recently introduced replica-mean-field (RMF) limit. In this limit, networks are made of infinitely many replicas of the finite network of interest, but with randomized interactions across replicas. Such randomization renders certain excitatory networks fully tractable at the cost of neglecting activity correlations, but with explicit dependence on the finite size of the neural constituents. However, metastable dynamics typically unfold in networks with mixed inhibition and excitation. Here, we extend the RMF computational framework to point-process-based neural network models with exponential stochastic intensities, allowing for mixed excitation and inhibition. Within this setting, we show that metastable finite-size networks admit multistable RMF limits, which are fully characterized by stationary firing rates. Technically, these stationary rates are determined as the solutions of a set of delayed differential equations under certain regularity conditions that any physical solutions shall satisfy. We solve this original problem by combining the resolvent formalism and singular-perturbation theory. Importantly, we find that these rates specify probabilistic pseudo-equilibria which accurately capture the neural variability observed in the original finite-size network. We also discuss the emergence of metastability as a stochastic bifurcation, which can be interpreted as a static phase transition in the RMF limits. In turn, we expect to leverage the static picture of RMF limits to infer purely dynamical features of metastable finite-size networks, such as the transition rates between pseudo-equilibria. 
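A bare-bones simulation of the model class in question (a discrete-time toy, assuming exponential intensities driven by exponentially decaying synaptic traces; the weights and trace model are illustrative, and this is not the replica-mean-field construction itself):

```python
import numpy as np

def simulate(W, b, T=2000, dt=0.01, tau=0.1, seed=0):
    # Point-process network with exponential stochastic intensities:
    # neuron i spikes in [t, t+dt) with probability ~ exp(b_i + sum_j W_ij h_j) dt,
    # where h_j is a decaying synaptic trace of neuron j's spikes.
    rng = np.random.default_rng(seed)
    n = len(b)
    h = np.zeros(n)
    counts = np.zeros(n)
    for _ in range(T):
        lam = np.exp(b + W @ h)                      # exponential intensity
        spikes = rng.random(n) < np.clip(lam * dt, 0.0, 1.0)
        counts += spikes
        h += -dt / tau * h + spikes                  # trace decay + jump on spike
    return counts / (T * dt)                         # empirical firing rates

# Mixed excitation (+) and inhibition (-), as in the extended RMF setting.
W = np.array([[0.0, -0.5], [0.3, 0.0]])
b = np.array([0.0, -0.2])
rates = simulate(W, b)
```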
  4. Blackwell, Kim T. (Ed.)
    From the action potentials of neurons and cardiac cells to the amplification of calcium signals in oocytes, excitability is a hallmark of many biological signalling processes. In recent years, excitability in single cells has been related to multiple-timescale dynamics through canards, special solutions which determine the effective thresholds of the all-or-none responses. However, the emergence of excitability in large populations remains an open problem. Here, we show that the mechanism of excitability in large networks and mean-field descriptions of coupled quadratic integrate-and-fire (QIF) cells mirrors that of the individual components. We initially exploit the Ott-Antonsen ansatz to derive low-dimensional dynamics for the coupled network and use it to describe the structure of canards via slow periodic forcing. We demonstrate that the thresholds for onset and offset of population firing can be found in the same way as those of the single cell. We combine theoretical analysis and numerical computations to develop a novel and comprehensive framework for excitability in large populations, applicable not only to models amenable to Ott-Antonsen reduction, but also to networks without a closed-form mean-field limit, in particular sparse networks.
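The low-dimensional dynamics obtained from the Ott-Antonsen ansatz for globally coupled QIF neurons with Lorentzian-distributed excitabilities can be integrated directly (a sketch in the standard firing-rate/mean-voltage form; the parameter values are illustrative, not taken from the paper):

```python
import numpy as np

def qif_mean_field(r0, v0, eta=-5.0, J=15.0, delta=1.0, I=0.0,
                   dt=1e-3, T=20.0):
    # Ott-Antonsen reduction of globally coupled QIF neurons with
    # Lorentzian heterogeneity (firing rate r, mean voltage v):
    #   dr/dt = delta/pi + 2*r*v
    #   dv/dt = v**2 + eta + J*r + I - (pi*r)**2
    r, v = r0, v0
    for _ in range(int(T / dt)):          # forward Euler integration
        dr = delta / np.pi + 2.0 * r * v
        dv = v * v + eta + J * r + I - (np.pi * r) ** 2
        r, v = r + dt * dr, v + dt * dv
    return r, v

r, v = qif_mean_field(1.0, -1.0)
```

Slowly ramping the drive I through threshold in this reduced system is how the canard structure of the population's onset and offset can be traced numerically.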
  5. We give the first provably efficient algorithm for learning a one-hidden-layer convolutional network with respect to a general class of (potentially overlapping) patches. Additionally, our algorithm requires only mild conditions on the underlying distribution. We prove that our framework captures commonly used schemes from computer vision, including one-dimensional and two-dimensional "patch and stride" convolutions. Our algorithm, Convotron, is inspired by recent work applying isotonic regression to learning neural networks. Convotron uses a simple, iterative update rule that is stochastic in nature and tolerant to noise (it requires only that the conditional mean function is a one-layer convolutional network, as opposed to the realizable setting). In contrast to gradient descent, Convotron requires no special initialization or learning-rate tuning to converge to the global optimum. We also point out that learning one hidden convolutional layer with respect to a Gaussian distribution and just one disjoint patch P (the other patches may be arbitrary) is easy in the following sense: Convotron can efficiently recover the hidden weight vector by updating only in the direction of P.
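The flavor of update described above can be sketched as follows (a hypothetical simplification, not the paper's exact Convotron rule: here the step moves the filter along the average patch, scaled by the prediction error, with zero initialization and a fixed learning rate):

```python
import numpy as np

def predict(w, patches):
    # One hidden convolutional layer: ReLU on each patch, then average pooling.
    return np.mean(np.maximum(patches @ w, 0.0))

def convotron_style_step(w, patches, y, lr=0.1):
    # Schematic isotonic-regression-style update (hypothetical simplification):
    # no special initialization, step direction = error * average patch.
    err = y - predict(w, patches)
    return w + lr * err * patches.mean(axis=0)

rng = np.random.default_rng(2)
w_true = rng.standard_normal(4)          # hidden filter to recover
w = np.zeros(4)                          # no special initialization
for _ in range(3000):
    x = rng.standard_normal(12)
    patches = x.reshape(3, 4)            # 1-D "patch and stride", stride 4 (disjoint)
    w = convotron_style_step(w, patches, predict(w_true, patches))
```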