- Award ID(s): 1653589
- NSF-PAR ID: 10107650
- Date Published:
- Journal Name: Neural Computation
- Volume: 31
- Issue: 5
- ISSN: 0899-7667
- Page Range / eLocation ID: 943 to 979
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Aljadeff, Johnatan (Ed.)
Neural circuits exhibit complex activity patterns, both spontaneously and evoked by external stimuli. Information encoding and learning in neural circuits depend on how well time-varying stimuli can control spontaneous network activity. We show that in firing-rate networks in the balanced state, external control of recurrent dynamics, i.e., the suppression of internally-generated chaotic variability, strongly depends on correlations in the input. A distinctive feature of balanced networks is that, because common external input is dynamically canceled by recurrent feedback, it is far more difficult to suppress chaos with common input into each neuron than through independent input. To study this phenomenon, we develop a non-stationary dynamic mean-field theory for driven networks. The theory explains how the activity statistics and the largest Lyapunov exponent depend on the frequency and amplitude of the input, recurrent coupling strength, and network size, for both common and independent input. We further show that uncorrelated inputs facilitate learning in balanced networks.
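To make the setup concrete, here is a minimal sketch, assuming a standard driven random firing-rate model (not the authors' code or parameters): the largest Lyapunov exponent is estimated from the growth of a tangent-space perturbation under either common or per-neuron-phase-shifted sinusoidal drive. Network size, coupling strength, and input amplitude/frequency are illustrative assumptions.

```python
# Minimal sketch (assumed model, not the paper's code): x' = -x + J*tanh(x) + I(t),
# with the largest Lyapunov exponent estimated from tangent-vector growth.
# "Common" input drives every neuron with the same sinusoid; "independent" input
# is approximated here by per-neuron random phases.
import numpy as np

def largest_lyapunov(N=200, g=2.0, amp=1.0, freq=0.5, common=True,
                     T=200.0, dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))      # random recurrent coupling
    phases = np.zeros(N) if common else rng.uniform(0.0, 2.0 * np.pi, N)
    x = rng.normal(0.0, 1.0, N)                           # network state
    v = rng.normal(0.0, 1.0, N)
    v /= np.linalg.norm(v)                                # unit tangent perturbation
    lyap_sum = 0.0
    steps = int(T / dt)
    for k in range(steps):
        t = k * dt
        I = amp * np.sin(2.0 * np.pi * freq * t + phases)  # external sinusoidal drive
        r = np.tanh(x)
        x = x + dt * (-x + J @ r + I)                      # Euler step of the rate dynamics
        # Linearized (tangent) dynamics: v' = -v + J @ (phi'(x) * v), with phi' = 1 - tanh^2
        v = v + dt * (-v + J @ ((1.0 - r**2) * v))
        norm = np.linalg.norm(v)
        lyap_sum += np.log(norm)                           # accumulate log growth
        v /= norm                                          # renormalize to avoid overflow
    return lyap_sum / T                                    # estimated largest Lyapunov exponent

if __name__ == "__main__":
    print("common input:     ", largest_lyapunov(common=True))
    print("independent input:", largest_lyapunov(common=False))
```

A negative estimate indicates that the drive has suppressed the internally generated chaos; comparing the two calls gives a rough sense of how the common versus independent drive differs in this toy setting.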
-
Zhang, Yanqing (Ed.)
Learning from complex, multidimensional data has become central to computational mathematics, and among the most successful high-dimensional function approximators are deep neural networks (DNNs). Training DNNs is posed as an optimization problem to learn network weights or parameters that well-approximate a mapping from input to target data. Multiway data or tensors arise naturally in myriad ways in deep learning, in particular as input data and as high-dimensional weights and features extracted by the network, with the latter often being a bottleneck in terms of speed and memory. In this work, we leverage tensor representations and processing to efficiently parameterize DNNs when learning from high-dimensional data. We propose tensor neural networks (t-NNs), a natural extension of traditional fully-connected networks, that can be trained efficiently in a reduced, yet more powerful parameter space. Our t-NNs are built upon matrix-mimetic tensor-tensor products, which retain algebraic properties of matrix multiplication while capturing high-dimensional correlations. Mimeticity enables t-NNs to inherit desirable properties of modern DNN architectures. We exemplify this by extending recent work on stable neural networks, which interpret DNNs as discretizations of differential equations, to our multidimensional framework. We provide empirical evidence of the parametric advantages of t-NNs on dimensionality reduction using autoencoders and classification using fully-connected and stable variants on benchmark imaging datasets MNIST and CIFAR-10.
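As an illustration of the building block described above, the sketch below (my own, under simple assumptions; not the authors' implementation) implements the FFT-based tensor-tensor product and a single t-NN-style layer that applies an elementwise nonlinearity to W * X + b. The shapes and the tanh activation are assumptions for demonstration.

```python
# Minimal sketch of the t-product and one tensor-layer forward pass (illustrative only).
import numpy as np

def t_product(A, B):
    """Tensor-tensor product: FFT along the third mode, frontal-slice matmuls,
    then inverse FFT. A is l x p x n, B is p x m x n, result is l x m x n."""
    Ah = np.fft.fft(A, axis=2)
    Bh = np.fft.fft(B, axis=2)
    Ch = np.einsum('ipk,pjk->ijk', Ah, Bh)       # facewise matrix products
    return np.real(np.fft.ifft(Ch, axis=2))

def t_layer(W, X, b):
    """One tensor neural-network style layer: elementwise nonlinearity of W * X + b."""
    return np.tanh(t_product(W, X) + b)

# Toy forward pass: 10 lateral samples of 4 x 3 multiway features mapped to 2 x 3 features.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 10, 3))              # features x batch x third mode
W = rng.standard_normal((2, 4, 3)) * 0.1
b = rng.standard_normal((2, 1, 3)) * 0.1
Y = t_layer(W, X, b)
print(Y.shape)                                   # (2, 10, 3)
```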
-
Many theories assume that a sensory neuron's higher firing rate indicates a greater probability of its preferred stimulus. However, this contradicts 1) the adaptation phenomena where prolonged exposure to, and thus increased probability of, a stimulus reduces the firing rates of cells tuned to the stimulus; and 2) the observation that unexpected (low-probability) stimuli capture attention and increase neuronal firing. Other theories posit that the brain builds predictive/efficient codes for reconstructing sensory inputs. However, they cannot explain why the brain preserves some information while discarding the rest. We propose that in sensory areas, projection neurons' firing rates are proportional to optimal code length (i.e., negative log estimated probability), and their spike patterns are the code, for useful features in inputs. This hypothesis explains adaptation-induced changes of V1 orientation tuning curves and bottom-up attention. We discuss how the modern minimum-description-length (MDL) principle may help understand neural codes. Because regularity extraction is relative to a model class (defined by cells) via its optimal universal code (OUC), MDL matches the brain's purposeful, hierarchical processing without input reconstruction. Such processing enables input compression/understanding even when model classes do not contain true models. Top-down attention modifies lower-level OUCs via feedback connections to enhance transmission of behaviorally relevant information. Although OUCs concern lossless data compression, we suggest possible extensions to lossy, prefix-free neural codes for prompt, online processing of the most important aspects of stimuli while minimizing behaviorally relevant distortion. Finally, we discuss how neural networks might learn MDL's normalized maximum likelihood (NML) distributions from input data.
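A toy numerical illustration of the central hypothesis (firing rate proportional to optimal code length, i.e., negative log estimated probability): raising a feature's estimated probability, as prolonged exposure would, shortens its code length and therefore lowers the predicted firing rate. The spikes-per-bit scaling factor below is hypothetical.

```python
# Assumption-level sketch of the "rate ~ optimal code length" hypothesis, not the authors' model.
import numpy as np

def code_length_bits(p):
    return -np.log2(p)          # optimal code length for a feature with probability p

p_before, p_after = 0.05, 0.20  # estimated probability before vs. after prolonged exposure
k = 10.0                        # hypothetical spikes-per-bit scaling factor
print("predicted rate before adaptation:", k * code_length_bits(p_before))  # rarer feature, higher rate
print("predicted rate after adaptation: ", k * code_length_bits(p_after))   # commoner feature, lower rate
```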
-
It is currently known how to characterize functions that neural networks can learn with SGD for two extremal parametrizations: neural networks in the linear regime, and neural networks with no structural constraints. However, for the main parametrization of interest (non-linear but regular networks), no tight characterization has yet been achieved, despite significant developments. We take a step in this direction by considering depth-2 neural networks trained by SGD in the mean-field regime. We consider functions on binary inputs that depend on a latent low-dimensional subspace (i.e., a small number of coordinates). This regime is of interest since it is poorly understood how neural networks routinely tackle high-dimensional datasets and adapt to latent low-dimensional structure without suffering from the curse of dimensionality. Accordingly, we study SGD-learnability with O(d) sample complexity in a large ambient dimension d. Our main results characterize a hierarchical property, the merged-staircase property, that is both necessary and nearly sufficient for learning in this setting. We further show that non-linear training is necessary: for this class of functions, linear methods on any feature map (e.g., the NTK) are not capable of learning efficiently. The key tools are a new "dimension-free" dynamics approximation result that applies to functions defined on a latent space of low dimension, a proof of global convergence based on polynomial identity testing, and an improvement of lower bounds against linear methods for non-almost orthogonal functions.
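The sketch below is an assumption-level illustration of the setting (not the paper's experiments): a depth-2, one-hidden-layer network trained by plain SGD on uniform binary inputs whose target is a staircase-type function of only the first three coordinates, so the network must adapt to latent low-dimensional structure. Width, learning rate, and step counts are arbitrary choices.

```python
# Illustrative sketch: SGD on a one-hidden-layer network for a staircase-type target
# f(x) = x1 + x1*x2 + x1*x2*x3 on {-1,+1}^d. Hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, width, lr, steps, batch = 30, 256, 0.05, 5000, 64

def target(X):                                     # depends only on the first three coordinates
    return X[:, 0] + X[:, 0] * X[:, 1] + X[:, 0] * X[:, 1] * X[:, 2]

W = rng.standard_normal((width, d)) / np.sqrt(d)   # first-layer weights
a = rng.standard_normal(width) / width             # second-layer weights
b = np.zeros(width)

for step in range(steps):
    X = rng.choice([-1.0, 1.0], size=(batch, d))   # uniform binary inputs
    y = target(X)
    h = np.tanh(X @ W.T + b)                       # hidden activations
    err = h @ a - y                                # squared-loss residual
    grad_a = h.T @ err / batch                     # gradients of mean squared error
    gh = (err[:, None] * a) * (1.0 - h**2)
    grad_W = gh.T @ X / batch
    grad_b = gh.mean(axis=0)
    a -= lr * grad_a; W -= lr * grad_W; b -= lr * grad_b

X_test = rng.choice([-1.0, 1.0], size=(2000, d))
print("test MSE:", np.mean((np.tanh(X_test @ W.T + b) @ a - target(X_test))**2))
```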
-
Abstract. Objective. Many neural systems display spontaneous, spatiotemporal patterns of neural activity that are crucial for information processing. While these cascading patterns presumably arise from the underlying network of synaptic connections between neurons, the precise contribution of the network's local and global connectivity to these patterns and information processing remains largely unknown. Approach. Here, we demonstrate how network structure supports information processing through network dynamics in empirical and simulated spiking neurons using mathematical tools from linear systems theory, network control theory, and information theory. Main results. In particular, we show that activity, and the information that it contains, travels through cycles in real and simulated networks. Significance. Broadly, our results demonstrate how cascading neural networks could contribute to cognitive faculties that require lasting activation of neuronal patterns, such as working memory or attention.
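As a minimal illustration of the role of cycles (a simplification under assumed linear dynamics, not the paper's analysis), the sketch below compares a discrete-time linear system whose coupling matrix forms an open chain with one whose chain is closed into a directed cycle: a single input pulse dies out in the chain but keeps circulating, and thus persisting, in the cycle.

```python
# Toy sketch: persistence of a pulse in x[t+1] = A x[t] with and without a directed cycle.
import numpy as np

def propagate(A, steps=30):
    x = np.zeros(A.shape[0]); x[0] = 1.0        # single pulse into node 0
    energy = []
    for _ in range(steps):
        x = A @ x
        energy.append(np.linalg.norm(x))        # how much activity survives
    return np.array(energy)

n, w = 5, 0.9
chain = np.diag(np.full(n - 1, w), k=-1)        # 0 -> 1 -> ... -> 4, no cycle
cycle = chain.copy(); cycle[0, n - 1] = w       # close the loop: 4 -> 0

print("chain energy at t=30:", propagate(chain)[-1])   # pulse dies after reaching the end
print("cycle energy at t=30:", propagate(cycle)[-1])   # pulse keeps circulating, decaying by w per hop
```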