
Title: Universality and individuality in neural dynamics across large populations of recurrent networks
Many recent studies have employed task-based modeling with recurrent neural networks (RNNs) to infer the computational function of different brain regions. These models are often assessed by quantitatively comparing the low-dimensional neural dynamics of the model and the brain, for example using canonical correlation analysis (CCA). However, the nature of the detailed neurobiological inferences one can draw from such efforts remains elusive. For example, to what extent does training neural networks to solve simple tasks, prevalent in neuroscientific studies, uniquely determine the low-dimensional dynamics independent of neural architectures? Or alternatively, are the learned dynamics highly sensitive to different neural architectures? Knowing the answer to these questions has strong implications for whether and how to use task-based RNN modeling to understand brain dynamics. To address these foundational questions, we study populations of thousands of networks of commonly used RNN architectures trained to solve neuroscientifically motivated tasks and characterize their low-dimensional dynamics via CCA and nonlinear dynamical systems analysis. We find the geometry of the dynamics can be highly sensitive to different network architectures, and further find striking dissociations between geometric similarity as measured by CCA and network function, yielding a cautionary tale. Moreover, we find that while the geometry of neural dynamics can vary greatly across architectures, the underlying computational scaffold (the topological structure of fixed points, transitions between them, limit cycles, and linearized dynamics) often appears universal across all architectures. Overall, this analysis of universality and individuality across large populations of RNNs provides a much needed foundation for interpreting quantitative measures of dynamical similarity between RNN and brain dynamics.
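As an illustration of the CCA comparison the abstract refers to, the sketch below computes canonical correlations between two sets of state trajectories using a QR decomposition followed by an SVD. It is not the authors' pipeline: the function name `cca_correlations`, the synthetic data, and the assumption that each matrix is (time points x dimensions) with full column rank are all hypothetical choices made for this example. In practice, trajectories would typically be reduced to a few dimensions before this step.

```python
import numpy as np


def cca_correlations(X, Y):
    """Canonical correlations between two (samples x features) matrices.

    Assumes both matrices have full column rank after mean-centering.
    """
    # Center each feature so correlations are not driven by offsets.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic stand-ins for low-dimensional trajectories (hypothetical data).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
A = rng.standard_normal((3, 3))          # arbitrary linear change of basis
rho_same = cca_correlations(X, X @ A)    # same dynamics, different coordinates
rho_indep = cca_correlations(X, rng.standard_normal((500, 3)))
```

Because CCA is invariant to invertible linear transformations, `rho_same` is close to 1 in every dimension, whereas `rho_indep` stays small. This invariance is one reason high CCA similarity by itself says little about whether two networks compute in the same way.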
Journal Name:
Advances in neural information processing systems
Sponsoring Org:
National Science Foundation
More Like this
  1. Recurrent neural networks (RNNs) are a widely used tool for modeling sequential data, yet they are often treated as inscrutable black boxes. Given a trained recurrent network, we would like to reverse engineer it--to obtain a quantitative, interpretable description of how it solves a particular task. Even for simple tasks, a detailed understanding of how recurrent networks work, or a prescription for how to develop such an understanding, remains elusive. In this work, we use tools from dynamical systems analysis to reverse engineer recurrent networks trained to perform sentiment classification, a foundational natural language processing task. Given a trained network, we find fixed points of the recurrent dynamics and linearize the nonlinear system around these fixed points. Despite their theoretical capacity to implement complex, high-dimensional computations, we find that trained networks converge to highly interpretable, low-dimensional representations. In particular, the topological structure of the fixed points and corresponding linearized dynamics reveal an approximate line attractor within the RNN, which we can use to quantitatively understand how the RNN solves the sentiment analysis task. Finally, we find this mechanism present across RNN architectures (including LSTMs, GRUs, and vanilla RNNs) trained on multiple datasets, suggesting that our findings are not unique to a particular architecture or dataset. Overall, these results demonstrate that surprisingly universal and human interpretable computations can arise across a range of recurrent networks.
  2. Gutkin, Boris S. (Ed.)
    Converging evidence suggests the brain encodes time in dynamic patterns of neural activity, including neural sequences, ramping activity, and complex dynamics. Most temporal tasks, however, require more than just encoding time, and can have distinct computational requirements including the need to exhibit temporal scaling, generalize to novel contexts, or be robust to noise. It is not known how neural circuits can encode time and satisfy distinct computational requirements, nor is it known whether similar patterns of neural activity at the population level can exhibit dramatically different computational or generalization properties. To begin to answer these questions, we trained RNNs on two timing tasks based on behavioral studies. The tasks had different input structures but required producing identically timed output patterns. Using a novel framework, we quantified whether RNNs encoded two intervals using one of three different timing strategies: scaling, absolute, or stimulus-specific dynamics. We found that similar neural dynamic patterns at the level of single intervals could exhibit fundamentally different properties, including generalization, the connectivity structure of the trained networks, and the contribution of excitatory and inhibitory neurons. Critically, depending on the task structure, RNNs were better suited for generalization or robustness to noise. Further analysis revealed different connection patterns underlying the different regimes. Our results predict that apparently similar neural dynamic patterns at the population level (e.g., neural sequences) can exhibit fundamentally different computational properties with regard to their ability to generalize to novel stimuli and their robustness to noise, and that these differences are associated with differences in network connectivity and distinct contributions of excitatory and inhibitory neurons. We also predict that the task structure used in different experimental studies accounts for some of the experimentally observed variability in how networks encode time.
  3. The immediate neighborhood of virtually every brain neuron contains thin, meandering axons that release serotonin (5-HT). These axons, also referred to as serotonergic fibers, are present in nearly all studied nervous systems (both vertebrate and invertebrate) and appear to be a key component of biological neural networks. In the mammalian brain, they create dense meshworks that are macroscopically described by densities. It is not known how these densities arise from the trajectories of individual fibers, each of which resembles a unique random-walk path. This poses interesting theoretical questions, the answers to which will advance our understanding of brain plasticity and regeneration. For example, serotonin-associated psychedelics have recently been shown to promote global brain integration in depression [1], and serotonergic fibers are nearly unique in their ability to robustly regenerate in the adult mammalian brain [2]. We have recently introduced a conceptual framework that treats the serotonergic axons as “stochastic axons.” Stochastic axons are different from axons that support point-to-point connectivity (often studied with graph-theoretical methods) and require novel theoretical approaches. We have shown that serotonergic axons can potentially be modeled as paths of fractional Brownian motion (FBM) in the superdiffusive regime (with the Hurst exponent H > 0.5). Our supercomputing simulations demonstrate that particles driven by reflected FBM (rFBM) accumulate at the border enclosing the shape [3]. Likewise, serotonergic fibers tend to accumulate at the border of neural tissue, in addition to their general similarity to simulated FBM paths [4]. This work expands our previous simulations in 2D-brain-like shapes by considering the full 3D-geometry of the brain. This transition is not trivial and cannot be reduced to independent 2D-sections because increments of FBM trajectories exhibit long-range correlation.
Supercomputing simulations of rFBM (H > 0.5) were performed in the reconstructed 3D-geometry of a mouse brain at embryonic day 17 (serotonergic fibers are already well developed at this age and begin to invade the cortical plate). The obtained results were compared to the actual distribution of fibers in the mouse brain. In addition, we obtained preliminary results by simulating rFBM with a region-dependent H. This next step in complexity presents challenges (e.g., it can be highly sensitive to mathematical specifications), but it is necessary for the predictive modeling of interior fiber densities in heterogeneous brain tissue.
  4. Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series. Currently, most multiscale RNNs use fixed scales, which do not match the dynamical nature of temporal patterns among sequences. In this paper, we propose Adaptively Scaled Recurrent Neural Networks (ASRNN), a simple but efficient way to handle this problem. Instead of using predefined scales, ASRNNs are able to learn and adjust scales based on different temporal contexts, making them more flexible in modeling multiscale patterns. Compared with other multiscale RNNs, ASRNNs are endowed with dynamical scaling capabilities with much simpler structures, and are easy to integrate with various RNN cells. The experiments on multiple sequence modeling tasks indicate ASRNNs can efficiently adapt scales based on different sequence contexts and yield better performance than baselines without dynamical scaling abilities.
  5. Biological agents do not have infinite resources to learn new things. For this reason, a central aspect of human learning is the ability to recycle previously acquired knowledge in a way that allows for faster, less resource-intensive acquisition of new skills. In spite of that, how neural networks in the brain leverage existing knowledge to learn new computations is not well understood. In this work, we study this question in artificial recurrent neural networks (RNNs) trained on a corpus of commonly used neuroscience tasks. Combining brain-inspired inductive biases we call functional and structural, we propose a system that learns new tasks by building on top of pre-trained latent dynamics organised into separate recurrent modules. These modules, acting as prior knowledge acquired previously through evolution or development, are pre-trained on the statistics of the full corpus of tasks so as to be independent and maximally informative. The resulting model, which we call a Modular Latent Primitives (MoLaP) network, allows for learning multiple tasks while keeping parameter counts, and updates, low. We also show that the skills acquired with our approach are more robust to a broad range of perturbations compared to those acquired with other multi-task learning strategies, and that generalisation to new tasks is facilitated. This work offers a new perspective on achieving efficient multi-task learning in the brain, illustrating the benefits of leveraging pre-trained latent dynamical primitives.
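
Several of the abstracts above refer to finding fixed points of recurrent dynamics and linearizing around them. The sketch below shows one common way this can be done for a vanilla RNN update h -> tanh(W h + b): minimize q(h) = 0.5 * ||F(h) - h||^2 by gradient descent, then inspect the eigenvalues of the Jacobian at the solution. It is a minimal illustration, not the procedure used in any of the listed papers; the network size, the random weights, the learning rate, and the names `step`, `jacobian`, and `find_fixed_point` are all assumptions made for this example.

```python
import numpy as np


def step(h, W, b):
    """One update of a vanilla RNN with no input: F(h) = tanh(W h + b)."""
    return np.tanh(W @ h + b)


def jacobian(h, W, b):
    """Jacobian dF/dh evaluated at state h."""
    f = np.tanh(W @ h + b)
    return (1.0 - f**2)[:, None] * W


def find_fixed_point(W, b, h0, lr=0.1, n_iter=5000):
    """Gradient descent on q(h) = 0.5 * ||F(h) - h||^2."""
    h = h0.astype(float).copy()
    eye = np.eye(len(h))
    for _ in range(n_iter):
        r = step(h, W, b) - h                 # residual F(h) - h
        grad = (jacobian(h, W, b) - eye).T @ r  # gradient of q at h
        h -= lr * grad
    return h


# Hypothetical 4-unit network with weights scaled to spectral norm 0.5,
# so the update map is a contraction with a single stable fixed point.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
W = 0.5 * Q
b = rng.standard_normal(4)

h_star = find_fixed_point(W, b, h0=np.zeros(4))
residual = np.linalg.norm(step(h_star, W, b) - h_star)
eigvals = np.linalg.eigvals(jacobian(h_star, W, b))
```

Here `residual` measures how close `h_star` is to an exact fixed point, and the magnitudes of `eigvals` indicate local stability in discrete time: all magnitudes below 1 mean that nearby states decay back to the fixed point. Trained networks generally have many fixed points, so in practice the search is usually repeated from many initial states sampled along task trajectories.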