

Title: Task-Driven Convolutional Recurrent Models of the Visual System
Abstract: Feed-forward convolutional neural networks (CNNs) are currently state-of-the-art for object classification tasks such as ImageNet. Further, they are quantitatively accurate models of temporally averaged responses of neurons in the primate brain's visual system. However, biological visual systems have two ubiquitous architectural features not shared with typical CNNs: local recurrence within cortical areas, and long-range feedback from downstream areas to upstream areas. Here we explored the role of recurrence in improving classification performance. We found that standard forms of recurrence (vanilla RNNs and LSTMs) do not perform well within deep CNNs on the ImageNet task. In contrast, novel cells that incorporated two structural features, bypassing and gating, were able to boost task accuracy substantially. We extended these design principles in an automated search over thousands of model architectures, which identified novel local recurrent cells and long-range feedback connections useful for object recognition. Moreover, these task-optimized ConvRNNs matched the dynamics of neural activity in the primate visual system better than feedforward networks, suggesting a role for the brain's recurrent connections in performing difficult visual behaviors.
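The two structural features credited with the accuracy gains, bypassing and gating, can be illustrated with a minimal recurrent convolutional cell. The sketch below is a rough illustration only: the module name GatedBypassConvCell, the single sigmoid gate, and all layer sizes are assumptions made for exposition, not the cells found by the paper's architecture search.

```python
import torch
import torch.nn as nn

class GatedBypassConvCell(nn.Module):
    """Minimal sketch of a convolutional recurrent cell with gating and a
    bypass (skip) path, loosely in the spirit of the ConvRNN cells described
    above. Not the exact cell discovered by the architecture search."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Feedforward transformation of the layer input.
        self.ff = nn.Conv2d(channels, channels, kernel_size, padding=pad)
        # Recurrent transformation of the cell's previous state.
        self.rec = nn.Conv2d(channels, channels, kernel_size, padding=pad)
        # Gate that mixes the recurrent update with the bypassed input.
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size, padding=pad)

    def forward(self, x, h_prev):
        # x: feedforward input at this timestep; h_prev: previous hidden state.
        update = torch.relu(self.ff(x) + self.rec(h_prev))
        g = torch.sigmoid(self.gate(torch.cat([x, h_prev], dim=1)))
        # Gating: convex combination of the recurrent update and the bypassed
        # input, so the feedforward signal can skip the recurrence entirely.
        h = g * update + (1.0 - g) * x
        return h

# Usage: unroll the cell for a few timesteps on a static image batch.
cell = GatedBypassConvCell(channels=64)
x = torch.randn(2, 64, 28, 28)
h = torch.zeros_like(x)
for _ in range(4):
    h = cell(x, h)
```

The gate lets the cell interpolate between its recurrent update and the raw feedforward input, which is one simple way to realize bypassing without discarding the recurrent state.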
Award ID(s):
1703161
NSF-PAR ID:
10082848
Author(s) / Creator(s):
Date Published:
Journal Name:
Advances in Neural Information Processing Systems 2018
Page Range / eLocation ID:
5295-5306
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. INTRODUCTION A brainwide, synaptic-resolution connectivity map—a connectome—is essential for understanding how the brain generates behavior. However, because of technological constraints, imaging entire brains with electron microscopy (EM) and reconstructing circuits from such datasets have been challenging. To date, complete connectomes have been mapped for only three organisms, each with several hundred brain neurons: the nematode C. elegans, the larva of the sea squirt Ciona intestinalis, and the marine annelid Platynereis dumerilii. Synapse-resolution circuit diagrams of larger brains, such as those of insects, fish, and mammals, have been approached by considering select subregions in isolation. However, neural computations span spatially dispersed but interconnected brain regions, and understanding any one computation requires the complete brain connectome with all its inputs and outputs.
RATIONALE We therefore generated a connectome of an entire brain of a small insect, the larva of the fruit fly Drosophila melanogaster. This animal displays a rich behavioral repertoire, including learning, value computation, and action selection, and shares homologous brain structures with adult Drosophila and larger insects. Powerful genetic tools are available for selective manipulation or recording of individual neuron types. In this tractable model system, hypotheses about the functional roles of specific neurons and circuit motifs revealed by the connectome can therefore be readily tested.
RESULTS The complete synaptic-resolution connectome of the Drosophila larval brain comprises 3016 neurons and 548,000 synapses. We performed a detailed analysis of the brain circuit architecture, including connection and neuron types, network hubs, and circuit motifs. Most of the brain’s in-out hubs (73%) were postsynaptic to the learning center or presynaptic to the dopaminergic neurons that drive learning. We used graph spectral embedding to hierarchically cluster neurons based on synaptic connectivity into 93 neuron types, which were internally consistent based on other features, such as morphology and function. We developed an algorithm to track brainwide signal propagation across polysynaptic pathways and analyzed feedforward (from sensory to output) and feedback pathways, multisensory integration, and cross-hemisphere interactions. We found extensive multisensory integration throughout the brain and multiple interconnected pathways of varying depths from sensory neurons to output neurons, forming a distributed processing network. The brain had a highly recurrent architecture, with 41% of neurons receiving long-range recurrent input. However, recurrence was not evenly distributed and was especially high in areas implicated in learning and action selection. Dopaminergic neurons that drive learning were amongst the most recurrent neurons in the brain. Many contralateral neurons, which projected across brain hemispheres, were in-out hubs and synapsed onto each other, facilitating extensive interhemispheric communication. We also analyzed interactions between the brain and nerve cord. We found that descending neurons targeted a small fraction of premotor elements that could play important roles in switching between locomotor states. A subset of descending neurons targeted low-order post-sensory interneurons, likely modulating sensory processing.
CONCLUSION The complete brain connectome of the Drosophila larva will be a lasting reference study, providing a basis for a multitude of theoretical and experimental studies of brain function. The approach and computational tools generated in this study will facilitate the analysis of future connectomes. Although the details of brain organization differ across the animal kingdom, many circuit architectures are conserved. As more brain connectomes of other organisms are mapped in the future, comparisons between them will reveal both the common, and therefore potentially optimal, circuit architectures and the idiosyncratic ones that underlie behavioral differences between organisms. Some of the architectural features observed in the Drosophila larval brain, including multilayer shortcuts and prominent nested recurrent loops, are found in state-of-the-art artificial neural networks, where they can compensate for a lack of network depth and support arbitrary, task-dependent computations. Such features could therefore increase the brain’s computational capacity, overcoming physiological constraints on the number of neurons. Future analysis of similarities and differences between brains and artificial neural networks may help in understanding brain computational principles and perhaps inspire new machine learning architectures.
Figure: The connectome of the Drosophila larval brain. The morphologies of all brain neurons, reconstructed from a synapse-resolution EM volume, and the synaptic connectivity matrix of an entire brain. This connectivity information was used to hierarchically cluster all brain neurons into 93 cell types, which were internally consistent based on morphology and known function.
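One analysis step mentioned above, graph spectral embedding of the synaptic connectivity, can be sketched on a toy matrix. Everything below (the synthetic connectivity, the choice of two clusters, and the use of scikit-learn's SpectralClustering on a symmetrized graph) is an illustrative assumption; the study's actual pipeline for clustering neurons into 93 types is considerably more involved.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Toy directed synaptic-count matrix for 12 "neurons" with two planted groups
# (purely synthetic; a real connectome has thousands of neurons).
rng = np.random.default_rng(0)
n = 12
A = rng.poisson(0.5, size=(n, n)).astype(float)
A[:6, :6] += rng.poisson(4.0, size=(6, 6))   # dense block: group 1
A[6:, 6:] += rng.poisson(4.0, size=(6, 6))   # dense block: group 2
np.fill_diagonal(A, 0.0)

# Symmetrize the directed graph so it can serve as an affinity matrix,
# then embed and cluster with a standard spectral method.
affinity = A + A.T
labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0
).fit_predict(affinity)
print(labels)  # neurons in the two planted groups should share a label
```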
  2. Zhou, Dongzhuo Douglas (Ed.)
    This paper uses mathematical modeling to study the mechanisms of surround suppression in the primate visual cortex. We present a large-scale neural circuit model consisting of three interconnected components: LGN and two input layers (Layer 4Ca and Layer 6) of the primary visual cortex V1, covering several hundred hypercolumns. Anatomical structures are incorporated and physiological parameters from realistic modeling work are used. The remaining parameters are chosen to produce model outputs that emulate experimentally observed size-tuning curves. Our two main results are: (i) we discovered the character of the long-range connections in Layer 6 responsible for surround effects in the input layers; and (ii) we showed that a net-inhibitory feedback, i.e., feedback that excites I-cells more than E-cells, from Layer 6 to Layer 4 is conducive to producing surround properties consistent with experimental data. These results are obtained through parameter selection and model analysis. The effects of nonlinear recurrent excitation and inhibition are also discussed. A feature that distinguishes our model from previous modeling work on surround suppression is that we have tried to reproduce realistic lengthscales that are crucial for quantitative comparison with data. Due to its size and the large number of unknown parameters, the model is computationally challenging. We demonstrate a strategy that involves first locating baseline values for relevant parameters using a linear model, followed by the introduction of nonlinearities where needed. We find such a methodology effective, and propose it as a possibility in the modeling of complex biological systems. 
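As a rough illustration of result (ii), the toy two-population rate model below gives "Layer 6" feedback a stronger weight onto I-cells than onto E-cells, so increasing the feedback drive with stimulus size suppresses the E-cell steady state. All weights and drives are made-up values chosen for stability, the feedforward drive is held fixed rather than modeled through LGN and hundreds of hypercolumns, and the sketch reproduces only the suppressive limb of a size-tuning curve; it is not the paper's large-scale model.

```python
import numpy as np

def steady_state_rates(stim_size, w_ee=1.0, w_ei=1.5, w_ie=1.2, w_ii=1.0,
                       fb_to_e=0.4, fb_to_i=1.0, dt=0.01, steps=5000):
    """Toy two-population (E/I) rate model of a Layer 4 unit receiving fixed
    feedforward drive plus Layer 6 feedback that grows with stimulus size.
    fb_to_i > fb_to_e makes the feedback net-inhibitory, as in (ii) above.
    All weights are illustrative, not fitted model parameters."""
    drive = 1.0                       # feedforward (LGN-like) drive, held fixed
    fb = 0.5 * stim_size              # feedback drive grows with stimulus size
    rE, rI = 0.0, 0.0
    relu = lambda v: max(v, 0.0)
    for _ in range(steps):            # Euler integration to steady state
        dE = -rE + relu(drive + fb_to_e * fb + w_ee * rE - w_ei * rI)
        dI = -rI + relu(fb_to_i * fb + w_ie * rE - w_ii * rI)
        rE += dt * dE
        rI += dt * dI
    return rE, rI

# E-cell response shrinks as the stimulus (and hence the net-inhibitory
# feedback from Layer 6) grows: a simplified stand-in for surround suppression.
for size in [0.5, 1.0, 2.0, 4.0]:
    rE, _ = steady_state_rates(size)
    print(f"stim size {size:>3}: E rate = {rE:.3f}")
```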
  3. In recent years, Convolutional Neural Networks (CNNs) have shown superior capability in visual learning tasks. While CNNs provide unprecedented accuracy, they are also known to be computationally intensive and energy demanding on modern computer systems. In this paper, we propose Virtual Pooling (ViP), a model-level approach to improve the speed and energy efficiency of CNN-based image classification and object detection, with a provable error bound. We show the efficacy of ViP through experiments on four CNN models, three representative datasets, both desktop and mobile platforms, and two visual learning tasks, i.e., image classification and object detection. For example, ViP delivers a 2.1x speedup with less than 1.5% accuracy degradation in ImageNet classification on VGG16, and a 1.8x speedup with 0.025 mAP degradation in PASCAL VOC object detection with Faster-RCNN. ViP also reduces mobile GPU and CPU energy consumption by up to 55% and 70%, respectively. As a complementary method to existing acceleration approaches, ViP achieves a 1.9x speedup on ThiNet, leading to a combined speedup of 5.23x on VGG16. Furthermore, ViP provides a knob for machine learning practitioners to generate a set of CNN models with varying trade-offs between system speed/energy consumption and accuracy, to better accommodate the requirements of their tasks. Code is available at https://github.com/cmu-enyac/VirtualPooling.
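The abstract above does not spell out the ViP mechanism, so the sketch below only illustrates the general flavor of trading spatial computation for interpolation: compute a convolution on a coarser grid and interpolate the result back to full resolution. The class name StridedConvWithUpsample and the stride-2-plus-bilinear scheme are assumptions made for illustration, not the paper's ViP operator; consult the linked repository for the actual implementation and its error bound.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StridedConvWithUpsample(nn.Module):
    """Hypothetical stand-in for a 'virtual pooling'-style layer: compute the
    convolution on a coarser grid (stride 2), then bilinearly interpolate back
    to full resolution so downstream layers are unchanged. This trades some
    accuracy for roughly a 4x reduction in conv FLOPs; it is NOT the exact
    ViP operator from the paper (see the cmu-enyac/VirtualPooling repo)."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride=2, padding=pad)

    def forward(self, x):
        h, w = x.shape[-2:]
        y = self.conv(x)  # coarse (roughly half-resolution) output
        return F.interpolate(y, size=(h, w), mode="bilinear", align_corners=False)

# Usage: keeps the output shape of a stride-1 convolution.
layer = StridedConvWithUpsample(64, 128)
x = torch.randn(1, 64, 56, 56)
print(layer(x).shape)  # torch.Size([1, 128, 56, 56])
```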
  4. 3D Convolutional Neural Networks (3D-CNNs) have been used for object recognition based on the voxelized shape of an object. However, interpreting the decision-making process of these 3D-CNNs remains challenging. In this paper, we present a 3D-CNN-based Gradient-weighted Class Activation Mapping method (3D-GradCAM) for visual explanations of the distinct local geometric features of interest within an object. To enable efficient learning of 3D geometries, we augment the voxel data with surface normals of the object boundary. We then train a 3D-CNN with this augmented data and identify the local features critical for decision-making using 3D-GradCAM. An application of this feature-identification framework is recognizing difficult-to-manufacture drilled-hole features in a complex CAD geometry. The framework can be extended to identify difficult-to-manufacture features at multiple spatial scales, leading to a real-time design-for-manufacturability decision support system.
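Grad-CAM itself is a standard recipe (weight each feature channel by the spatially averaged gradient of the class score, sum, and rectify), and the sketch below simply applies that recipe in 3D to a toy voxel classifier. The tiny network, the retain_grad-based implementation, and all shapes are illustrative assumptions, not the paper's 3D-GradCAM code or its surface-normal augmentation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy voxel classifier; the last conv layer is where the class activation
# map is computed (illustrative architecture, not the paper's network).
class TinyVoxelNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(16, n_classes)

    def forward(self, x):
        f = self.features(x)            # (B, 16, D, H, W)
        pooled = f.mean(dim=(2, 3, 4))  # global average pool
        return self.head(pooled), f

def grad_cam_3d(model, voxels, target_class):
    """Standard Grad-CAM recipe in 3D: weight each feature channel by the
    spatially averaged gradient of the class score, sum, and apply ReLU."""
    logits, fmaps = model(voxels)
    fmaps.retain_grad()
    logits[0, target_class].backward()
    weights = fmaps.grad.mean(dim=(2, 3, 4), keepdim=True)  # (B, C, 1, 1, 1)
    cam = F.relu((weights * fmaps).sum(dim=1))               # (B, D, H, W)
    # Upsample the coarse map to the input voxel grid for visualization.
    cam = F.interpolate(cam.unsqueeze(1), size=voxels.shape[-3:],
                        mode="trilinear", align_corners=False)
    return cam.squeeze(1).detach()

model = TinyVoxelNet()
voxels = torch.randn(1, 1, 32, 32, 32)   # e.g. a voxelized CAD part
cam = grad_cam_3d(model, voxels, target_class=1)
print(cam.shape)  # torch.Size([1, 32, 32, 32])
```

Regions where the map is large are the local geometric features the classifier relied on, which is the kind of evidence the framework above uses to flag difficult-to-manufacture features.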