Title: Deep Neural Model Inspection and Comparison via Functional Neuron Pathways
We introduce a general method for the interpretation and comparison of neural models. The method is used to factor a complex neural model into its functional components, which are composed of sets of co-firing neurons that cut across layers of the network architecture, and which we call neural pathways. The function of these pathways can be understood by identifying correlated task-level and linguistic heuristics, in such a way that this knowledge acts as a lens for approximating what the network has learned to apply to its intended task. As a case study for investigating the utility of these pathways, we present an examination of pathways identified in models trained for two standard tasks, namely Named Entity Recognition and Recognizing Textual Entailment.
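The abstract describes extracting groups of co-firing neurons that span layers and interpreting them through correlated task-level and linguistic properties. The sketch below illustrates one plausible way such an analysis could look: factor a network-wide activation matrix into co-firing groups with non-negative matrix factorization and correlate each group's activity with a hypothetical linguistic feature. The factorization choice, the feature, and all names are assumptions for illustration, not the paper's exact procedure.

```python
# Illustrative sketch only: one plausible way to extract "pathway"-like
# co-firing neuron groups and correlate them with linguistic features.
# The factorization method (NMF), the feature choice, and all variable
# names are assumptions, not the paper's exact procedure.
import numpy as np
from sklearn.decomposition import NMF
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Activations collected from all layers, flattened per input example:
# rows = examples, columns = neurons (across the whole network).
activations = rng.random((500, 256))

# Hypothetical per-example linguistic feature, e.g. sentence length.
sentence_length = rng.integers(5, 40, size=500)

# Factor activations into k "pathways": W gives each example's pathway
# activity, H gives each pathway's neuron membership weights.
k = 8
nmf = NMF(n_components=k, init="nndsvda", random_state=0, max_iter=500)
W = nmf.fit_transform(activations)   # (examples, pathways)
H = nmf.components_                  # (pathways, neurons)

# Interpret each pathway by how strongly its activity tracks the feature.
for p in range(k):
    rho, _ = spearmanr(W[:, p], sentence_length)
    top_neurons = np.argsort(H[p])[::-1][:10]
    print(f"pathway {p}: corr with sentence length = {rho:+.2f}, "
          f"top neurons = {top_neurons.tolist()}")
```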
Award ID(s): 1822831, 1443068
NSF-PAR ID: 10174703
Author(s) / Creator(s):
Date Published:
Journal Name: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Page Range / eLocation ID: 5754–5764
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. INTRODUCTION A brainwide, synaptic-resolution connectivity map (a connectome) is essential for understanding how the brain generates behavior. However, because of technological constraints, imaging entire brains with electron microscopy (EM) and reconstructing circuits from such datasets have been challenging. To date, complete connectomes have been mapped for only three organisms, each with several hundred brain neurons: the nematode C. elegans, the larva of the sea squirt Ciona intestinalis, and the larva of the marine annelid Platynereis dumerilii. Synapse-resolution circuit diagrams of larger brains, such as those of insects, fish, and mammals, have been approached by considering select subregions in isolation. However, neural computations span spatially dispersed but interconnected brain regions, and understanding any one computation requires the complete brain connectome with all its inputs and outputs.

RATIONALE We therefore generated a connectome of an entire brain of a small insect, the larva of the fruit fly Drosophila melanogaster. This animal displays a rich behavioral repertoire, including learning, value computation, and action selection, and shares homologous brain structures with adult Drosophila and larger insects. Powerful genetic tools are available for selective manipulation or recording of individual neuron types. In this tractable model system, hypotheses about the functional roles of specific neurons and circuit motifs revealed by the connectome can therefore be readily tested.

RESULTS The complete synaptic-resolution connectome of the Drosophila larval brain comprises 3016 neurons and 548,000 synapses. We performed a detailed analysis of the brain circuit architecture, including connection and neuron types, network hubs, and circuit motifs. Most of the brain's in-out hubs (73%) were postsynaptic to the learning center or presynaptic to the dopaminergic neurons that drive learning. We used graph spectral embedding to hierarchically cluster neurons based on synaptic connectivity into 93 neuron types, which were internally consistent based on other features, such as morphology and function. We developed an algorithm to track brainwide signal propagation across polysynaptic pathways and analyzed feedforward (from sensory to output) and feedback pathways, multisensory integration, and cross-hemisphere interactions. We found extensive multisensory integration throughout the brain and multiple interconnected pathways of varying depths from sensory neurons to output neurons, forming a distributed processing network. The brain had a highly recurrent architecture, with 41% of neurons receiving long-range recurrent input. However, recurrence was not evenly distributed and was especially high in areas implicated in learning and action selection. Dopaminergic neurons that drive learning are among the most recurrent neurons in the brain. Many contralateral neurons, which projected across brain hemispheres, were in-out hubs and synapsed onto each other, facilitating extensive interhemispheric communication. We also analyzed interactions between the brain and nerve cord. We found that descending neurons targeted a small fraction of premotor elements that could play important roles in switching between locomotor states. A subset of descending neurons targeted low-order post-sensory interneurons, likely modulating sensory processing.

CONCLUSION The complete brain connectome of the Drosophila larva will be a lasting reference study, providing a basis for a multitude of theoretical and experimental studies of brain function. The approach and computational tools generated in this study will facilitate the analysis of future connectomes. Although the details of brain organization differ across the animal kingdom, many circuit architectures are conserved. As more brain connectomes of other organisms are mapped in the future, comparisons between them will reveal both common, and therefore potentially optimal, circuit architectures as well as the idiosyncratic ones that underlie behavioral differences between organisms. Some of the architectural features observed in the Drosophila larval brain, including multilayer shortcuts and prominent nested recurrent loops, are found in state-of-the-art artificial neural networks, where they can compensate for a lack of network depth and support arbitrary, task-dependent computations. Such features could therefore increase the brain's computational capacity, overcoming physiological constraints on the number of neurons. Future analysis of similarities and differences between brains and artificial neural networks may help in understanding brain computational principles and perhaps inspire new machine learning architectures.

Summary figure caption: The connectome of the Drosophila larval brain. The morphologies of all brain neurons, reconstructed from a synapse-resolution EM volume, and the synaptic connectivity matrix of an entire brain. This connectivity information was used to hierarchically cluster all brain neurons into 93 cell types, which were internally consistent based on morphology and known function.
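The results above mention graph spectral embedding followed by hierarchical clustering of neurons into types based on their synaptic connectivity. The sketch below shows a minimal version of that style of analysis on a toy connectivity matrix; the random data, embedding dimension, linkage method, and cluster count are illustrative assumptions, not the study's actual pipeline.

```python
# Illustrative sketch: spectral embedding of a (toy) synaptic connectivity
# matrix followed by hierarchical clustering into putative neuron types.
# The random matrix, embedding dimension, and clustering choices are
# assumptions for illustration, not the study's actual pipeline.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
n_neurons = 300

# Toy directed synapse-count matrix: entry (i, j) = synapses from i to j.
A = rng.poisson(0.2, size=(n_neurons, n_neurons)).astype(float)

# Embed each neuron by its outgoing and incoming connectivity profile
# using a truncated SVD of the concatenated adjacency structure.
d = 16
U, s, Vt = np.linalg.svd(np.hstack([A, A.T]), full_matrices=False)
embedding = U[:, :d] * s[:d]          # (neurons, d) spectral coordinates

# Hierarchically cluster the embedded neurons into candidate types.
Z = linkage(embedding, method="ward")
types = fcluster(Z, t=40, criterion="maxclust")
print("neurons per cluster:", np.bincount(types)[1:])
```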
  2. A major goal in neuroscience is to understand the relationship between an animal's behavior and how this behavior is encoded in the brain. A typical experiment therefore involves training an animal to perform a task and recording the activity of its neurons (brain cells) while the animal carries out the task. To complement these experimental results, researchers "train" artificial neural networks, simplified mathematical models of the brain that consist of simple neuron-like units, to simulate the same tasks on a computer. Unlike real brains, artificial neural networks provide complete access to the "neural circuits" responsible for a behavior, offering a way to study and manipulate the behavior in the circuit.

One open issue with this approach has been the way in which the artificial networks are trained. In a process known as reinforcement learning, animals learn from rewards (such as juice) that they receive when they choose actions that lead to the successful completion of a task. By contrast, the artificial networks are explicitly told the correct action. This not only differs from how animals learn but also limits the types of behavior that can be studied using artificial neural networks.

Recent advances in machine learning that combine reinforcement learning with artificial neural networks have now allowed Song et al. to train artificial networks to perform tasks in a way that mimics how animals learn. The networks consisted of two parts: a "decision network" that uses sensory information to select actions that lead to the greatest reward, and a "value network" that predicts how rewarding an action will be. Song et al. found that the resulting artificial "brain activity" closely resembled the activity found in the brains of animals, confirming that this method of training artificial neural networks may be a useful tool for neuroscientists who study the relationship between brains and behavior.

The training method explored by Song et al. represents only one step toward developing artificial neural networks that resemble the real brain. In particular, artificial neural networks modify the connections between their units in a vastly different way from how biological brains alter the connections between neurons. Future work will be needed to bridge this gap.
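The two-part architecture described above (a decision network plus a value network trained from reward signals) corresponds closely to a standard actor-critic setup. The sketch below shows a minimal version with a shared feature layer feeding a policy head and a value head, updated with an advantage-weighted policy-gradient loss; the dummy data, dimensions, and hyperparameters are illustrative assumptions rather than the networks used by Song et al.

```python
# Minimal illustrative actor-critic sketch: a "decision network" (policy
# head) and a "value network" (value head) sharing a feature layer,
# updated with an advantage-weighted policy-gradient loss. Dimensions,
# the dummy data, and hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn

class DecisionValueNet(nn.Module):
    def __init__(self, obs_dim=10, n_actions=3, hidden=64):
        super().__init__()
        self.core = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.policy_head = nn.Linear(hidden, n_actions)   # action logits
        self.value_head = nn.Linear(hidden, 1)             # reward prediction

    def forward(self, obs):
        h = self.core(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)

net = DecisionValueNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# One dummy update step on fake observations and rewards.
obs = torch.randn(32, 10)
rewards = torch.rand(32)                       # reward actually received
logits, values = net(obs)
dist = torch.distributions.Categorical(logits=logits)
actions = dist.sample()

advantage = rewards - values.detach()          # how much better than predicted
policy_loss = -(dist.log_prob(actions) * advantage).mean()
value_loss = (values - rewards).pow(2).mean()  # train the value network
loss = policy_loss + 0.5 * value_loss

opt.zero_grad()
loss.backward()
opt.step()
```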
  3. Probabilistic circuits (PCs), such as sum-product networks, efficiently represent large multivariate probability distributions. They are preferred in practice over other probabilistic representations, such as Bayesian and Markov networks, because PCs can solve marginal inference (MAR) tasks in time that scales linearly in the size of the network. Unfortunately, the most probable explanation (MPE) task and its generalization, the marginal maximum-a-posteriori (MMAP) inference task, remain NP-hard in these models. Inspired by recent work on using neural networks to generate near-optimal solutions to optimization problems such as integer linear programming, we propose an approach that uses neural networks to approximate MMAP inference in PCs. The key idea in our approach is to approximate the cost of an assignment to the query variables using a continuous multilinear function and then use the latter as a loss function. The two main benefits of our new method are that it is self-supervised and that, after the neural network is learned, it requires only linear time to output a solution. We evaluate our new approach on several benchmark datasets and show that it outperforms three competing linear-time approximations: max-product inference, max-marginal inference, and sequential estimation, which are used in practice to solve MMAP tasks in PCs.
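To make the key idea concrete, the sketch below applies the same general recipe to a toy sum-product circuit: relax the discrete query assignment to continuous values in [0, 1], evaluate the circuit's multilinear polynomial on that soft assignment, and train a small neural network on the negative log circuit value as a self-supervised loss. The toy circuit, the network, and all sizes are assumptions for illustration, not the paper's exact construction.

```python
# Illustrative sketch: approximating an MPE/MMAP-style query in a tiny
# probabilistic circuit by relaxing the discrete query assignment to
# continuous values, evaluating the circuit's multilinear polynomial on
# the soft assignment, and training a neural net with the negative log
# circuit value as a self-supervised loss. All pieces are toy assumptions.
import torch
import torch.nn as nn

# Toy sum-product circuit over 4 binary variables: a mixture of two
# fully factorized components. thetas[c, i] = P(X_i = 1 | component c).
weights = torch.tensor([0.3, 0.7])
thetas = torch.tensor([[0.9, 0.8, 0.1, 0.2],
                       [0.1, 0.2, 0.9, 0.7]])

def circuit_value(q):
    """Multilinear extension: q[:, i] in [0, 1] is a soft assignment to X_i."""
    # Leaf value under component c: q * theta + (1 - q) * (1 - theta).
    leaves = q.unsqueeze(1) * thetas + (1 - q.unsqueeze(1)) * (1 - thetas)
    prods = leaves.prod(dim=-1)            # product nodes
    return (weights * prods).sum(dim=-1)   # sum node

# Neural network: maps evidence on X_0, X_1 to a soft assignment of the
# query variables X_2, X_3.
net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2), nn.Sigmoid())
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

evidence = torch.randint(0, 2, (256, 2)).float()   # observed X_0, X_1
for step in range(200):
    q_query = net(evidence)
    q_full = torch.cat([evidence, q_query], dim=-1)
    loss = -torch.log(circuit_value(q_full) + 1e-12).mean()  # self-supervised
    opt.zero_grad(); loss.backward(); opt.step()

# After training, round the continuous outputs to get a discrete guess.
print((net(evidence[:5]) > 0.5).int())
```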

     
  4. Accurate uncertainty quantification is necessary to enhance the reliability of deep learning (DL) models in real-world applications. In the case of regression tasks, prediction intervals (PIs) should be provided along with the deterministic predictions of DL models. Such PIs are useful, or "high quality" (HQ), as long as they are sufficiently narrow and capture most of the probability density. In this article, we present a method to learn PIs for regression-based neural networks (NNs) automatically, in addition to the conventional target predictions. In particular, we train two companion NNs: one with a single output, the target estimate, and another with two outputs, the upper and lower bounds of the corresponding PI. Our main contribution is the design of a novel loss function for the PI-generation network that takes into account the output of the target-estimation network and has two optimization objectives: minimizing the mean PI width and ensuring PI integrity using constraints that implicitly maximize the PI probability coverage. Furthermore, we introduce a self-adaptive coefficient that balances both objectives within the loss function, which alleviates the task of fine-tuning. Experiments using a synthetic dataset, eight benchmark datasets, and a real-world crop yield prediction dataset showed that our method was able to maintain a nominal probability coverage and produce significantly narrower PIs, without detriment to its target estimation accuracy, when compared to the PIs generated by three state-of-the-art neural-network-based methods. In other words, our method was shown to produce higher-quality PIs.
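As a rough illustration of the kind of loss described above, the sketch below trains a PI-generation network whose loss combines mean interval width, a smooth penalty on any shortfall in coverage relative to a nominal level, and a term tying the interval to a companion point estimate, with a coefficient that adapts during training. The smooth coverage surrogate, the adaptation rule, and all hyperparameters are illustrative assumptions rather than the authors' exact formulation.

```python
# Illustrative sketch of a prediction-interval loss: minimize mean interval
# width while softly enforcing that a nominal fraction of targets fall
# inside the interval, with a self-adjusting balance coefficient.
# The coverage surrogate and update rule are assumptions for illustration.
import torch
import torch.nn as nn

def pi_loss(lower, upper, y, target_estimate, nominal=0.95, lam=1.0, sharp=25.0):
    # Soft indicator that y lies inside [lower, upper].
    inside = torch.sigmoid(sharp * (y - lower)) * torch.sigmoid(sharp * (upper - y))
    coverage = inside.mean()
    width = (upper - lower).mean()
    # Penalize only a shortfall in coverage; keep intervals around the
    # companion network's point estimate.
    coverage_penalty = torch.relu(nominal - coverage) ** 2
    containment = torch.relu(lower - target_estimate).mean() + \
                  torch.relu(target_estimate - upper).mean()
    return width + lam * coverage_penalty + containment

# PI network: two outputs per input, interpreted as lower and upper bounds.
pi_net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(pi_net.parameters(), lr=1e-3)

x = torch.linspace(-3, 3, 512).unsqueeze(-1)
y = torch.sin(x).squeeze(-1) + 0.2 * torch.randn(512)
point_estimate = torch.sin(x).squeeze(-1)   # stand-in for the target network

lam = 1.0
for step in range(300):
    out = pi_net(x)
    lower, upper = out[:, 0], out[:, 1]
    loss = pi_loss(lower, upper, y, point_estimate, lam=lam)
    opt.zero_grad(); loss.backward(); opt.step()
    # Self-adaptive coefficient: tighten the penalty while coverage is low.
    with torch.no_grad():
        cov = ((y > lower) & (y < upper)).float().mean()
        lam = lam * 1.01 if cov < 0.95 else lam * 0.99
```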