

Title: Evenly Cascaded Convolutional Networks
We introduce the Evenly Cascaded Convolutional Network (ECN), a neural network that takes inspiration from the cascade algorithm of wavelet analysis. ECN employs two feature streams - a low-level and a high-level stream. At each layer these streams interact, such that low-level features are modulated using advanced perspectives from the high-level stream. ECN is evenly structured: feature map dimensions are resized by a consistent ratio, which removes the burden of ad-hoc specification of feature map dimensions. ECN produces easily interpretable feature maps, a result whose intuition can be understood in the context of scale-space theory. We demonstrate that ECN's design facilitates the training process by providing easily trainable shortcuts. We report new state-of-the-art results for small networks, without the need for additional treatment such as pruning or compression - a consequence of ECN's simple structure and direct training. A 6-layer ECN design with under 500k parameters achieves 95.24% and 78.99% accuracy on the CIFAR-10 and CIFAR-100 datasets, respectively, outperforming the current state-of-the-art among small-parameter networks, and a 3 million parameter ECN produces results competitive with the state-of-the-art.
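The two-stream interaction described in the abstract can be sketched in a few lines. Below is a minimal PyTorch illustration under stated assumptions: the modulation operation (an additive residual update), the channel counts, and the resizing ratio are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal sketch of an evenly cascaded two-stream block (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ECNBlock(nn.Module):
    def __init__(self, low_ch: int, high_ch: int, ratio: float = 0.75):
        super().__init__()
        self.ratio = ratio
        # High-level stream extracts new features from the low-level stream.
        self.extract = nn.Conv2d(low_ch, high_ch, 3, padding=1)
        # The high-level stream produces a modulation signal for the
        # low-level features (assumed here to be an additive update).
        self.modulate = nn.Conv2d(high_ch, low_ch, 3, padding=1)

    def forward(self, low, high=None):
        if high is not None:
            low = low + self.modulate(high)       # modulate low-level features
        new_high = F.relu(self.extract(low))      # advance the high-level stream
        # Resize both streams by a consistent ratio ("evenly cascaded"),
        # so feature map sizes never need ad-hoc specification.
        size = [max(1, int(s * self.ratio)) for s in low.shape[-2:]]
        low = F.interpolate(low, size=size, mode='bilinear', align_corners=False)
        new_high = F.interpolate(new_high, size=size, mode='bilinear', align_corners=False)
        return low, new_high

# Toy usage: stacking several such blocks yields the cascade.
x = torch.randn(1, 16, 32, 32)
block = ECNBlock(low_ch=16, high_ch=8)
low, high = block(x)
low, high = block(low, high)
```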
Award ID(s):
1824198
NSF-PAR ID:
10108142
Author(s) / Creator(s):
Date Published:
10-13 Dec. 2018
Journal Name:
2018 IEEE International Conference on Big Data
Page Range / eLocation ID:
4640 to 4647
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. INTRODUCTION A brainwide, synaptic-resolution connectivity map - a connectome - is essential for understanding how the brain generates behavior. However, because of technological constraints, imaging entire brains with electron microscopy (EM) and reconstructing circuits from such datasets have been challenging. To date, complete connectomes have been mapped for only three organisms, each with several hundred brain neurons: the nematode C. elegans, the larva of the sea squirt Ciona intestinalis, and the larva of the marine annelid Platynereis dumerilii. Synapse-resolution circuit diagrams of larger brains, such as those of insects, fish, and mammals, have been approached by considering select subregions in isolation. However, neural computations span spatially dispersed but interconnected brain regions, and understanding any one computation requires the complete brain connectome with all its inputs and outputs.

RATIONALE We therefore generated a connectome of an entire brain of a small insect, the larva of the fruit fly Drosophila melanogaster. This animal displays a rich behavioral repertoire, including learning, value computation, and action selection, and shares homologous brain structures with adult Drosophila and larger insects. Powerful genetic tools are available for selective manipulation or recording of individual neuron types. In this tractable model system, hypotheses about the functional roles of specific neurons and circuit motifs revealed by the connectome can therefore be readily tested.

RESULTS The complete synaptic-resolution connectome of the Drosophila larval brain comprises 3016 neurons and 548,000 synapses. We performed a detailed analysis of the brain circuit architecture, including connection and neuron types, network hubs, and circuit motifs. Most of the brain's in-out hubs (73%) were postsynaptic to the learning center or presynaptic to the dopaminergic neurons that drive learning. We used graph spectral embedding to hierarchically cluster neurons based on synaptic connectivity into 93 neuron types, which were internally consistent based on other features, such as morphology and function. We developed an algorithm to track brainwide signal propagation across polysynaptic pathways and analyzed feedforward (from sensory to output) and feedback pathways, multisensory integration, and cross-hemisphere interactions. We found extensive multisensory integration throughout the brain and multiple interconnected pathways of varying depths from sensory neurons to output neurons, forming a distributed processing network. The brain had a highly recurrent architecture, with 41% of neurons receiving long-range recurrent input. However, recurrence was not evenly distributed and was especially high in areas implicated in learning and action selection. Dopaminergic neurons that drive learning are amongst the most recurrent neurons in the brain. Many contralateral neurons, which projected across brain hemispheres, were in-out hubs and synapsed onto each other, facilitating extensive interhemispheric communication. We also analyzed interactions between the brain and nerve cord. We found that descending neurons targeted a small fraction of premotor elements that could play important roles in switching between locomotor states. A subset of descending neurons targeted low-order post-sensory interneurons, likely modulating sensory processing.
CONCLUSION The complete brain connectome of the Drosophila larva will be a lasting reference study, providing a basis for a multitude of theoretical and experimental studies of brain function. The approach and computational tools generated in this study will facilitate the analysis of future connectomes. Although the details of brain organization differ across the animal kingdom, many circuit architectures are conserved. As more brain connectomes of other organisms are mapped in the future, comparisons between them will reveal both common, and therefore potentially optimal, circuit architectures and the idiosyncratic ones that underlie behavioral differences between organisms. Some of the architectural features observed in the Drosophila larval brain, including multilayer shortcuts and prominent nested recurrent loops, are found in state-of-the-art artificial neural networks, where they can compensate for a lack of network depth and support arbitrary, task-dependent computations. Such features could therefore increase the brain's computational capacity, overcoming physiological constraints on the number of neurons. Future analysis of similarities and differences between brains and artificial neural networks may help in understanding brain computational principles and perhaps inspire new machine learning architectures.

The connectome of the Drosophila larval brain: the morphologies of all brain neurons, reconstructed from a synapse-resolution EM volume, and the synaptic connectivity matrix of an entire brain. This connectivity information was used to hierarchically cluster all brain neurons into 93 cell types, which were internally consistent based on morphology and known function.
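The clustering step named above (graph spectral embedding of the connectivity matrix followed by hierarchical clustering) can be sketched as follows. This is a hedged illustration, not the study's actual pipeline: the symmetrization, embedding dimension, and linkage method are assumptions.

```python
# Illustrative sketch: cluster neurons by synaptic connectivity via
# spectral embedding + agglomerative clustering (assumed parameters).
import numpy as np
from sklearn.manifold import SpectralEmbedding
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_neurons(synapse_counts: np.ndarray, n_types: int = 93):
    # Treat the (neurons x neurons) synapse-count matrix as a weighted
    # graph; symmetrize so the affinity is undirected.
    affinity = synapse_counts + synapse_counts.T
    # Low-dimensional spectral embedding of the connectivity graph.
    coords = SpectralEmbedding(n_components=24,
                               affinity='precomputed').fit_transform(affinity)
    # Hierarchical (Ward) clustering of the embedded neurons, cut to
    # a fixed number of putative cell types.
    tree = linkage(coords, method='ward')
    return fcluster(tree, t=n_types, criterion='maxclust')
```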
  2. Recently, region proposal networks (RPNs) have been combined with the Siamese network for tracking, showing excellent accuracy with high efficiency. Nevertheless, previously proposed one-stage Siamese-RPN trackers degenerate in the presence of similar distractors and large scale variation. Addressing these issues, we propose a multi-stage tracking framework, Siamese Cascaded RPN (C-RPN), which consists of a sequence of RPNs cascaded from deep high-level to shallow low-level layers in a Siamese network. Compared to previous solutions, C-RPN has several advantages: (1) Each RPN is trained using the outputs of the RPN in the previous stage. This process simulates hard negative sampling, resulting in more balanced training samples. Consequently, the RPNs become sequentially more discriminative in distinguishing difficult background (i.e., similar distractors). (2) Multi-level features are fully leveraged through a novel feature transfer block (FTB) for each RPN, further improving the discriminability of C-RPN using both high-level semantic and low-level spatial information. (3) With multiple steps of regression, C-RPN progressively refines the location and shape of the target: each RPN adjusts the anchor boxes produced by the previous stage, which makes localization more accurate. C-RPN is trained end-to-end with a multi-task loss function. At inference, C-RPN is deployed as is, without any temporal adaptation, for real-time tracking. In extensive experiments on OTB-2013, OTB-2015, VOT-2016, VOT-2017, LaSOT and TrackingNet, C-RPN consistently achieves state-of-the-art results and runs in real time.
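Advantage (3) above is essentially iterated bounding-box regression. A minimal NumPy sketch of that idea follows, under assumptions: the standard (dx, dy, dw, dh) delta parameterization, and placeholder stage predictors standing in for the RPN heads.

```python
# Illustrative multi-stage anchor refinement (not the C-RPN implementation).
import numpy as np

def refine_boxes(boxes: np.ndarray, deltas: np.ndarray) -> np.ndarray:
    """boxes: (N, 4) as (cx, cy, w, h); deltas: (N, 4) predicted offsets."""
    cx = boxes[:, 0] + deltas[:, 0] * boxes[:, 2]
    cy = boxes[:, 1] + deltas[:, 1] * boxes[:, 3]
    w = boxes[:, 2] * np.exp(deltas[:, 2])
    h = boxes[:, 3] * np.exp(deltas[:, 3])
    return np.stack([cx, cy, w, h], axis=1)

def cascade(anchors: np.ndarray, stage_predictors) -> np.ndarray:
    # Each stage's head predicts deltas for the boxes refined so far,
    # so localization is progressively sharpened across stages.
    boxes = anchors
    for predict in stage_predictors:
        boxes = refine_boxes(boxes, predict(boxes))
    return boxes
```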
  3. RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. On the other hand, in the presence of RRAM device variations and lower precision, the mapping of DNNs to RRAM-based IMC suffers from severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The non-ideal output from the RRAM macro, due to device and circuit non-idealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework for the training of DNNs to support the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization (weights and activations), pruning, RRAM IMC-aware training, and employs ensemble learning through different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multi-level RRAM cells (MLC) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively. 
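The compensation described above reduces to one integer operation per output: add the SRAM macro's precise partial sum, left-shifted by the programmable amount, to the noisy RRAM result. A toy NumPy sketch, where the bit widths and values are illustrative assumptions:

```python
# Toy model of shift-scaled digital compensation (illustrative values).
import numpy as np

def hybrid_imc_output(rram_mac: np.ndarray, sram_mac: np.ndarray, shift: int):
    # The programmable shifter scales the precise SRAM compensation term
    # relative to the non-ideal RRAM macro output.
    return rram_mac + np.left_shift(sram_mac, shift)

rram_mac = np.array([100, -52, 7], dtype=np.int32)   # non-ideal analog MACs
sram_mac = np.array([1, -2, 0], dtype=np.int32)      # digital corrections
print(hybrid_imc_output(rram_mac, sram_mac, shift=2))  # -> [104 -60 7]
```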
  4. The state-of-the-art in fully-supervised temporal action localization from untrimmed videos has achieved impressive results. Yet, results remain unsatisfactory for weakly-supervised temporal action localization, where only video-level action labels are given, without timestamp annotations of when the actions occur. The main reason is that weakly-supervised networks focus only on the highly discriminative frames, while there are ambiguous frames in both the background and action classes. The ambiguous frames in the background class are very similar to real actions, so they may be treated as target actions and result in false positives. On the other hand, the ambiguous frames in the action class, which possibly contain action instances, are prone to become false negatives under weakly-supervised networks, resulting in coarse localization. To solve these problems, we introduce a novel weakly-supervised Action Completeness Modeling with Background Aware Networks (ACM-BANets). Our Background Aware Network (BANet) contains a weight-sharing two-branch architecture, with an action-guided Background-aware Temporal Attention Module (B-TAM) and an asymmetrical training strategy, to suppress both highly discriminative and ambiguous background frames and remove the false positives. Our action completeness modeling contains multiple BANets, which are forced to discover different but complementary action instances so as to completely localize the action instances in both highly discriminative and ambiguous action frames. In the i-th iteration, the i-th BANet discovers the discriminative features, which are then erased from the feature map. The partially-erased feature map is fed into the (i+1)-th BANet of the next iteration, forcing this BANet to discover discriminative features different from those of the i-th BANet. Evaluated on two challenging untrimmed video datasets, THUMOS14 and ActivityNet1.3, our approach outperforms all current weakly-supervised methods for temporal action localization.
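The erase-and-rediscover loop is the core mechanism here. A hedged PyTorch sketch follows; the attention-module interface and the erasing threshold are illustrative assumptions, not the BANet internals.

```python
# Illustrative iterative feature erasing across attention branches.
import torch

def iterative_erasing(features: torch.Tensor, banets, tau: float = 0.6):
    """features: (B, C, T) temporal feature map; banets: modules assumed
    to return per-frame attention scores in [0, 1] with shape (B, T)."""
    proposals = []
    for banet in banets:
        attn = banet(features)                    # per-frame attention
        proposals.append(attn)
        # Erase the most discriminative frames so the next branch must
        # discover different, complementary action evidence.
        keep = (attn < tau).float().unsqueeze(1)  # (B, 1, T) mask
        features = features * keep                # partially-erased map
    return proposals
```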
  5. In many applications, data is easy to acquire but expensive and time-consuming to label; prominent examples include medical imaging and NLP. This disparity has only grown in recent years as our ability to collect data improves. Under these constraints, it makes sense to select only the most informative instances from the unlabeled pool and request an oracle (e.g., a human expert) to provide labels for those samples. The goal of active learning is to infer the informativeness of unlabeled samples so as to minimize the number of requests to the oracle. Here, we formulate active learning as an open-set recognition problem. In this paradigm, only some of the inputs belong to known classes; the classifier must identify the rest as unknown. More specifically, we leverage variational neural networks (VNNs), which produce high-confidence (i.e., low-entropy) predictions only for inputs that closely resemble the training data. We use the inverse of this confidence measure to select the samples that the oracle should label. Intuitively, unlabeled samples that the VNN is uncertain about contain features that the network has not been exposed to; thus, they are more informative for future training. We carried out an extensive evaluation of our novel, probabilistic formulation of active learning, achieving state-of-the-art results on MNIST, CIFAR-10, CIFAR-100, and FashionMNIST. Additionally, unlike current active learning methods, our algorithm can learn even in the presence of out-of-distribution outliers. As our experiments show, when the unlabeled pool consists of a mixture of samples from multiple datasets, our approach can automatically distinguish between samples from seen vs. unseen datasets. Overall, our results show that high-quality uncertainty measures are key for pool-based active learning.
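The acquisition rule implied above can be sketched directly: score unlabeled samples by predictive entropy (the inverse of the network's confidence) and send the top-k to the oracle. In this NumPy sketch, the probability source is an assumption standing in for the VNN's output.

```python
# Illustrative entropy-based acquisition for pool-based active learning.
import numpy as np

def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """probs: (num_samples, num_classes) predicted class probabilities."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def select_for_labeling(probs: np.ndarray, k: int) -> np.ndarray:
    # Highest-entropy (least confident) samples are assumed most
    # informative, so they are routed to the oracle first.
    scores = predictive_entropy(probs)
    return np.argsort(scores)[-k:][::-1]
```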