

Title: High-speed CMOS-free purely spintronic asynchronous recurrent neural network
The exceptional capabilities of the human brain provide inspiration for artificially intelligent hardware that mimics both the function and the structure of neurobiology. In particular, the recent development of nanodevices with biomimetic characteristics promises to enable neuromorphic architectures with exceptional computational efficiency. In this work, we propose biomimetic neurons composed of domain wall-magnetic tunnel junctions that can be integrated into the first trainable CMOS-free recurrent neural network with biomimetic components. This paper demonstrates the computational effectiveness of this system on benchmark tasks and its superior computational efficiency relative to alternative approaches for recurrent neural networks.
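
As a rough intuition for the device-level behavior described above, the sketch below treats the domain wall position as an integrate-and-fire state variable: input current drives the wall along its track, and the tunnel junction switches (the neuron "fires") when the wall passes beneath it. The dynamics, parameter values, and the dw_mtj_neuron helper are illustrative assumptions, not the device model from the paper.

```python
import numpy as np

def dw_mtj_neuron(currents, mobility=1.0, leak=0.02, fire_pos=1.0, dt=1.0):
    """Behavioral sketch (not the paper's model) of a domain wall-MTJ neuron.

    The domain wall position acts as the integrate-and-fire state:
    input current pushes the wall along the track, a restoring term
    leaks it back, and the MTJ switches (the neuron 'fires') when the
    wall reaches the junction region.
    """
    pos = 0.0
    spikes = []
    for i in currents:
        pos += dt * (mobility * i - leak * pos)  # drift minus leak
        if pos >= fire_pos:                      # wall under the MTJ: fire
            spikes.append(1)
            pos = 0.0                            # wall reset to track start
        else:
            spikes.append(0)
    return np.array(spikes)

# Constant drive current; the neuron fires periodically.
print(dw_mtj_neuron(np.full(20, 0.15)))
```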
Award ID(s):
1910800
NSF-PAR ID:
10465770
Author(s) / Creator(s):
Date Published:
Journal Name:
APL Machine Learning
Volume:
1
Issue:
1
ISSN:
2770-9019
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Recurrent neural networks (RNNs) are often used to model circuits in the brain and can solve a variety of difficult computational problems requiring memory, error correction, or selection (Hopfield, 1982; Maass et al., 2002; Maass, 2011). However, fully connected RNNs contrast structurally with their biological counterparts, which are extremely sparse (about 0.1%). Motivated by the neocortex, where neural connectivity is constrained by physical distance along cortical sheets and other synaptic wiring costs, we introduce locality masked RNNs (LM-RNNs) that use task-agnostic predetermined graphs with sparsity as low as 4%. We study LM-RNNs in a multitask learning setting relevant to cognitive systems neuroscience with a commonly used set of tasks, 20-Cog-tasks (Yang et al., 2019). We show through reductio ad absurdum that 20-Cog-tasks can be solved by a small pool of separated autapses that we can mechanistically analyze and understand. Thus, these tasks fall short of the goal of inducing complex recurrent dynamics and modular structure in RNNs. We next contribute a new cognitive multitask battery, Mod-Cog, consisting of up to 132 tasks that expands the number and complexity of the 20-Cog-tasks roughly seven-fold. Importantly, while autapses can solve the simple 20-Cog-tasks, the expanded task set requires richer neural architectures and continuous attractor dynamics. On these tasks, we show that LM-RNNs with an optimal sparsity train faster and with better data efficiency than fully connected networks. A minimal sketch of the masking idea follows.
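
    In the sketch below, recurrent weights are multiplied elementwise by a fixed, task-agnostic locality mask at every forward step, so non-local connections are never learned. The 1-D ring distance, the 4% density, and the LMRNNCell class are illustrative assumptions, not the paper's exact construction.

```python
import torch

def locality_mask(n, density=0.04):
    """Sparse, task-agnostic recurrent mask: each unit connects only to
    its nearest neighbors on a 1-D ring (a stand-in for physical
    distance on a cortical sheet)."""
    k = max(1, int(density * n) // 2)          # neighbors per side
    idx = torch.arange(n)
    dist = (idx[None, :] - idx[:, None]).abs()
    dist = torch.minimum(dist, n - dist)       # wrap-around distance
    return (dist <= k).float()

class LMRNNCell(torch.nn.Module):
    def __init__(self, n_in, n_hid, density=0.04):
        super().__init__()
        self.W_in = torch.nn.Linear(n_in, n_hid)
        self.W_rec = torch.nn.Parameter(torch.randn(n_hid, n_hid) / n_hid**0.5)
        self.register_buffer("mask", locality_mask(n_hid, density))

    def forward(self, x, h):
        # The mask zeroes all non-local recurrent weights at every step,
        # so gradients never resurrect pruned connections.
        return torch.tanh(self.W_in(x) + h @ (self.W_rec * self.mask).T)

cell = LMRNNCell(10, 256)
h = torch.zeros(1, 256)
h = cell(torch.randn(1, 10), h)
```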

     
  2.
    Channel state information (CSI) plays a vital role in scheduling and capacity-approaching transmission optimization of massive MIMO communication systems. In frequency division duplex (FDD) MIMO systems, forward-link CSI reconstruction at the transmitter relies on CSI feedback from receiving nodes and must carefully weigh the tradeoff between reconstruction accuracy and feedback bandwidth. Recent applications of recurrent neural networks (RNNs) have demonstrated promising results in massive MIMO CSI feedback compression. However, the computation and memory costs associated with RNN deep learning remain high. In this work, we exploit channel temporal coherence to improve learning accuracy and feedback efficiency. Leveraging a Markovian model, we develop a deep convolutional neural network (CNN)-based framework, called MarkovNet, that efficiently encodes CSI feedback for improved accuracy and efficiency. We explore important physical insights, including spherical normalization of input data and deep learning network optimizations, in feedback compression. We demonstrate that MarkovNet provides a substantial performance improvement and computational complexity reduction over the RNN-based work. We evaluate MarkovNet's performance under different MIMO configurations and for a range of feedback intervals and rates. CSI recovery with MarkovNet outperforms RNN-based CSI estimation at only a fraction of the computational cost.
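
    As a concrete illustration of two of these ingredients, the sketch below applies spherical normalization (scaling each CSI sample to unit power, with the scale fed back separately) and a Markovian differential encoding that transmits only frame-to-frame changes. The function names and toy 32x32 CSI matrices are assumptions for illustration, not MarkovNet's actual pipeline.

```python
import numpy as np

def spherical_normalize(H):
    """Project a complex CSI matrix onto the unit sphere so the CNN sees
    inputs of comparable scale; the scalar norm is fed back separately
    so the decoder can undo the scaling."""
    power = np.linalg.norm(H)
    return H / power, power

def temporal_differential(H_seq):
    """Markovian trick: after the first frame, encode only the change
    from the previous frame, which is small when the channel is
    temporally coherent and therefore cheaper to compress."""
    return [H_seq[0]] + [H_seq[t] - H_seq[t - 1] for t in range(1, len(H_seq))]

# Toy 32x32 complex CSI matrices over 4 feedback intervals.
rng = np.random.default_rng(0)
seq = [rng.standard_normal((32, 32)) + 1j * rng.standard_normal((32, 32))
       for _ in range(4)]
normed, scales = zip(*(spherical_normalize(H) for H in temporal_differential(seq)))
```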
  3. Abstract

    Accurate and timely storm surge forecasts are essential during tropical cyclone events in order to assess the magnitude and location of the impacts. Coupled ocean‐atmosphere dynamical models provide accurate measures of storm surge but remain too computationally expensive to run for real‐time forecasting purposes. Therefore, it is common to utilize a parametric vortex model, implemented within a hydrodynamic model, which decreases computational time at the expense of forecast accuracy. Recently, data‐driven neural networks have been implemented as an alternative due to their combined efficiency and high accuracy. This work seeks to examine how an artificial neural network (ANN) can be used to make accurate storm surge predictions, and explores the added value of using a recurrent neural network (RNN). In particular, it is concerned with determining the parameters needed to successfully implement a neural network model for the Mid‐Atlantic Bight region. The neural network models were trained with modeled data resulting from coupling the Hybrid Weather Research and Forecasting cyclone model (HWCM) and the Advanced Circulation Model. An ensemble of synthetic, but physically plausible, cyclones was simulated using the HWCM and used as input for the hydrodynamic model. Tests of the ANN were conducted to investigate the optimal lead‐time configuration of the input data and the neural network architecture needed to minimize storm surge forecast errors. Results highlight the accuracy of the ANN in forecasting moderate storm surge levels, while indicating a deficiency in capturing the magnitude of the peak values, which is improved in the implementation of the RNN.
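
    To make the lead-time configuration concrete, the sketch below builds lagged forcing features with a fixed forecast lead and fits a small feed-forward ANN. The variable choices, lag counts, and toy data are illustrative assumptions rather than the study's actual setup.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged_inputs(wind, pressure, surge, n_lags=4, lead=2):
    """Stack the last n_lags observations of the forcing variables as
    one input vector, with the surge `lead` steps ahead as the target."""
    X, y = [], []
    for t in range(n_lags, len(surge) - lead):
        X.append(np.concatenate([wind[t - n_lags:t], pressure[t - n_lags:t]]))
        y.append(surge[t + lead])
    return np.array(X), np.array(y)

# Toy hourly series standing in for HWCM-driven hydrodynamic output.
rng = np.random.default_rng(1)
t = np.arange(200)
wind = 10 + 5 * np.sin(t / 12) + rng.normal(0, 1, 200)
pressure = 1010 - 0.5 * wind + rng.normal(0, 1, 200)
surge = 0.05 * wind + rng.normal(0, 0.1, 200)

X, y = make_lagged_inputs(wind, pressure, surge)
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=500).fit(X, y)
```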

     
  4. Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency. We introduce a simple sequence model inspired by control systems that generalizes these approaches while addressing their shortcomings. The Linear State-Space Layer (LSSL) maps a sequence u ↦ y by simply simulating a linear continuous-time state-space representation ẋ = Ax + Bu, y = Cx + Du. Theoretically, we show that LSSL models are closely related to the three aforementioned families of models and inherit their strengths. For example, they generalize convolutions to continuous-time, explain common RNN heuristics, and share features of NDEs such as time-scale adaptation. We then incorporate and generalize recent theory on continuous-time memorization to introduce a trainable subset of structured matrices A that endow LSSLs with long-range memory. Empirically, stacking LSSL layers into a simple deep neural network obtains state-of-the-art results across time series benchmarks for long dependencies in sequential image classification, real-world healthcare regression tasks, and speech. On a difficult speech classification task with length-16000 sequences, LSSL outperforms prior approaches by 24 accuracy points, and even outperforms baselines that use hand-crafted features on 100x shorter sequences.
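
    A minimal numerical sketch of this mapping: discretizing ẋ = Ax + Bu with a bilinear (Tustin) rule yields a linear recurrence that can be stepped like an RNN. The random stable A below stands in for the structured memory matrices the paper trains; everything here is illustrative.

```python
import numpy as np

def discretize(A, B, dt):
    """Bilinear (Tustin) discretization of dx/dt = Ax + Bu, giving the
    linear recurrence x_k = Ad x_{k-1} + Bd u_k."""
    I = np.eye(A.shape[0])
    inv = np.linalg.inv(I - dt / 2 * A)
    return inv @ (I + dt / 2 * A), inv @ (dt * B)

def lssl_scan(u, A, B, C, D, dt=1.0):
    """Run one linear state-space layer over a scalar sequence u -> y."""
    Ad, Bd = discretize(A, B, dt)
    x = np.zeros(A.shape[0])
    y = np.empty_like(u)
    for k, uk in enumerate(u):
        x = Ad @ x + Bd.ravel() * uk   # state update
        y[k] = C @ x + D * uk          # readout
    return y

n = 4
rng = np.random.default_rng(0)
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))  # stable stand-in for a structured A
B = rng.standard_normal((n, 1))
C = rng.standard_normal(n)
print(lssl_scan(np.sin(np.linspace(0, 6, 50)), A, B, C, 0.0)[:5])
```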
  5. Abstract

    We build on the existing biased competition view to argue that attention is an emergent property of neural computations within and across hierarchically embedded and structurally connected cortical pathways. Critically then, one must ask, what is attention emergent from? Within this framework, developmental changes in the quality of sensory input and feedforward‐feedback information flow shape the emergence and efficiency of attention. Several gradients of developing structural and functional cortical architecture across the caudal‐to‐rostral axis provide the substrate for attention to emerge. Neural activity within visual areas depends on neuronal density, receptive field size, tuning properties of neurons, and the location of and competition between features and objects in the visual field. These visual cortical properties highlight the information processing bottleneck attention needs to resolve. Recurrent feedforward and feedback connections convey sensory information through a series of steps at each level of the cortical hierarchy, integrating sensory information across the entire extent of the cortical hierarchy and linking sensory processing to higher‐order brain regions. Higher‐order regions concurrently provide input conveying behavioral context and goals. Thus, attention reflects the output of a series of complex biased competition neural computations that occur within and across hierarchically embedded cortical regions. Cortical development proceeds along the caudal‐to‐rostral axis, mirroring the flow in sensory information from caudal to rostral regions, and visual processing continues to develop into childhood. Examining both typical and atypical development will offer critical mechanistic insight not otherwise available in the adult stable state.

    This article is categorized under:

    Psychology > Attention

     