Title: Neuromorphic Swarm on RRAM Compute-in-Memory Processor for Solving QUBO Problem
Combinatorial optimization problems are prevalent in engineering and industry. Some are NP-hard and are therefore difficult to solve on edge devices with limited power and computing resources. The Quadratic Unconstrained Binary Optimization (QUBO) problem is a valuable emerging model that can formulate numerous combinatorial problems, such as Max-Cut, the traveling salesman problem, and graph coloring. The QUBO model is also compatible with two emerging computation models, quantum computing and neuromorphic computing, which can potentially boost the speed and energy efficiency of solving combinatorial problems. In this work, we design a neuromorphic QUBO solver composed of a swarm of spiking neural networks (SNNs) that conduct a population-based meta-heuristic search for solutions. The proposed model achieves about a 20x to 40x speedup on large QUBO problems, measured in time steps, compared to a traditional neural network solver. As a codesign, we evaluate the neuromorphic swarm solver on a 40 nm, 25 mW Resistive RAM (RRAM) Compute-in-Memory (CIM) SoC with a 2.25 MB RRAM-based accelerator and an embedded Cortex M3 core. The collaborative SNN swarm can fully exploit the CIM accelerator's specialization in matrix and vector multiplications. Compared to previous works, this algorithm-hardware synergized solver exhibits advantageous speed and energy efficiency for edge devices.
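To make the QUBO formulation mentioned in the abstract concrete, here is a minimal sketch that encodes Max-Cut as a QUBO and solves a tiny instance by exhaustive search. It does not reproduce the paper's spiking swarm solver. The function names (`maxcut_qubo`, `qubo_energy`, `brute_force`) and the 4-node example graph are illustrative assumptions, and the exhaustive search is only a stand-in that is practical for very small problems.

```python
from itertools import product


def maxcut_qubo(n, edges):
    """Build a symmetric matrix Q such that minimizing x^T Q x maximizes the cut.

    Hypothetical helper for illustration; x_i in {0, 1} marks the side of node i.
    """
    Q = [[0] * n for _ in range(n)]
    for i, j in edges:
        # An edge (i, j) is cut when x_i != x_j, i.e. x_i + x_j - 2*x_i*x_j == 1.
        # The signs are flipped so that a larger cut gives a lower energy.
        Q[i][i] -= 1
        Q[j][j] -= 1
        Q[i][j] += 1
        Q[j][i] += 1
    return Q


def qubo_energy(Q, x):
    """Evaluate x^T Q x as a matrix-vector product followed by a dot product."""
    Qx = [sum(q * xj for q, xj in zip(row, x)) for row in Q]
    return sum(xi * v for xi, v in zip(x, Qx))


def brute_force(Q):
    """Try every binary assignment; feasible only for very small n."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


# Assumed example graph: a 4-node cycle, whose maximum cut contains all 4 edges.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Q = maxcut_qubo(4, edges)
best = brute_force(Q)
cut_size = -qubo_energy(Q, best)
```

The energy evaluation is dominated by the matrix-vector product `Q x`, which is the kind of operation the abstract says the CIM accelerator is specialized for. A heuristic solver would replace `brute_force` with a search that evaluates far fewer assignments.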
Award ID(s):
2153440
PAR ID:
10497848
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-2348-1
Page Range / eLocation ID:
1 to 6
Subject(s) / Keyword(s):
neuromorphic computing spiking neural networks combinatorial optimization QUBO Resistive RAM Compute-in-Memory
Format(s):
Medium: X
Location:
San Francisco, CA, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. The Quadratic Unconstrained Binary Optimization (QUBO) problem has become an attractive and valuable optimization problem formulation because a variety of combinatorial optimization problems, such as Graph/Number Partition, Max-Cut, SAT, Vertex Coloring, and TSP, can easily be expressed in it. Some of these problems are NP-hard and widely applied in industry and scientific research. Meanwhile, QUBO has been found to be compatible with two emerging computing paradigms, neuromorphic computing and quantum computing, which have tremendous potential to speed up future optimization solvers. In this paper, we propose a novel neuromorphic computing paradigm that employs multiple collaborative spiking neural networks (SNNs) to solve QUBO problems. Each SNN conducts a local stochastic gradient descent search and periodically shares the global best solutions to perform a meta-heuristic search for optima. We simulate our model and compare it to a single-SNN solver and a multi-SNN solver without collaboration. In tests on benchmark problems, the proposed method is shown to be more efficient and effective in searching for QUBO optima. Specifically, it exhibits a 10x speedup over the multi-SNN solver without collaboration and a 15x to 20x speedup over the single-SNN solver.
  2. Realizing increasingly complex artificial intelligence (AI) functionalities directly on edge devices calls for unprecedented energy efficiency of edge hardware. Compute-in-memory (CIM) based on resistive random-access memory (RRAM) [1] promises to meet such demand by storing AI model weights in dense, analogue and non-volatile RRAM devices, and by performing AI computation directly within RRAM, thus eliminating power-hungry data movement between separate compute and memory [2–5]. Although recent studies have demonstrated in-memory matrix-vector multiplication on fully integrated RRAM-CIM hardware [6–17], it remains a goal for a RRAM-CIM chip to simultaneously deliver high energy efficiency, versatility to support diverse models and software-comparable accuracy. Although efficiency, versatility and accuracy are all indispensable for broad adoption of the technology, the inter-related trade-offs among them cannot be addressed by isolated improvements on any single abstraction level of the design. Here, by co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM, a RRAM-based CIM chip that simultaneously delivers versatility in reconfiguring CIM cores for diverse model architectures, energy efficiency that is two-times better than previous state-of-the-art RRAM-CIM chips across various computational bit-precisions, and inference accuracy comparable to software models quantized to four-bit weights across various AI tasks, including accuracy of 99.0 percent on MNIST [18] and 85.7 percent on CIFAR-10 [19] image classification, 84.7-percent accuracy on Google speech command recognition [20], and a 70-percent reduction in image-reconstruction error on a Bayesian image-recovery task.
  3. Spiking neural networks (SNNs) offer a promising biologically plausible computing model and lend themselves to ultra-low-power event-driven processing on neuromorphic processors. Compared with conventional artificial neural networks, SNNs are well suited for processing complex spatiotemporal data. Despite its significance, dataflow optimization of spiking neural accelerator architectures has not been extensively studied. Recognizing the need for efficient processing of complex spatiotemporal data while considering the all-or-none nature of spiking activities, we propose holistic reconfigurable dataflow optimization for systolic array acceleration of spiking convolutional networks (S-CNNs). A novel scheme is introduced for parallel acceleration of computation across multiple time points, which further allows for systemic optimization of variable tiling for large performance and efficiency gains. We show how variable tiling, in particular the positioning of the temporal dimension, can be targeted to optimize data movement, throughput, and energy efficiency. Furthermore, we explore joint layer-dependent dataflow and accelerator hardware optimization to further boost performance and energy efficiency. To support systemic design space exploration, we develop an SNN dataflow simulator capable of analyzing the throughput and energy dissipation of systolic array accelerators for any targeted S-CNN while considering the inherent spatiotemporal characteristics of spiking neural computation. The proposed techniques deliver orders-of-magnitude improvements in throughput, energy efficiency, and delay-energy product for accelerating deep AlexNet and VGG-16 SNNs.
  4. Driven by the expansion of the Internet of Things (IoT) and Cyber-Physical Systems (CPS), there is an increasing demand to process streams of temporal data on embedded devices with limited energy and power resources. Among all potential solutions, neuromorphic computing with spiking neural networks (SNNs), which mimic the behavior of the brain, has recently been placed at the forefront. Encoding information into sparse and distributed spike events enables low-power implementations, and the complex spatiotemporal dynamics of synapses and neurons enable SNNs to detect temporal patterns. However, most existing hardware SNN implementations use simplified neuron and synapse models that ignore synapse dynamics, which are critical for temporal pattern detection and other applications that require temporal dynamics. To adopt a more realistic synapse model in a neuromorphic platform, its significant computation overhead must be addressed. In this work, we propose an FPGA-based SNN with biologically realistic neurons and synapses for temporal information processing. An encoding scheme to convert continuous real-valued information into sparse spike events is presented. The event-driven implementation of the synapse dynamics model and its hardware design, which is optimized to exploit the sparsity, are also presented. Finally, we train the SNN on various temporal pattern-learning tasks and evaluate its performance and efficiency compared to rate-based models and artificial neural networks on different embedded platforms. Experiments show that our work can achieve a 10X speedup and a 196X gain in energy efficiency compared with a GPU.
  5. The spatiotemporal nature of neuronal behavior in spiking neural networks (SNNs) makes SNNs promising for edge applications that require high energy efficiency. To realize SNNs in hardware, spintronic neuron implementations can bring advantages of scalability and energy efficiency. Domain wall (DW)-based magnetic tunnel junction (MTJ) devices are well suited for probabilistic neural networks given their intrinsic integrate-and-fire behavior with tunable stochasticity. Here, we present a scaled DW-MTJ neuron with voltage-dependent firing probability. The measured behavior was used to simulate an SNN that attains accuracy during learning comparable to that of an equivalent, but more complicated, multi-weight DW-MTJ device. The validation accuracy during training was also shown to be comparable to that of an ideal leaky integrate-and-fire device. However, during inference, the binary DW-MTJ neuron outperformed the other devices after Gaussian noise was introduced to the Fashion-MNIST classification task. This work shows that DW-MTJ devices can be used to construct noise-resilient networks suitable for neuromorphic computing on the edge.