Title: Dynamic Reliability Management in Neuromorphic Computing
Neuromorphic computing systems execute machine learning tasks designed with spiking neural networks. These systems are embracing non-volatile memory to implement high-density and low-energy synaptic storage. Elevated voltages and currents needed to operate non-volatile memories cause aging of CMOS-based transistors in each neuron and synapse circuit in the hardware, drifting the transistor’s parameters from their nominal values. If these circuits are used continuously for too long, the parameter drifts cannot be reversed, resulting in permanent degradation of circuit performance over time, eventually leading to hardware faults. Aggressive device scaling increases power density and temperature, which further accelerates the aging, challenging the reliable operation of neuromorphic systems. Existing reliability-oriented techniques periodically de-stress all neuron and synapse circuits in the hardware at fixed intervals, assuming worst-case operating conditions, without actually tracking their aging at run-time. To de-stress these circuits, normal operation must be interrupted, which introduces latency in spike generation and propagation, impacting the inter-spike interval and hence, performance (e.g., accuracy). We observe that in contrast to long-term aging, which permanently damages the hardware, short-term aging in scaled CMOS transistors is mostly due to bias temperature instability. The latter is heavily workload-dependent and, more importantly, partially reversible. We propose a new architectural technique to mitigate the aging-related reliability problems in neuromorphic systems by designing an intelligent run-time manager (NCRTM), which dynamically de-stresses neuron and synapse circuits in response to the short-term aging in their CMOS transistors during the execution of machine learning workloads, with the objective of meeting a reliability target. 
NCRTM de-stresses these circuits only when it is absolutely necessary to do so, otherwise reducing the performance impact by scheduling de-stress operations off the critical path. We evaluate NCRTM with state-of-the-art machine learning workloads on neuromorphic hardware. Our results demonstrate that NCRTM significantly improves the reliability of neuromorphic hardware, with marginal impact on performance.
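The contrast drawn in the abstract, between de-stressing at fixed worst-case intervals and de-stressing in response to tracked short-term aging, can be illustrated with a toy sketch. It is not the NCRTM algorithm: the `Circuit` class, the linear stress model, the `rate`, `recovery`, `interval`, and `threshold` values, and both policy functions are hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    """Toy neuron/synapse circuit with one short-term aging value (hypothetical)."""
    aging: float = 0.0        # normalized parameter drift, assumed unitless
    destress_count: int = 0   # number of times normal operation was interrupted

    def stress(self, activity: float, rate: float = 0.01) -> None:
        # Assumption: drift grows linearly with workload activity.
        self.aging += rate * activity

    def destress(self, recovery: float = 0.8) -> None:
        # Assumption: de-stressing reverses a fixed fraction of the drift,
        # reflecting that short-term aging is only partially reversible.
        self.aging *= 1.0 - recovery
        self.destress_count += 1


def run_fixed_interval(activities, interval):
    """De-stress every `interval` steps, regardless of the actual aging."""
    circuit = Circuit()
    for step, activity in enumerate(activities, start=1):
        circuit.stress(activity)
        if step % interval == 0:
            circuit.destress()
    return circuit


def run_threshold(activities, threshold):
    """De-stress only when the tracked aging reaches `threshold`."""
    circuit = Circuit()
    for activity in activities:
        circuit.stress(activity)
        if circuit.aging >= threshold:
            circuit.destress()
    return circuit


light = [0.1] * 100   # lightly loaded circuit
heavy = [1.0] * 100   # heavily loaded circuit

fixed_light = run_fixed_interval(light, interval=10)
adaptive_light = run_threshold(light, threshold=0.5)
adaptive_heavy = run_threshold(heavy, threshold=0.5)

print("fixed, light workload:   ", fixed_light.destress_count, "de-stress operations")
print("adaptive, light workload:", adaptive_light.destress_count, "de-stress operations")
print("adaptive, heavy workload:", adaptive_heavy.destress_count, "de-stress operations")
```

In this toy model the fixed-interval policy interrupts the lightly loaded circuit even though its drift never approaches the limit, while the threshold policy interrupts only the heavily loaded one. That is the workload dependence the abstract appeals to.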
Award ID(s):
1942697
NSF-PAR ID:
10317041
Journal Name:
ACM Journal on Emerging Technologies in Computing Systems
Volume:
17
Issue:
4
ISSN:
1550-4832
Sponsoring Org:
National Science Foundation
More Like this
  1. An emerging use-case of machine learning (ML) is to train a model on a high-performance system and deploy the trained model on energy-constrained embedded systems. Neuromorphic hardware platforms, which operate on principles of the biological brain, can significantly lower the energy overhead of a machine learning inference task, making these platforms an attractive solution for embedded ML systems. We present a design-technology tradeoff analysis to implement such inference tasks on the processing elements (PEs) of a Non-Volatile Memory (NVM)-based neuromorphic hardware. Through detailed circuit-level simulations at scaled process technology nodes, we show the negative impact of technology scaling on the information-processing latency, which impacts the quality-of-service (QoS) of an embedded ML system. At a finer granularity, the latency inside a PE depends on 1) the delay introduced by parasitic components on its current paths, and 2) the varying delay to sense different resistance states of its NVM cells. Based on these two observations, we make the following three contributions. First, on the technology front, we propose an optimization scheme where the NVM resistance state that takes the longest time to sense is set on current paths having the least delay, and vice versa, reducing the average PE latency, which improves the QoS. Second, on the architecture front, we introduce isolation transistors within each PE to partition it into regions that can be individually power-gated, reducing both latency and energy. Finally, on the system-software front, we propose a mechanism to leverage the proposed technological and architectural enhancements when implementing a machine-learning inference task on neuromorphic PEs of the hardware. 
     Evaluations with a recent neuromorphic hardware architecture show that our proposed design-technology co-optimization approach improves both performance and energy efficiency of machine-learning inference tasks without incurring high cost-per-bit.
  2. Driven by the expansion of the Internet of Things (IoT) and Cyber-Physical Systems (CPS), there is an increasing demand to process streams of temporal data on embedded devices with limited energy and power resources. Among the potential solutions, neuromorphic computing with spiking neural networks (SNNs), which mimic the behavior of the brain, has recently been placed at the forefront. Encoding information into sparse and distributed spike events enables low-power implementations, and the complex spatiotemporal dynamics of synapses and neurons enable SNNs to detect temporal patterns. However, most existing hardware SNN implementations use simplified neuron and synapse models that ignore synapse dynamics, which are critical for temporal pattern detection and other applications that require temporal dynamics. To adopt a more realistic synapse model in a neuromorphic platform, its significant computation overhead must be addressed. In this work, we propose an FPGA-based SNN with biologically realistic neuron and synapse models for temporal information processing. An encoding scheme to convert continuous real-valued information into sparse spike events is presented. The event-driven implementation of the synapse dynamics model and its hardware design, which is optimized to exploit the sparsity, are also presented. Finally, we train the SNN on various temporal pattern-learning tasks and evaluate its performance and efficiency compared to rate-based models and artificial neural networks on different embedded platforms. Experiments show that our work can achieve a 10X speedup and a 196X gain in energy efficiency compared with a GPU.
  3. Spiking neural networks exploit spatiotemporal processing, spiking sparsity, and high interneuron bandwidth to maximize the energy efficiency of neuromorphic computing. While conventional silicon-based technology can be used in this context, the resulting neuron-synapse circuits require multiple transistors and complicated layouts that limit integration density. Here, we demonstrate unprecedented electrostatic control of dual-gated Gaussian heterojunction transistors for simplified spiking neuron implementation. These devices employ wafer-scale mixed-dimensional van der Waals heterojunctions consisting of chemical vapor deposited monolayer molybdenum disulfide and solution-processed semiconducting single-walled carbon nanotubes to emulate the spike-generating ion channels in biological neurons. Circuits based on these dual-gated Gaussian devices enable a variety of biological spiking responses including phasic spiking, delayed spiking, and tonic bursting. In addition to neuromorphic computing, the tunable Gaussian response has significant implications for a range of other applications including telecommunications, computer vision, and natural language processing.
  4. The explosion of “big data” applications imposes severe challenges of speed and scalability on traditional computer systems. As the performance of traditional Von Neumann machines is greatly hindered by the increasing performance gap between CPU and memory (known as the “memory wall”), neuromorphic computing systems have gained considerable attention. The biologically plausible computing paradigm carries out computing by emulating the charging/discharging process of neuron and synapse potential. The unique spike-domain information encoding enables asynchronous, event-driven computation and communication, and hence has the potential for very high energy efficiency. This survey reviews computing models and hardware platforms of existing neuromorphic computing systems. Neuron and synapse models are first introduced, followed by a discussion of how they affect hardware design. Case studies of several representative hardware platforms, including their architecture and software ecosystems, are further presented. Lastly, we present several future research directions.
  5. Artificial synaptic devices made from natural biomaterials capable of emulating functions of biological synapses, such as synaptic plasticity and memory functions, are desirable for the construction of brain-inspired neuromorphic computing systems. The metal/dielectric/metal device structure is analogous to the pre-synapse/synaptic cleft/post-synapse structure of the biological neuron, while using natural biomaterials promotes ecologically friendly, sustainable, renewable, and low-cost electronic devices. In this work, artificial synaptic devices made from honey mixed with carbon nanotubes, honey-carbon nanotube (CNT) memristors, were investigated. The devices emulated spike-timing-dependent plasticity, with synaptic weight as high as 500%, and demonstrated a paired-pulse facilitation gain of 800%, which is the largest value ever reported. 206-level long-term potentiation (LTP) and long-term depression (LTD) were demonstrated. A conduction model was applied to explain the filament formation and dissolution in the honey-CNT film, and compared to the LTP/LTD mechanism in biological synapses. In addition, the short-term and long-term memory behaviors were clearly demonstrated by an array of 5 × 5 devices. This study shows that the honey-CNT memristor is a promising artificial synaptic device technology for applications in sustainable neuromorphic computing.