Due to the separate memory and computation units in the traditional von Neumann architecture, massive data transfer dominates the overall computing system's power and latency, a problem known as the 'memory wall'. With ever-increasing deep-learning model sizes and computing complexity, this has become the bottleneck for state-of-the-art AI computing systems. To address this challenge, in-memory computing (IMC)-based neural network accelerators have been widely investigated to support AI computation within memory. However, most of these works focus only on inference; on-device training and continual learning remain largely unexplored. In this work, for the first time, we introduce on-device continual learning with an STT-assisted-SOT (SAS) magnetic random access memory (MRAM)-based IMC system. On the hardware side, we fabricated a SAS-MRAM device prototype with four magnetic tunnel junctions (MTJs, each 100 nm × 50 nm) sharing a common heavy-metal layer, achieving significantly improved memory-write and area efficiency compared to traditional SOT-MRAM. We then designed fully digital IMC circuits around the SAS-MRAM to support both neural network inference and on-device learning. To enable efficient on-device continual learning on new-task data, we present an 8-bit integer (INT8) continual learning algorithm that uses the bit-serial digital in-memory convolution operations of our SAS-MRAM IMC to train a small parallel reprogramming network (Rep-Net) while freezing the major backbone model. Extensive studies are presented based on the fabricated SAS-MRAM device prototype, cross-layer device-circuit benchmarking and simulation, and evaluation of the on-device continual learning system.
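To make the training recipe concrete, here is a minimal software sketch of the INT8 continual-learning flow the abstract describes: the backbone is frozen and only a small parallel Rep-Net branch is trained on new-task data, with symmetric INT8 fake quantization standing in for the bit-serial SAS-MRAM in-memory convolutions. All class names, layer sizes, and the quantization scheme are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of INT8 continual learning with a frozen backbone
# and a small parallel reprogramming network (Rep-Net). The bit-serial
# SAS-MRAM in-memory convolution is stood in for by ordinary conv2d.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant_int8(x: torch.Tensor) -> torch.Tensor:
    """Symmetric INT8 fake quantization with a straight-through estimator."""
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = (x / scale).round().clamp(-128, 127) * scale
    return x + (q - x).detach()

class RepNet(nn.Module):
    """Small parallel branch trained on new-task data."""
    def __init__(self, ch: int, n_classes: int):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)
        self.head = nn.Linear(ch, n_classes)

    def forward(self, feat):
        z = F.relu(self.conv(fake_quant_int8(feat)))
        return self.head(z.mean(dim=(2, 3)))   # global average pool

backbone = nn.Sequential(                      # stands in for the major model
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
for p in backbone.parameters():                # freeze the backbone
    p.requires_grad_(False)

repnet = RepNet(16, n_classes=10)
opt = torch.optim.SGD(repnet.parameters(), lr=1e-2)

x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
for _ in range(10):                            # adapt only the Rep-Net
    loss = F.cross_entropy(repnet(backbone(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()
```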
Rapid learning with phase-change memory-based in-memory computing through learning-to-learn
Abstract There is a growing demand for low-power, autonomously learning artificial intelligence (AI) systems that can be applied at the edge and rapidly adapt to the specific situation at the deployment site. However, current AI models struggle in such scenarios, often requiring extensive fine-tuning, computational resources, and data. In contrast, humans can effortlessly adjust to new tasks by transferring knowledge from related ones. The concept of learning-to-learn (L2L) mimics this process and enables AI models to rapidly adapt with only modest computational effort and data. In-memory computing neuromorphic hardware (NMHW) is inspired by the brain's operating principles and mimics its physical co-location of memory and compute. In this work, we pair L2L with in-memory computing NMHW based on phase-change memory devices to build efficient AI models that can rapidly adapt to new tasks. We demonstrate the versatility of our approach in two scenarios: a convolutional neural network performing image classification and a biologically inspired spiking neural network generating motor commands for a real robotic arm. Both models learn rapidly with few parameter updates. Deployed on the NMHW, they perform on par with their software equivalents. Moreover, meta-training of these models can be performed in software with high precision, alleviating the need for accurate hardware models.
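As a concrete illustration of learning-to-learn, the sketch below meta-trains a small network so that only a few inner-loop updates adapt it to a new task. It uses the Reptile meta-learning rule on toy sine-wave regression as a stand-in for the paper's L2L procedure; the architecture and hyper-parameters are illustrative assumptions, and the phase-change-memory hardware is not modeled.

```python
# Hypothetical learning-to-learn sketch: meta-training in software (at high
# precision) so a handful of gradient steps suffice for a new task.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Sequential(nn.Linear(1, 40), nn.Tanh(), nn.Linear(40, 1))

def sample_task():
    """A random sine wave: each task = (amplitude, phase)."""
    amp, phase = 1 + 4 * torch.rand(1), 3.14 * torch.rand(1)
    x = 10 * torch.rand(20, 1) - 5
    return x, amp * torch.sin(x + phase)

meta_lr, inner_lr = 0.1, 0.01
for step in range(1000):                       # outer (meta) loop
    x, y = sample_task()
    fast = copy.deepcopy(net)                  # task-specific copy
    opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for _ in range(5):                         # inner loop: few updates
        loss = F.mse_loss(fast(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                      # Reptile meta-update
        for p, q in zip(net.parameters(), fast.parameters()):
            p += meta_lr * (q - p)

# After meta-training, a few gradient steps adapt net to a new sine task.
```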
- Award ID(s): 2318152
- PAR ID: 10570628
- Publisher / Repository: Nature Portfolio
- Date Published:
- Journal Name: Nature Communications
- Volume: 16
- Issue: 1
- ISSN: 2041-1723
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The remarkable progress in artificial intelligence (AI) has ushered in a new era characterized by models with billions of parameters, enabling extraordinary capabilities across diverse domains. However, these achievements come at a significant cost in terms of memory and energy consumption. The growing demand for computational resources raises grand challenges for the sustainable development of energy-efficient AI systems. This paper delves into the paradigm of memory-based computing as a promising avenue to address these challenges. By capitalizing on the inherent characteristics of memory and its efficient utilization, memory-based computing offers a novel approach to enhance AI performance while reducing the associated energy costs. We systematically analyze the multifaceted aspects of this paradigm, highlighting its potential benefits and outlining the challenges it poses. Through an exploration of various methodologies, architectures, and algorithms, we elucidate the intricate interplay between memory utilization, computational efficiency, and AI model complexity. Furthermore, we review the evolving area of hardware and software solutions for memory-based computing, underscoring their implications for achieving energy-efficient AI systems. As AI continues its rapid evolution, the key challenges and insights presented in this paper serve as a foundational guide for researchers navigating the complex field of memory-based computing and its pivotal role in shaping the future of energy-efficient AI.
-
In this paper, we propose a novel control architecture, inspired by neuroscience, for adaptive control of continuous-time systems. The objective is to design control architectures and algorithms that can learn and adapt quickly to changes, even abrupt ones. In the setting of standard neural network (NN)-based adaptive control, the proposed architecture augments the NN with an external working memory. The learning system stores, in its external working memory, recently observed feature vectors from the hidden layer of the NN that are relevant, and forgets older irrelevant values. It retrieves relevant vectors from the working memory to modify the final control signal generated by the controller. The use of external working memory improves the context, inducing the learning system to search in a particular direction. This directed learning allows the learning system to quickly find a good approximation of the unknown function even after abrupt changes. We consider two classes of controllers to illustrate our ideas: (i) a model-reference NN adaptive controller for linear systems with matched uncertainty, and (ii) a backstepping NN controller for strict-feedback systems. Through extensive simulations and specific metrics, we show that memory augmentation significantly improves learning even when the system undergoes sudden changes. Importantly, we also provide evidence for the proposed mechanism by which this specific memory augmentation improves learning. A minimal sketch of the mechanism appears below.
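The working-memory mechanism this abstract describes might look as follows in code: recently observed feature vectors are written to a bounded external memory (oldest entries are forgotten), the entries most similar to the current feature are retrieved, and the retrieval modifies the final control signal. The similarity measure, memory capacity, and blending rule here are assumptions for illustration, not the paper's design.

```python
# Hypothetical sketch of a memory-augmented adaptive controller: store
# recent hidden-layer features, retrieve the most relevant ones, and use
# them to modify the control signal.
import numpy as np

class WorkingMemory:
    def __init__(self, capacity: int, dim: int):
        self.capacity, self.buf = capacity, np.zeros((0, dim))

    def write(self, feat: np.ndarray):
        """Keep the newest features; forget the oldest (irrelevant) ones."""
        self.buf = np.vstack([self.buf, feat])[-self.capacity:]

    def read(self, query: np.ndarray, k: int = 5) -> np.ndarray:
        """Return the mean of the k stored features most similar to query."""
        if len(self.buf) == 0:
            return np.zeros_like(query)
        sims = self.buf @ query / (
            np.linalg.norm(self.buf, axis=1) * np.linalg.norm(query) + 1e-8)
        return self.buf[np.argsort(sims)[-k:]].mean(axis=0)

def control(query_feat, W_out, memory, beta=0.5):
    """Blend the NN's feature with a memory-retrieved correction."""
    retrieved = memory.read(query_feat)
    memory.write(query_feat[None, :])
    return W_out @ (query_feat + beta * retrieved)

mem = WorkingMemory(capacity=100, dim=8)
W = np.random.randn(2, 8)            # output layer of the adaptive NN
u = control(np.random.randn(8), W, mem)
```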
-
Channel decoders are key computing modules in wired and wireless communication systems. Recently, neural network (NN)-based decoders have shown promising error-correcting performance because of their end-to-end learning capability. However, compared with traditional approaches, the emerging neural belief propagation (NBP) solution suffers from higher storage and computational complexity, limiting its hardware performance. To address this challenge and develop a channel decoder that achieves high decoding performance and hardware efficiency simultaneously, in this paper we take a first step towards exploring SRAM-based in-memory computing for efficient NBP channel decoding. We first analyze the unique sparsity pattern in NBP processing and then propose an efficient, fully Digital Sparse In-Memory Matrix-vector Multiplier (DSPIMM) computing platform. Extensive experiments demonstrate that our proposed DSPIMM achieves significantly higher energy efficiency and throughput than state-of-the-art counterparts.
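The core idea, exploiting the fixed sparsity pattern of the NBP weight matrix (inherited from the code's Tanner graph) inside a bit-serial digital in-memory array, can be sketched functionally as follows. Only nonzero entries are stored and accumulated, and each activation is consumed one bit-plane at a time. This is a functional illustration only; the bit-serial formulation and all names are assumptions, and DSPIMM's actual circuits are not modeled.

```python
# Hypothetical functional model of a digital sparse in-memory
# matrix-vector multiply with INT8 operands and bit-serial activations.
import numpy as np

def sparse_bitserial_mvm(rows, cols, vals_int8, x_int8, n_rows):
    """y = A @ x with A in COO (nonzeros only) and INT8 operands.
    Activations are consumed one bit-plane at a time, as a bit-serial
    in-memory array would, then partial sums are shifted and added."""
    y = np.zeros(n_rows, dtype=np.int32)
    x_u = x_int8.astype(np.int32) & 0xFF            # two's-complement bits
    for bit in range(8):
        x_bit = (x_u >> bit) & 1                    # one bit-plane of x
        partial = np.zeros(n_rows, dtype=np.int32)
        for r, c, v in zip(rows, cols, vals_int8):  # nonzeros only
            partial[r] += int(v) * int(x_bit[c])
        sign = -1 if bit == 7 else 1                # MSB carries the sign
        y += sign * (partial << bit)
    return y

# Tiny check against a dense reference
A = np.array([[2, 0, -1], [0, 3, 0]], dtype=np.int8)
r, c = np.nonzero(A)
x = np.array([5, -7, 4], dtype=np.int8)
assert np.array_equal(sparse_bitserial_mvm(r, c, A[r, c], x, 2),
                      A @ x.astype(np.int32))
```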
-
Abstract Realizing increasingly complex artificial intelligence (AI) functionalities directly on edge devices calls for unprecedented energy efficiency of edge hardware. Compute-in-memory (CIM) based on resistive random-access memory (RRAM) [1] promises to meet such demand by storing AI model weights in dense, analogue and non-volatile RRAM devices, and by performing AI computation directly within RRAM, thus eliminating power-hungry data movement between separate compute and memory [2-5]. Although recent studies have demonstrated in-memory matrix-vector multiplication on fully integrated RRAM-CIM hardware [6-17], it remains a goal for an RRAM-CIM chip to simultaneously deliver high energy efficiency, versatility to support diverse models and software-comparable accuracy. Although efficiency, versatility and accuracy are all indispensable for broad adoption of the technology, the inter-related trade-offs among them cannot be addressed by isolated improvements at any single abstraction level of the design. Here, by co-optimizing across all hierarchies of the design, from algorithms and architecture to circuits and devices, we present NeuRRAM, an RRAM-based CIM chip that simultaneously delivers versatility in reconfiguring CIM cores for diverse model architectures, energy efficiency twice that of previous state-of-the-art RRAM-CIM chips across various computational bit-precisions, and inference accuracy comparable to software models quantized to four-bit weights across various AI tasks, including accuracy of 99.0 percent on MNIST [18] and 85.7 percent on CIFAR-10 [19] image classification, 84.7 percent on Google speech command recognition [20], and a 70-percent reduction in image-reconstruction error on a Bayesian image-recovery task.
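A functional model of the analog matrix-vector multiply at the heart of such RRAM-CIM chips is sketched below: signed weights map to differential conductance pairs, input activations drive the rows as voltages, column currents sum by Kirchhoff's current law, and an ADC digitizes the result. The conductance range, noise level, and ADC resolution are illustrative assumptions, not NeuRRAM's measured parameters.

```python
# Hypothetical functional model of an analog RRAM crossbar MVM.
import numpy as np

rng = np.random.default_rng(0)

def program_weights(W, g_max=1e-4):
    """Map signed weights onto two non-negative conductance arrays."""
    s = np.abs(W).max()
    G_pos = np.clip(W, 0, None) / s * g_max
    G_neg = np.clip(-W, 0, None) / s * g_max
    return G_pos, G_neg, s / g_max

def crossbar_mvm(G_pos, G_neg, scale, v_in, adc_bits=8, sigma=0.02):
    """One analog MVM: currents sum on the columns, then get digitized."""
    i_out = (G_pos - G_neg).T @ v_in                        # Kirchhoff sum
    i_out *= 1 + sigma * rng.standard_normal(i_out.shape)   # device noise
    lsb = np.abs(i_out).max() / (2 ** (adc_bits - 1))       # ADC step
    return np.round(i_out / lsb) * lsb * scale

W = rng.standard_normal((64, 16))          # 64 inputs -> 16 outputs
x = rng.standard_normal(64)
Gp, Gn, scale = program_weights(W)
y = crossbar_mvm(Gp, Gn, scale, x)
print(np.abs(y - W.T @ x).max())           # small relative to y's scale
```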