Title: Memory-Based Computing for Energy-Efficient AI: Grand Challenges
The remarkable progress in artificial intelligence (AI) has ushered in a new era characterized by models with billions of parameters, enabling extraordinary capabilities across diverse domains. However, these achievements come at a significant cost in terms of memory and energy consumption. The growing demand for computational resources raises grand challenges for the sustainable development of energy-efficient AI systems. This paper examines the paradigm of memory-based computing as a promising avenue to address these challenges. By capitalizing on the inherent characteristics of memory and its efficient utilization, memory-based computing offers a novel approach to enhance AI performance while reducing the associated energy costs. Our paper systematically analyzes the multifaceted aspects of this paradigm, highlighting its potential benefits and outlining the challenges it poses. Through an exploration of various methodologies, architectures, and algorithms, we elucidate the interplay between memory utilization, computational efficiency, and AI model complexity. Furthermore, we review the evolving area of hardware and software solutions for memory-based computing, underscoring their implications for achieving energy-efficient AI systems. As AI continues its rapid evolution, the key challenges and insights identified in this paper serve as a foundational guide for researchers striving to navigate the complex field of memory-based computing and its pivotal role in shaping the future of energy-efficient AI.
Award ID(s):
2153440
PAR ID:
10497851
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-2599-7
Page Range / eLocation ID:
1 to 8
Subject(s) / Keyword(s):
Compute-in-memory; energy-efficiency; deep learning; large language models
Format(s):
Medium: X
Location:
Dubai, United Arab Emirates
Sponsoring Org:
National Science Foundation
More Like this
  1. There is a growing demand for low-power, autonomously learning artificial intelligence (AI) systems that can be applied at the edge and rapidly adapt to the specific situation at the deployment site. However, current AI models struggle in such scenarios, often requiring extensive fine-tuning, computational resources, and data. In contrast, humans can effortlessly adjust to new tasks by transferring knowledge from related ones. The concept of learning-to-learn (L2L) mimics this process and enables AI models to rapidly adapt with little computational effort and data. In-memory computing neuromorphic hardware (NMHW) is inspired by the brain’s operating principles and mimics its physical co-location of memory and compute. In this work, we pair L2L with in-memory computing NMHW based on phase-change memory devices to build efficient AI models that can rapidly adapt to new tasks. We demonstrate the versatility of our approach in two scenarios: a convolutional neural network performing image classification and a biologically inspired spiking neural network generating motor commands for a real robotic arm. Both models rapidly learn with few parameter updates. Deployed on the NMHW, they perform on par with their software equivalents. Moreover, meta-training of these models can be performed in software with high precision, alleviating the need for accurate hardware models.
  2. Traditional computing is based on an engineering approach that imposes logical states and a computational model upon a physical substrate. Physical or material computing, on the other hand, harnesses and exploits the inherent, naturally occurring properties of a physical substrate to perform a computation. To do so, reservoir computing is often used as a computing paradigm. In this review and position paper, we take stock of where the field currently stands, delineate opportunities and challenges for future research, and outline steps on how to get material reservoir computing to the next level. The findings are relevant for beyond-CMOS and beyond-von-Neumann architectures, ML, AI, neuromorphic systems, and computing with novel devices and circuits.
  3. Channel decoders are key computing modules in wired/wireless communication systems. Recently, neural network (NN)-based decoders have shown promising error-correcting performance because of their end-to-end learning capability. However, compared with traditional approaches, the emerging neural belief propagation (NBP) solution suffers from higher storage and computational complexity, limiting its hardware performance. To address this challenge and develop a channel decoder that achieves high decoding performance and hardware performance simultaneously, in this paper we take a first step towards exploring SRAM-based in-memory computing for efficient NBP channel decoding. We first analyze the unique sparsity pattern in the NBP processing, and then propose an efficient and fully Digital Sparse In-Memory Matrix vector Multiplier (DSPIMM) computing platform. Extensive experiments demonstrate that our proposed DSPIMM achieves significantly higher energy efficiency and throughput than the state-of-the-art counterparts.
  4. The current state of neuromorphic computing broadly encompasses domain-specific computing architectures designed to accelerate machine learning (ML) and artificial intelligence (AI) algorithms. As is well known, AI/ML algorithms are limited by memory bandwidth. Novel computing architectures are necessary to overcome this limitation. There are several options that are currently under investigation using both mature and emerging memory technologies. For example, mature memory technologies such as high-bandwidth memories (HBMs) are integrated with logic units on the same die to bring memory closer to the computing units. There are also research efforts where in-memory computing architectures have been implemented using DRAMs or flash memory technologies. However, DRAMs suffer from scaling limitations, while flash memory devices suffer from endurance issues. Additionally, despite this significant progress, the massive energy that neuromorphic processors consume while meeting the training and inference performance required by AI/ML algorithms in future applications still needs to be addressed. On the AI/ML algorithm side, there are several pending issues such as life-long learning, explainability, context-based decision making, multimodal association of data, adaptation to address personalized responses, and resiliency. These unresolved challenges in AI/ML have led researchers to explore brain-inspired computing architectures and paradigms.
  5. Resistive Random Access Memory (RRAM) devices hold promise as a key enabling technology for energy-efficient, in-memory, and brain-inspired computing paradigms, with the potential to significantly enhance high-performance computing applications. However, the widespread adoption of RRAM technology in high-performance computing applications is hindered by non-ideal device metrics and various reliability challenges. RRAM devices are reported to exhibit critical device-to-device (D2D) and cycle-to-cycle (C2C) variability. In this paper, we investigate the D2D and C2C variability of Tantalum Oxide RRAM devices and explore potentiation, depression, and endurance dynamics under varying operating conditions. Our ultimate goal is to address performance and reliability issues associated with oxide-based RRAM device technology and facilitate its broader implementation in future computing applications.