Title: MnM: A Fast and Efficient Min/Max Searching in MRAM
In-Memory Computing (IMC) technology has been considered a promising approach to solving the well-known memory-wall challenge for data-intensive applications. In this paper, we are the first to propose MnM, a novel IMC system with innovative architecture/circuit designs for fast and efficient Min/Max searching computation in emerging Spin-Orbit Torque Magnetic Random Access Memory (SOT-MRAM). Our proposed SOT-MRAM based in-memory logic circuits are specially optimized to perform the parallel, one-cycle XNOR logic operations that are heavily used in the Min/Max searching-in-memory algorithm. Our novel in-memory XNOR circuit also has an overhead of just two transistors per row, whereas most prior methodologies typically use multiple sense amplifiers or complex CMOS logic gates. We also design all other peripheral circuits required to implement complete Min/Max searching-in-MRAM computation. Our comprehensive cross-layer experiments on Dijkstra's algorithm and sorting algorithms on real-world datasets show that MnM could achieve significant performance improvement over CPUs, GPUs, and other competing IMC platforms based on RRAM/MRAM/DRAM.
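The abstract does not spell out the Min/Max searching-in-memory algorithm, so the following is only a software sketch of a generic bit-serial search built on XNOR comparison, not the authors' circuit-level method. It assumes unsigned integers of a fixed bit width; the function name `find_extreme` and its parameters are hypothetical. Each loop iteration stands in for one parallel XNOR over a bit column of all stored rows.

```python
def find_extreme(values, width, find_max=False):
    """Bit-serial min/max search over unsigned integers (illustrative sketch).

    Scans bit columns from MSB to LSB. In each column, every value's bit is
    XNOR-ed with a reference bit (0 when searching for the minimum, 1 for the
    maximum). Rows that match stay candidates, unless no candidate matches,
    in which case the candidate set is left unchanged.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v < 0 or (v >> width) != 0 for v in values):
        raise ValueError("every value must fit in `width` unsigned bits")

    ref = 1 if find_max else 0          # reference bit for the XNOR
    candidates = [True] * len(values)   # one flag per stored row

    for bit in range(width - 1, -1, -1):
        # XNOR(bit_of_value, ref) == 1 exactly when the two bits are equal.
        xnor = [1 - (((v >> bit) & 1) ^ ref) for v in values]
        hits = [c and x == 1 for c, x in zip(candidates, xnor)]
        if any(hits):
            # At least one candidate has the preferred bit: drop the others.
            candidates = hits

    # All surviving rows hold the same value; return the first one.
    return values[candidates.index(True)]


if __name__ == "__main__":
    data = [13, 7, 9, 7, 250]
    print(find_extreme(data, width=8))                 # minimum
    print(find_extreme(data, width=8, find_max=True))  # maximum
```

In this sketch the number of column steps depends only on the bit width, not on how many values are stored, which is the property that makes a parallel, one-cycle XNOR per column attractive.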
Award ID(s):
2003749 2144751
PAR ID:
10348288
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Great Lakes Symposium on VLSI
Page Range / eLocation ID:
39 to 44
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Due to the separate memory and computation units in the traditional von Neumann architecture, massive data transfer dominates the overall computing system’s power and latency, known as the ‘Memory-Wall’ issue. Especially with ever-increasing deep learning-based AI model size and computing complexity, it becomes the bottleneck for state-of-the-art AI computing systems. To address this challenge, In-Memory Computing (IMC) based Neural Network accelerators have been widely investigated to support AI computing within memory. However, most of those works focus only on inference. On-device training and continual learning have not yet been well explored. In this work, for the first time, we introduce on-device continual learning with an STT-assisted-SOT (SAS) Magnetic Random Access Memory (MRAM) based IMC system. On the hardware side, we have fabricated a SAS-MRAM device prototype with 4 Magnetic Tunnel Junctions (MTJ, each at 100nm × 50nm) sharing a common heavy metal layer, achieving significantly improved memory writing and area efficiency compared to traditional SOT-MRAM. Next, we designed fully digital IMC circuits with our SAS-MRAM to support both neural network inference and on-device learning. To enable efficient on-device continual learning for new task data, we present an 8-bit integer (INT8) based continual learning algorithm that utilizes our SAS-MRAM IMC-supported bit-serial digital in-memory convolution operations to train a small parallel reprogramming Network (Rep-Net) while freezing the major backbone model. Extensive studies are presented based on our fabricated SAS-MRAM device prototype, cross-layer device-circuit benchmarking and simulation, as well as the on-device continual learning system evaluation.
  2. Recently, in-DRAM computing has become a promising technique to address the notorious ‘memory-wall’ issue for big data processing. In this work, for the first time, we propose a novel ‘Min/Max-in-memory’ algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for the minimum and maximum of bulk data stored in DRAM as unsigned and signed integers, fixed-point numbers, and floating-point numbers. We then develop a new processing-in-DRAM architecture, called Max-PIM, that supports complete bit-wise Boolean logic and beyond. Unlike prior works, Max-PIM is optimized with a fast one-cycle XNOR logic-in-DRAM operation and in-memory data transpose, which are heavily used and key to accelerating the proposed Min/Max-in-memory algorithm efficiently. Extensive experiments utilizing Max-PIM in big data sorting and graph processing applications show that it could achieve speedups of ~50X and ~1000X over GPU and CPU, while consuming only 10% and 1% of the energy, respectively. Moreover, compared with recent representative in-DRAM computing platforms, i.e., Ambit [1] and DRISA [2], our design could achieve a speedup of ~3X to 10X.
  3. In this paper, we explore the potential of leveraging a spin-based in-memory computing platform as an accelerator for Binary Convolutional Neural Networks (BCNN). Such a platform can implement the dominant convolution computation based on the presented Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) array. The proposed array architecture could simultaneously work as non-volatile memory and as reconfigurable in-memory logic (AND, OR) without adding logic circuits to the memory chip as in conventional logic-in-memory designs. The computed logic output could also be simply read out like a normal MRAM bit-cell using the shared memory peripheral circuits. We employ this intrinsic in-memory computing architecture to efficiently process data within memory, greatly reducing the power-hungry, long-distance data communication of state-of-the-art BCNN hardware.
  4. All-electrically driven magnetization switching attracts much attention in next-generation spintronic memory and logic devices, particularly in magnetic random-access memory (MRAM) based on the spin–orbit torque (SOT), i.e. SOT-MRAM, due to its advantages of low power consumption, fast write/read speed, and improved endurance. For conventional SOT-driven switching of a magnet with perpendicular magnetic anisotropy, an external assisting magnetic field is necessary to break the inversion symmetry of the magnet, which not only induces additional power consumption but also makes the circuit more complicated. Over the last decade, significant effort has been devoted to field-free magnetization manipulation by using SOT. In this review, we introduce the basic concepts of SOT. After that, we mainly focus on several approaches to realize the field-free deterministic SOT switching of the perpendicular magnet. The mechanisms mainly include mirror symmetry breaking, chiral symmetry breaking, exchange bias, and interlayer exchange coupling. Furthermore, we show the recent progress in the study of SOT with unconventional origin and symmetry. The final section is devoted to the industrial-level approach for potential applications of field-free SOT switching in SOT-MRAM technology.
  5. In this paper, we propose a novel Spin-Transfer Torque Magnetic Random-Access Memory (STT-MRAM) array design that could simultaneously work as non-volatile memory and implement a reconfigurable in-memory logic operation without adding logic circuits to the memory chip. The computed output could be simply read out like a typical MRAM bit-cell through the modified peripheral circuit. Such intrinsic in-memory computation can be used to process data locally and transfer the “cooked” data to the primary processing unit (i.e. CPU or GPU) for complex computation with high precision requirements. It greatly reduces power-hungry, long-distance data communication, and further leads to extreme parallelism within memory. In this work, we further propose an in-memory edge extraction algorithm as a case study to demonstrate the efficiency of the in-memory preprocessing methodology. The simulation results show that our edge extraction method reduces data communication by as much as 8x for grayscale images, thus greatly reducing system energy consumption. Meanwhile, the F-measure result shows only ∼10% degradation compared to conventional edge detection operators such as Prewitt, Sobel and Roberts.