

Title: IMCS2: Novel Device-to-Architecture Co-Design for Low-Power In-Memory Computing Platform Using Coterminous Spin Switch
Abstract--The spin switch (SS) is a promising spintronic device that offers compactness, low power, non-volatility, and input-output isolation by leveraging the giant spin Hall effect, spin transfer torque, and dipolar coupling. In this paper, we propose a novel device-to-architecture co-design for an in-memory computing platform using the coterminous SS (IMCS2), which can simultaneously work as non-volatile memory and reconfigurable in-memory logic (AND/NAND, OR/NOR, and XOR/XNOR) without add-on logic circuits to the memory chip. The computed logic output can simply be read out like a normal magnetic random access memory bit cell using the shared memory peripheral circuits. Such intrinsic in-memory logic can be used to process data within memory, greatly reducing the power-hungry, long-distance data communication of the conventional von Neumann computing system. The IMCS2-based in-memory bulk bitwise Boolean vector operation shows ~9x energy saving and ~3x speedup compared with a DRAM-based in-memory computing platform. We further employ in-memory multiplication to evaluate the performance of the proposed platform for vector-vector multiplication with different vector sizes.
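As an illustration of how such bulk bitwise operations compose into the vector-vector multiplication evaluated above, the following minimal Python sketch decomposes an unsigned dot product into bit-plane ANDs followed by shift-and-add. The function names and the bit-serial schedule are illustrative assumptions, not the paper's device-level implementation.

```python
# Functional sketch: a dot product built only from bulk bitwise AND plus
# shift-and-add, the style of operation an in-memory platform like IMCS2
# could evaluate row-by-row inside the array.  Names are illustrative.

def bitwise_and_rows(row_a, row_b):
    """Bulk bitwise AND of two equal-length bit vectors (lists of 0/1)."""
    return [a & b for a, b in zip(row_a, row_b)]

def bit_serial_dot(xs, ys, bits=8):
    """Dot product of two unsigned integer vectors using only AND + shift-add."""
    assert len(xs) == len(ys)
    acc = 0
    for i in range(bits):                            # bit positions of xs
        x_bits = [(x >> i) & 1 for x in xs]          # one bit-plane of xs
        for j in range(bits):                        # bit positions of ys
            y_bits = [(y >> j) & 1 for y in ys]      # one bit-plane of ys
            partial = bitwise_and_rows(x_bits, y_bits)   # "in-memory" AND
            acc += sum(partial) << (i + j)           # shift-add of the popcount
    return acc

if __name__ == "__main__":
    a, b = [3, 5, 7], [2, 4, 6]
    assert bit_serial_dot(a, b) == sum(x * y for x, y in zip(a, b))
    print(bit_serial_dot(a, b))  # 68
```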
Award ID(s):
1740126
NSF-PAR ID:
10059778
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE Transactions on Magnetics
ISSN:
0018-9464
Page Range / eLocation ID:
1 to 14
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we explore the potential of leveraging a spin-based in-memory computing platform as an accelerator for Binary Convolutional Neural Networks (BCNNs). Such a platform can implement the dominant convolution computation using the presented Spin-Orbit Torque Magnetic Random Access Memory (SOT-MRAM) array. The proposed array architecture can simultaneously work as non-volatile memory and reconfigurable in-memory logic (AND, OR) without add-on logic circuits to the memory chip, as in conventional logic-in-memory designs. The computed logic output can also simply be read out like a normal MRAM bit cell using the shared memory peripheral circuits. We employ this intrinsic in-memory computing architecture to efficiently process data within memory, greatly reducing the power-hungry, long-distance data communication found in state-of-the-art BCNN hardware.
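For intuition on why AND-style in-memory logic suffices for the dominant BCNN computation, the sketch below models a binary convolution as a bulk bitwise AND followed by a popcount; the {0, 1} encoding and function names are assumptions for illustration, not the paper's exact SOT-MRAM mapping.

```python
# Illustrative sketch: with activations and weights binarized to {0, 1}, a
# convolution's multiply-accumulate reduces to a bulk bitwise AND followed by
# a popcount, the kind of row-wise Boolean operation an in-memory array can do.

def binary_dot(acts, weights):
    """Binary multiply-accumulate: popcount of the elementwise AND."""
    assert len(acts) == len(weights)
    return sum(a & w for a, w in zip(acts, weights))

def binary_conv1d(acts, kernel):
    """1-D binary convolution built from the AND/popcount primitive."""
    k = len(kernel)
    return [binary_dot(acts[i:i + k], kernel) for i in range(len(acts) - k + 1)]

if __name__ == "__main__":
    acts   = [1, 0, 1, 1, 0, 1]
    kernel = [1, 1, 0]
    print(binary_conv1d(acts, kernel))  # [1, 1, 2, 1]
```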
  2. Bayesian networks (BNs) find widespread application in many real-world probabilistic problems, including diagnostics, forecasting, and computer vision. The basic computing primitive for BNs is a stochastic bit (s-bit) generator that can control the probability of obtaining '1' in a binary bit-stream. While silicon-based complementary metal-oxide-semiconductor (CMOS) technology can be used for hardware implementation of BNs, the lack of inherent stochasticity makes it area- and energy-inefficient. On the other hand, memristors and spintronic devices offer inherent stochasticity but lack computing ability beyond simple vector-matrix multiplication due to their two-terminal nature, and they rely on extensive CMOS peripherals for BN implementation, which limits area and energy efficiency. Here, we circumvent these challenges by introducing a hardware platform based on 2D memtransistors. First, we experimentally demonstrate a low-power and compact s-bit generator circuit that exploits cycle-to-cycle fluctuation in the post-programmed conductance state of 2D memtransistors. Next, the s-bit generators are monolithically integrated with 2D memtransistor-based logic gates to implement BNs. Our findings highlight the potential of 2D memtransistor-based integrated circuits for non-von Neumann computing applications.
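A minimal software model of the s-bit primitive described above is sketched below: a generator that emits '1' with a programmable probability, chained to sample a toy two-node Bayesian network. The probabilities and the network are hypothetical, and the pseudo-random source merely stands in for the 2D memtransistor conductance fluctuations used in hardware.

```python
# Software model of an s-bit generator: emit '1' with a programmable
# probability p.  The hardware described above derives this randomness from
# memtransistor conductance fluctuations; here a pseudo-random source is used.
import random

def s_bit(p):
    """Return 1 with probability p, else 0."""
    return 1 if random.random() < p else 0

def s_bitstream(p, n):
    """A length-n stochastic bit-stream whose mean encodes p."""
    return [s_bit(p) for _ in range(n)]

# Hypothetical two-node Bayesian network: P(Rain)=0.2,
# P(WetGrass | Rain)=0.9, P(WetGrass | not Rain)=0.1.
def sample_wet_grass():
    rain = s_bit(0.2)
    return s_bit(0.9 if rain else 0.1)

if __name__ == "__main__":
    stream = s_bitstream(0.3, 10000)
    print(sum(stream) / len(stream))        # ~0.3
    samples = [sample_wet_grass() for _ in range(10000)]
    print(sum(samples) / len(samples))      # ~0.2*0.9 + 0.8*0.1 = 0.26
```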
  3. In this paper, we propose a Highly Flexible In-Memory (HieIM) computing platform using STT-MRAM, which can be leveraged to implement Boolean logic functions without sacrificing memory functionality. It can pre-process data within memory to further reduce the power-hungry, long-distance communication between memory and processing units found in the von Neumann computing system. HieIM can implement all the Boolean logic functions (AND/NAND, OR/NOR, XOR/XNOR) between any two cells in the same memory array, thus overcoming the 'operand locality' problem in contemporary in-memory computing platform designs. To investigate the performance of HieIM, we test in-memory bulk bitwise Boolean logic operations using different vector datasets, which shows ~8x energy saving and ~5x speedup compared to a recent DRAM-based in-memory computing platform. We further implement an in-memory data encryption engine design based on HieIM as another case study. With the AES algorithm, it shows 51.5% and 68.9% lower energy consumption compared to CMOS-ASIC and CMOL-based implementations, respectively.
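The behavioural sketch below illustrates the bulk bitwise capability HieIM describes: Boolean operations computed between any two rows of the same array, with the result stored back like ordinary data. The array model and its read/write interface are assumptions for illustration, not HieIM's circuit-level design.

```python
# Minimal behavioural model: Boolean operations (AND/OR/XOR and complements)
# between any two rows of one array, result written back as an ordinary row.
# The class and its API are illustrative, not HieIM's actual circuit.

class BitwiseArray:
    def __init__(self, rows, width):
        self.width = width
        self.rows = [0] * rows                      # each row is a width-bit word

    def write(self, r, value):
        self.rows[r] = value & ((1 << self.width) - 1)

    def op(self, r_a, r_b, r_out, func):
        """Compute func over rows r_a and r_b, store the result in row r_out."""
        mask = (1 << self.width) - 1
        self.write(r_out, func(self.rows[r_a], self.rows[r_b]) & mask)

AND  = lambda a, b: a & b
OR   = lambda a, b: a | b
XOR  = lambda a, b: a ^ b
NAND = lambda a, b: ~(a & b)

if __name__ == "__main__":
    mem = BitwiseArray(rows=4, width=8)
    mem.write(0, 0b10101100)
    mem.write(1, 0b01100110)
    mem.op(0, 1, 2, XOR)            # any two rows; result stored like normal data
    print(f"{mem.rows[2]:08b}")     # 11001010
```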
  4. In this paper, we propose an energy-efficient reconfigurable platform for in-memory processing based on novel 4-terminal spin Hall effect-driven domain wall motion devices that can be employed as both a non-volatile memory cell and an in-memory logic unit. The proposed designs unify memory and logic. Device-to-system-level simulation results show that, with a 28% area increase in the memory structure, the proposed in-memory processing platform achieves a write energy of ~15.6 fJ/bit, a 79% reduction compared to its SOT-MRAM counterpart, while keeping the identical 1 ns writing speed. In addition, the proposed in-memory logic scheme improves the operating energy by 61.3% compared with recent non-volatile in-memory logic designs. An extensive reliability analysis is also performed on the proposed circuits. We employ the Advanced Encryption Standard (AES) algorithm as a case study to elucidate the efficiency of the proposed platform at the application level. Simulation results show that the proposed platform achieves up to 75.7% and 30.4% lower energy consumption compared to CMOS-ASIC and recent pipelined domain wall (DW) AES implementations, respectively. In addition, the AES energy-delay product (EDP) shows 15.1% and 6.1% improvements compared to the DW-AES and CMOS-ASIC implementations, respectively.
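For reference, the energy-delay product cited above is simply energy multiplied by delay; the short sketch below computes the relative EDP improvement between two designs using placeholder numbers, not the paper's reported data.

```python
# Energy-Delay Product (EDP) figure of merit: energy * delay.  The improvement
# percentage compares two designs' EDPs.  Numbers below are placeholders.

def edp(energy_j, delay_s):
    return energy_j * delay_s

def improvement(edp_ref, edp_new):
    """Relative EDP improvement of the new design over the reference (%)."""
    return 100.0 * (edp_ref - edp_new) / edp_ref

if __name__ == "__main__":
    ref = edp(1.0e-9, 2.0e-9)     # hypothetical reference design
    new = edp(0.8e-9, 2.1e-9)     # hypothetical proposed design
    print(f"{improvement(ref, new):.1f}% lower EDP")  # 16.0%
```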