Title: IMCS2: Novel Device-to-Architecture Co-Design for Low-Power In-Memory Computing Platform Using Coterminous Spin Switch
Abstract: Spin switch (SS) is a promising spintronic device that exhibits compactness, low power, non-volatility, and input-output isolation by leveraging the giant spin Hall effect, spin-transfer torque, and dipolar coupling. In this paper, we propose a novel device-to-architecture co-design for an in-memory computing platform using the coterminous SS (IMCS2), which can simultaneously work as non-volatile memory and reconfigurable in-memory logic (AND/NAND, OR/NOR, and XOR/XNOR) without add-on logic circuits to the memory chip. The computed logic output can simply be read out like a normal magnetic random access memory bit cell using the shared memory peripheral circuits. Such intrinsic in-memory logic can be used to process data within memory, greatly reducing the power-hungry, long-distance data communication of the conventional von Neumann computing system. The IMCS2-based in-memory bulk bitwise Boolean vector operation shows ~9x energy saving and ~3x speedup compared with a DRAM-based in-memory computing platform. We further employ in-memory multiplication to evaluate the performance of the proposed in-memory computing platform for vector-vector multiplication with different vector sizes.
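To make the composition concrete, below is a minimal software sketch of how a bulk bitwise AND primitive can be assembled into bit-serial multiplication and then a vector-vector (dot) product. The row-wise AND model, function names, and bit widths are illustrative assumptions for exposition only, not the IMCS2 circuit or the paper's exact mapping.

```python
# Minimal software model of composing a bulk bitwise AND primitive into
# multiplication and a dot product. The row-wise AND stands in for the
# array-level operation; names and structure are illustrative assumptions,
# not the IMCS2 implementation.

def bulk_and(row_a, row_b):
    """Model of an in-memory bulk bitwise AND over two memory rows (bit lists)."""
    return [a & b for a, b in zip(row_a, row_b)]

def bit_serial_multiply(x, y, bits=8):
    """Multiply two unsigned integers using only AND plus shift-and-add."""
    product = 0
    x_bits = [(x >> j) & 1 for j in range(bits)]
    for i in range(bits):
        y_i = (y >> i) & 1
        # One bulk AND per bit of y: x AND (y_i replicated across the row)
        partial = bulk_and(x_bits, [y_i] * bits)
        product += sum(b << j for j, b in enumerate(partial)) << i
    return product

def dot_product(vec_a, vec_b, bits=8):
    """Vector-vector multiplication built from the bit-serial multiplier."""
    return sum(bit_serial_multiply(a, b, bits) for a, b in zip(vec_a, vec_b))

assert dot_product([3, 5, 7], [2, 4, 6]) == 3 * 2 + 5 * 4 + 7 * 6
```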
Award ID(s):
1740126
PAR ID:
10059778
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
IEEE Transactions on Magnetics
ISSN:
0018-9464
Page Range / eLocation ID:
1 to 14
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we explore the potential of leveraging a spin-based in-memory computing platform as an accelerator for Binary Convolutional Neural Networks (BCNNs). Such a platform can implement the dominant convolution computation using the presented Spin-Orbit Torque Magnetic Random Access Memory (SOT-MRAM) array. The proposed array architecture can simultaneously work as non-volatile memory and reconfigurable in-memory logic (AND, OR) without the add-on logic circuits to the memory chip required by conventional logic-in-memory designs. The computed logic output can also simply be read out like a normal MRAM bit cell using the shared memory peripheral circuits. We employ this intrinsic in-memory computing architecture to efficiently process data within memory, greatly reducing the power-hungry, long-distance data communication that burdens state-of-the-art BCNN hardware.
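As a rough illustration of why an AND-capable memory array suits binary convolution, the sketch below reduces a 1-D binary convolution to bulk bitwise AND plus a population count. The {0,1} weight/activation encoding, the packed-integer representation, and the helper names are assumptions made for this sketch, not the paper's exact dataflow.

```python
# Sketch: binary convolution as bitwise AND + popcount over packed bit-vectors.
# The {0,1} encoding and function names are illustrative assumptions.

def popcount(x):
    """Count set bits (the accumulation step after the in-memory AND)."""
    return bin(x).count("1")

def binary_dot(patch_bits, weight_bits):
    """Dot product of two {0,1} bit-vectors packed into integers."""
    return popcount(patch_bits & weight_bits)

def binary_conv1d(activations, weights, total_bits, window):
    """Slide a packed binary filter across a packed activation stream."""
    mask = (1 << window) - 1
    return [binary_dot((activations >> s) & mask, weights)
            for s in range(total_bits - window + 1)]

# 4-bit activation stream 1011 convolved with the 2-bit filter 11
assert binary_conv1d(0b1011, 0b11, total_bits=4, window=2) == [2, 1, 1]
```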
  2. In this paper we propose a Highly Flexible In-Memory (HieIM) computing platform using STT-MRAM, which can be leveraged to implement Boolean logic functions without sacrificing memory functionality. It can pre-process data within memory to further reduce the power-hungry, long-distance communication between memory and processing units found in von Neumann computing systems. HieIM can implement all the Boolean logic functions (AND/NAND, OR/NOR, XOR/XNOR) between any two cells in the same memory array, thus overcoming the `operand locality' problem of contemporary in-memory computing platform designs. To investigate the performance of HieIM, we test in-memory bulk bitwise Boolean logic operations using different vector datasets, which show ~8x energy saving and ~5x speedup compared to a recent DRAM-based in-memory computing platform. As another case study, we further implement an in-memory data encryption engine based on HieIM. With the AES algorithm, it shows 51.5% and 68.9% lower energy consumption compared to CMOS-ASIC and CMOL-based implementations, respectively.
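A functional model of the "any Boolean function between any two rows" behavior HieIM describes is sketched below. The array layout, word width, and function names are illustrative assumptions; in the actual platform the computation happens in the STT-MRAM sense and write circuitry rather than in software.

```python
# Minimal functional model of bulk bitwise logic between any two rows of a
# memory array. Layout and names are illustrative assumptions only.

WORD_MASK = 0xFF  # 8-bit words, an arbitrary choice for this sketch

def in_memory_logic(mem, row_a, row_b, op):
    """Compute a bitwise Boolean op between two arbitrary rows of `mem`."""
    ops = {
        "AND":  lambda a, b: a & b,
        "NAND": lambda a, b: ~(a & b) & WORD_MASK,
        "OR":   lambda a, b: a | b,
        "NOR":  lambda a, b: ~(a | b) & WORD_MASK,
        "XOR":  lambda a, b: a ^ b,
        "XNOR": lambda a, b: ~(a ^ b) & WORD_MASK,
    }
    return [ops[op](a, b) for a, b in zip(mem[row_a], mem[row_b])]

memory = [[0x3C, 0xA5], [0x0F, 0xF0]]
assert in_memory_logic(memory, 0, 1, "XOR") == [0x33, 0x55]
```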
  3. In-memory processing offers a promising solution for enhancing the performance of data-intensive applications. While analog in-memory computing demonstrates remarkable efficiency, its limited precision is suitable only for approximate computing tasks. In contrast, digital in-memory computing delivers the deterministic precision necessary to accelerate high-assurance applications. Current digital in-memory computing methods typically involve manually breaking down arithmetic operations into in-memory compute kernels, whereas traditional digital circuits are synthesized through intricate and automated design workflows. In this article, we introduce a logic synthesis framework called LOGIC, which facilitates the translation of high-level applications into digital in-memory compute kernels that can be executed using non-volatile memory. We propose techniques for decomposing element-wise arithmetic operations into in-memory kernels while minimizing the number of in-memory operations. Additionally, we optimize the sequence of in-memory operations to reduce non-volatile memory utilization. To address the NP-hard execution sequencing optimization problem, we have developed two look-ahead algorithms that offer practical solutions. We also leverage data layout reorganization to efficiently accelerate applications that rely heavily on sparse matrix-vector multiplication operations. Our experimental evaluations demonstrate that the proposed synthesis approach improves the area and latency of fixed-point multiplication by 84% and 20%, respectively, compared to the state-of-the-art. Moreover, when applied to scientific computing applications sourced from the SuiteSparse Matrix Collection, our design achieves remarkable improvements in area, latency, and energy efficiency by factors of 4.8×, 2.6×, and 11×, respectively.
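To give a flavor of what "decomposing element-wise arithmetic into in-memory compute kernels" can look like, the sketch below builds a 1-bit full adder from a single NOR kernel, in the spirit of MAGIC-style digital in-memory schemes. The choice of NOR as the kernel and the resulting gate counts are assumptions for illustration; LOGIC's actual kernel set, decomposition, and sequencing come from its synthesis flow.

```python
# Sketch: a full adder decomposed into calls to one assumed in-memory kernel (NOR).

def NOR(a, b):
    """The single in-memory kernel assumed here (one array operation)."""
    return 1 - (a | b)

def NOT(a):        return NOR(a, a)             # 1 kernel call
def OR(a, b):      return NOT(NOR(a, b))        # 2 kernel calls
def AND(a, b):     return NOR(NOT(a), NOT(b))   # 3 kernel calls
def XOR(a, b):
    n = NOR(a, b)
    return NOT(NOR(NOR(a, n), NOR(b, n)))       # 5 kernel calls

def full_adder(a, b, cin):
    """1-bit full adder expressed only through the NOR kernel."""
    p = XOR(a, b)
    s = XOR(p, cin)
    cout = OR(AND(a, b), AND(p, cin))
    return s, cout

# Exhaustive check of the decomposition
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, cout = full_adder(a, b, c)
            assert a + b + c == s + 2 * cout
```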
  4. In this paper, we propose an energy-efficient reconfigurable platform for in-memory processing based on novel 4-terminal spin Hall effect-driven domain wall motion devices that can be employed as both non-volatile memory cells and in-memory logic units. The proposed designs unify memory and logic. Device-to-system-level simulation results show that, with a 28% area increase in the memory structure, the proposed in-memory processing platform achieves a write energy of ~15.6 fJ/bit, a 79% reduction compared to its SOT-MRAM counterpart, while keeping an identical 1 ns write speed. In addition, the proposed in-memory logic scheme improves operating energy by 61.3% compared with recent non-volatile in-memory logic designs. An extensive reliability analysis is also performed on the proposed circuits. We employ the Advanced Encryption Standard (AES) algorithm as a case study to illustrate the efficiency of the proposed platform at the application level. Simulation results show that the proposed platform achieves up to 75.7% and 30.4% lower energy consumption than CMOS-ASIC and recent pipelined domain-wall (DW) AES implementations, respectively. In addition, the AES energy-delay product (EDP) improves by 15.1% and 6.1% compared to the DW-AES and CMOS-ASIC implementations, respectively.
  5. The HSC-FPGA offers an intriguing and feasible architecture for the next generation of configurable fabrics, embracing the advantages of both CMOS and beyond-CMOS technologies without requiring significant modification to the routing structure, programming paradigms, or synthesis tool-chain of commercial FPGAs. In the HSC-FPGA, the intrinsic characteristics of magnetic random access memory (MRAM) look-up table (LUT) circuits are used to implement sequential logic, while combinational logic circuits are implemented with static random access memory (SRAM) LUTs. Fabric-level simulation results show that the HSC-FPGA achieves at least 18%, 70%, and 15% reductions in area, standby power, and read power consumption, respectively, for various ISCAS-89 and ITC-99 benchmark circuits compared to conventional SRAM-based FPGAs. Power consumption can be decreased further by the power-gating enabled by the non-volatility of the MRAM-LUTs. Moreover, the benefits of increased heterogeneity for reconfigurable computing are extended by realizing probabilistic computing paradigms within the fabric, enabled by probabilistic spin logic devices. The combined strengths of technology heterogeneity and computing-paradigm heterogeneity in the proposed HSC-FPGA are leveraged to develop energy-efficient and reliability-aware training and evaluation circuits for deep belief networks with memristive crossbar arrays and p-bit-based probabilistic neurons.
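For context, the sketch below models the truth-table lookup that both MRAM-LUTs and SRAM-LUTs perform; at this functional level the storage technology (non-volatile vs. volatile) is invisible, which is what allows the two to coexist in one fabric. The class and example are illustrative assumptions, not the HSC-FPGA netlist or tool flow.

```python
# Sketch: a k-input LUT as a stored truth table, the common substrate behind
# both MRAM-LUTs and SRAM-LUTs. Names and structure are illustrative.

class LUT:
    def __init__(self, k, truth_table):
        assert len(truth_table) == 1 << k, "need one entry per input pattern"
        self.k = k
        self.table = truth_table  # held in MRAM or SRAM cells on the fabric

    def evaluate(self, inputs):
        """Concatenate the k input bits into an address and look it up."""
        addr = 0
        for bit in inputs:
            addr = (addr << 1) | bit
        return self.table[addr]

# A 2-input XOR mapped onto a LUT: table indexed by (a, b)
xor_lut = LUT(2, [0, 1, 1, 0])
assert xor_lut.evaluate([1, 0]) == 1
assert xor_lut.evaluate([1, 1]) == 0
```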