Title: Architecting SOT-RAM Based GPU Register File
As GPU register files (RFs) have grown in size, their power consumption has grown as well. Because the RF sits at the highest level of the cache hierarchy, building it from memories with high write latency and energy (e.g., spin-transfer torque RAM, STT-RAM) can lead to large energy loss. In this paper, we present a spin-orbit torque RAM (SOT-RAM) based RF design that provides higher energy efficiency than SRAM and STT-RAM RFs while matching the performance of an SRAM RF. To further improve the energy efficiency of the SOT-RAM based RF, we propose avoiding redundant bit-writes to the RF. Compared to an SRAM RF, the SOT-RAM RF saves 18.6% energy, and our technique for avoiding redundant writes increases the saving to 44.3% without harming performance.
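The abstract does not say how redundant bit-writes are detected. As a rough illustration only, the sketch below assumes a read-before-write comparison: the stored word is XORed with the incoming word, and only the differing bit positions are written. The function names, the 32-bit register width, and the list-based register file are all hypothetical and are not taken from the paper.

```python
def bits_to_write(old: int, new: int, width: int = 32) -> int:
    """Return a mask with a 1 at every bit position whose stored value must change."""
    word_mask = (1 << width) - 1
    return (old ^ new) & word_mask


def write_register(rf: list[int], idx: int, new: int, width: int = 32) -> int:
    """Update rf[idx] to `new`, flipping only the bits that differ.

    Returns the number of bit-writes actually performed, a simple
    proxy for write energy in this illustrative model (an assumption).
    """
    diff = bits_to_write(rf[idx], new, width)
    rf[idx] ^= diff  # flipping the differing bits yields the new value
    return bin(diff).count("1")


# Hypothetical usage: a tiny register file with one 32-bit register.
rf = [0b1010]
flipped = write_register(rf, 0, 0b1001)   # only the two low bits differ
repeated = write_register(rf, 0, 0b1001)  # same value again: nothing to write
```

In this model, writing a value that is already stored costs zero bit-writes, which is the effect the abstract attributes to avoiding redundant writes. The real design may detect and skip redundant bits differently.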
Award ID(s):
1657336
NSF-PAR ID:
10048264
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
IEEE Computer Society Annual Symposium on VLSI (ISVLSI)
Page Range / eLocation ID:
38 to 44
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Non-volatile memory (NVM) technologies such as spin-transfer torque magnetic random access memory (STT-MRAM) and spin-orbit torque magnetic random access memory (SOT-MRAM) have significant advantages compared to conventional SRAM due to their non-volatility, higher cell density, and scalability features. While previous work has investigated several architectural implications of NVM for generic applications, in this work we present DeepNVM, a framework to characterize, model, and analyze NVM-based caches in GPU architectures for deep learning (DL) applications by combining technology-specific circuit-level models and the actual memory behavior of various DL workloads. We present both iso-capacity and iso-area performance and energy analysis for systems whose last-level caches rely on conventional SRAM and emerging STT-MRAM and SOT-MRAM technologies. In the iso-capacity case, STT-MRAM and SOT-MRAM provide up to 4.2× and 5× energy-delay product (EDP) reduction and 2.4× and 3× area reduction compared to conventional SRAM, respectively. Under iso-area assumptions, STT-MRAM and SOT-MRAM provide 2.3× EDP reduction on average across all workloads when compared to SRAM. Our comprehensive cross-layer framework is demonstrated on STT-/SOT-MRAM technologies and can be used for the characterization, modeling, and analysis of any NVM technology for last-level caches in GPU platforms for deep learning applications.
  2. Spin Orbit Torque Magnetic RAM (SOT-MRAM) is emerging as a promising memory technology owing to its high endurance, reliability and speed. A critical factor for its success is the development of materials that exhibit efficient conversion of charge current to spin current, characterized by their spin Hall efficiency. In this work, it is experimentally demonstrated that the spin Hall efficiency of the industrially relevant ultra-thin Ta can be enhanced by more than 25× when a monolayer (ML) WSe₂ is inserted as an underlayer. The enhancement is attributed to spin absorption at the Ta/WSe₂ interface, suggested by harmonic Hall measurements. The presented hybrid spin Hall stack with a 2D WSe₂ underlayer has a total body thickness of less than 2 nm and exhibits greatly enhanced spin Hall efficiency, which makes this hybrid a promising candidate for energy efficient SOT-MRAM.

     
  3. Magneto-Electric FET (MEFET) is a recently developed post-CMOS FET, which offers intriguing characteristics for high-speed and low-power design in both logic and memory applications. In this article, we present MeF-RAM, a non-volatile cache memory design based on 2-Transistor-1-MEFET (2T1M) memory bit-cell with separate read and write paths. We show that with proper co-design across MEFET device, memory cell circuit, and array architecture, MeF-RAM is a promising candidate for fast non-volatile memory (NVM). To evaluate its cache performance in the memory system, we, for the first time, build a device-to-architecture cross-layer evaluation framework to quantitatively analyze and benchmark the MeF-RAM design with other memory technologies, including both volatile memory (i.e., SRAM, eDRAM) and other popular non-volatile emerging memory (i.e., ReRAM, STT-MRAM, and SOT-MRAM). The experiment results for the PARSEC benchmark suite indicate that, as an L2 cache memory, MeF-RAM reduces Energy Area Latency (EAT) product on average by ~98% and ~70% compared with typical 6T-SRAM and 2T1R SOT-MRAM counterparts, respectively.
  4. Prior studies have shown that the retention time of the non-volatile spin-transfer torque RAM (STT-RAM) can be relaxed in order to reduce STT-RAM's write energy and latency. However, since different applications may require different retention times, STT-RAM retention times must be critically explored to satisfy various applications' needs. This process can be challenging due to exploration overhead, and exacerbated by the fact that STT-RAM caches are emerging and are not readily available for design time exploration. This paper explores using known and easily obtainable statistics (e.g., SRAM statistics) to predict the appropriate STT-RAM retention times, in order to minimize exploration overhead. We propose an STT-RAM Cache Retention Time (SCART) model, which utilizes machine learning to enable design time or runtime prediction of right-provisioned STT-RAM retention times for latency or energy optimization. Experimental results show that, on average, SCART can reduce the latency and energy by 20.34% and 29.12%, respectively, compared to a homogeneous retention time while reducing the exploration overheads by 52.58% compared to prior work. 
  5. The emergence of embedded magnetic random-access memory (MRAM) and its integration in mainstream semiconductor manufacturing technology have created an unprecedented opportunity for engineering computing systems with improved performance, energy efficiency, lower cost, and unconventional computing capabilities. While the initial interest in the existing generation of MRAM—which is based on the spin-transfer torque (STT) effect in ferromagnetic tunnel junctions—was driven by its nonvolatile data retention and lower cost of integration compared to embedded Flash (eFlash), the focus of MRAM research and development efforts is increasingly shifting toward alternative write mechanisms (beyond STT) and new materials (beyond ferromagnets) in recent years. This has been driven by the need for better speed vs density and speed vs endurance trade-offs to make MRAM applicable to a wider range of memory markets, as well as to utilize the potential of MRAM in various unconventional computing architectures that utilize the physics of nanoscale magnets. In this Perspective, we offer an overview of spin–orbit torque (SOT) as one of these beyond-STT write mechanisms for the MRAM devices. We discuss, specifically, the progress in developing SOT-MRAM devices with perpendicular magnetization. Starting from basic symmetry considerations, we discuss the requirement for an in-plane bias magnetic field which has hindered progress in developing practical SOT-MRAM devices. We then discuss several approaches based on structural, magnetic, and chiral symmetry-breaking that have been explored to overcome this limitation and realize bias-field-free SOT-MRAM devices with perpendicular magnetization. We also review the corresponding material- and device-level challenges in each case. We then present a perspective of the potential of these devices for computing and security applications beyond their use in the conventional memory hierarchy. 