System and Design Technology Co-Optimization of SOT-MRAM for High-Performance AI Accelerator Memory System

Mishty, Kaniz; Sadi, Mehdi

doi:10.1109/TCAD.2023.3333754

Citation Details

System and Design Technology Co-Optimization of SOT-MRAM for High-Performance AI Accelerator Memory System

System on chips (SoCs) are now designed with their own artificial intelligence (AI) accelerator segment to accommodate the ever-increasing demand of deep learning (DL) applications. With powerful multiply and accumulate (MAC) engines for matrix multiplications, these accelerators show high computing performance. However, because of limited memory resources (i.e., bandwidth and capacity), they fail to achieve optimum system performance during large batch training and inference. In this work, we propose a memory system with high on-chip capacity and bandwidth to shift the gear of AI accelerators from memory-bound to achieving system-level peak performance. We develop the memory system with design technology co-optimization (DTCO)-enabled customized spin-orbit torque (SOT)-MRAM as large on-chip memory through system technology co-optimization (STCO) and detailed characterization of the DL workloads. Our workload-aware memory system achieves 8× energy and 9× latency improvement on computer vision (CV) benchmarks in training and 8× energy and 4.5× latency improvement on natural language processing (NLP) benchmarks in training while consuming only around 50% of SRAM area at iso-capacity. more »

Award ID(s):: 2153394

PAR ID:: 10504648

Author(s) / Creator(s):: Mishty, Kaniz; Sadi, Mehdi

Publisher / Repository:: IEEE

Date Published:: 2024-04-01

Journal Name:: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Volume:: 43

Issue:: 4

ISSN:: 0278-0070

Page Range / eLocation ID:: 1065 to 1078

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Journal Article:
https://doi.org/10.1109/TCAD.2023.3333754

More Like this