ParaPIM: a parallel processing-in-memory accelerator for binary-weight deep neural networks

Angizi, Shaahin; He, Zhezhi; Fan, Deliang

doi:10.1145/3287624.3287644

Recent algorithmic progress has achieved competitive classification accuracy even when neural networks are constrained to binary weights (+1/-1). These findings open substantial optimization opportunities: computationally intensive multiplications can be eliminated, and memory access and storage can be reduced. In this paper, we present the ParaPIM architecture, which transforms current Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) sub-arrays into massively parallel computational units capable of running inference for Binary-Weight Deep Neural Networks (BWNNs). ParaPIM's in-situ computing architecture can be leveraged to greatly reduce the energy consumed by convolutional layers, accelerate BWNN inference, eliminate unnecessary off-chip accesses, and provide ultra-high internal bandwidth. The device-to-architecture co-simulation results indicate ~4x higher energy efficiency and 7.3x speedup over recent processing-in-DRAM acceleration, or roughly 5x higher energy efficiency and 20.5x speedup over recent ASIC approaches, while maintaining inference accuracy comparable to baseline designs.
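
The abstract's point that binary weights remove the need for multiplications can be illustrated with a minimal Python sketch. This is not ParaPIM's in-memory method, which the abstract does not detail. It only shows the arithmetic idea that a dot product against +1/-1 weights reduces to additions and subtractions. The function names `binary_weight_dot` and `reference_dot` and the sample values are hypothetical, chosen for illustration.

```python
def binary_weight_dot(activations, weight_signs):
    """Dot product with weights restricted to +1/-1, using only add/subtract.

    Illustrative assumption: weights are given as Python ints +1 or -1.
    """
    if len(activations) != len(weight_signs):
        raise ValueError("activations and weight_signs must have equal length")
    acc = 0
    for a, s in zip(activations, weight_signs):
        if s == 1:
            acc += a   # weight +1: add the activation
        elif s == -1:
            acc -= a   # weight -1: subtract the activation
        else:
            raise ValueError("weights must be +1 or -1")
    return acc


def reference_dot(activations, weights):
    """Conventional multiply-accumulate, used only to check the result."""
    return sum(a * w for a, w in zip(activations, weights))


if __name__ == "__main__":
    acts = [3, -1, 4, 2]       # hypothetical activations
    signs = [1, -1, -1, 1]     # hypothetical binary weights
    print(binary_weight_dot(acts, signs))  # 2
    print(reference_dot(acts, signs))      # 2
```

Both functions return the same value for any +1/-1 weight vector. The first never multiplies, which is the property that makes binary-weight layers attractive for simple in-memory logic.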

Award ID(s): 1740126

PAR ID: 10094200

Author(s) / Creator(s): Angizi, Shaahin; He, Zhezhi; Fan, Deliang

Date Published: 2019-01-21

Journal Name: 24th Asia and South Pacific Design Automation Conference

Page Range / eLocation ID: 127 to 132

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3287624.3287644
