
Title: Accelerating Low Bit-Width Deep Convolution Neural Network in MRAM
Deep Convolutional Neural Networks (CNNs) have achieved outstanding performance in image recognition over large-scale datasets. However, the pursuit of higher inference accuracy leads to CNN architectures with deeper layers and denser connections, which inevitably makes their hardware implementations demand ever more memory and computational resources; this can be interpreted as a 'CNN power and memory wall'. Recent research efforts have significantly reduced both model size and computational complexity by using low bit-width weights, activations and gradients, while keeping reasonably good accuracy. In this work, we present different emerging nonvolatile Magnetic Random Access Memory (MRAM) designs that could be leveraged to implement a 'bit-wise in-memory convolution engine', which simultaneously stores network parameters and computes low bit-width convolutions. Such a computing model leverages the 'in-memory computing' concept to accelerate CNN inference and reduce convolution energy consumption, thanks to the intrinsic logic-in-memory design and the reduction of data communication.
Journal Name: 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)
Page Range / eLocation ID: 533 to 538
Medium: X
Sponsoring Org: National Science Foundation
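
The core arithmetic behind such a bit-wise convolution engine can be illustrated in software: a low bit-width dot product decomposes into AND-plus-popcount operations over bit-planes, which is exactly the kind of primitive an in-memory array can evaluate in parallel. The sketch below is a minimal NumPy illustration of that decomposition, not the paper's MRAM implementation; the bit-widths and vector sizes are arbitrary.

```python
import numpy as np

def bitplane(x, bit):
    """Extract one bit-plane of an unsigned integer vector (0/1 values)."""
    return (x >> bit) & 1

def bitwise_dot(a, w, a_bits=4, w_bits=2):
    """Dot product of unsigned low bit-width vectors computed purely from
    bit-wise AND and popcount over bit-planes -- the arithmetic a bit-wise
    in-memory convolution engine would evaluate inside the array."""
    acc = 0
    for i in range(a_bits):
        for j in range(w_bits):
            # AND two bit-planes, then take a population count, then shift.
            acc += (1 << (i + j)) * int(np.sum(bitplane(a, i) & bitplane(w, j)))
    return acc

# Quick check against a regular integer dot product.
rng = np.random.default_rng(0)
a = rng.integers(0, 16, size=64)   # 4-bit activations
w = rng.integers(0, 4, size=64)    # 2-bit weights
assert bitwise_dot(a, w) == int(np.dot(a, w))
```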
More Like this
  1. In this paper, we pave a novel way towards the concept of a bit-wise In-Memory Convolution Engine (IMCE) that can perform the dominant convolution computation of Deep Convolutional Neural Networks (CNNs) within memory. IMCE employs parallel computational memory sub-arrays as its fundamental units, based on our proposed Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) design. We then propose an accelerator system architecture based on IMCE to efficiently process low bit-width CNNs. This architecture can greatly reduce the energy consumption of convolutional layers and also accelerate CNN inference. Device-to-architecture co-simulation results show that the proposed system architecture can process low bit-width AlexNet on the ImageNet dataset at 785.25 μJ/img, consuming ~3× less energy than a recent RRAM-based counterpart, while the chip area is ~4× smaller.
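
As a rough software analogue of how a convolution is laid out for parallel computational sub-arrays, the following sketch lowers a single-channel convolution to patch-wise dot products (im2col). It only illustrates the data layout, not the SOT-MRAM sub-array design or the IMCE dataflow described in item 1.

```python
import numpy as np

def im2col(x, k):
    """Unroll k x k patches of a single-channel input so convolution becomes
    one dot product per output pixel (a layout that parallel computational
    sub-arrays can consume row by row)."""
    H, W = x.shape
    cols = []
    for r in range(H - k + 1):
        for c in range(W - k + 1):
            cols.append(x[r:r + k, c:c + k].ravel())
    return np.stack(cols)              # shape: (num_outputs, k*k)

def conv_as_matmul(x, kernel):
    """Convolution expressed as a matrix-vector product over unrolled patches."""
    k = kernel.shape[0]
    patches = im2col(x, k)
    out = patches @ kernel.ravel()     # each row could map to one sub-array op
    H, W = x.shape
    return out.reshape(H - k + 1, W - k + 1)

x = np.arange(25, dtype=np.int64).reshape(5, 5)
kern = np.ones((3, 3), dtype=np.int64)
print(conv_as_matmul(x, kern))
```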
  2. With the success of deep neural networks (DNNs), many recent works have focused on developing hardware accelerators for power- and resource-limited embedded systems via model compression techniques such as quantization, pruning, and low-rank approximation. However, almost all existing DNN structures are fixed after deployment; they lack a runtime-adaptive structure that can adapt to dynamic hardware resources, power budgets, throughput requirements, and workloads, and correspondingly there is no runtime-adaptive hardware platform to support dynamic DNN structures. To address this problem, we first propose a channel-adaptive deep neural network (CA-DNN) that can adjust the number of active convolution channels (i.e., model size and computing load) at run time, i.e., at the inference stage without retraining, to dynamically trade off power, speed, computing load, and accuracy. Further, we use knowledge distillation to optimize the model and quantize it to 8 bits and 16 bits, respectively, for hardware-friendly mapping. We test the proposed model on the CIFAR-10 and ImageNet datasets using ResNet; compared with individual models of the same size, our CA-DNN achieves better accuracy. Moreover, to the best of our knowledge, we are the first to propose a Processing-in-Memory accelerator for such adaptive neural network structures based on Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) computational adaptive sub-arrays. We then comprehensively analyze the trade-off between accuracy and hardware parameters (e.g., energy, memory, and area overhead) for models with different channel widths.
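
A minimal sketch of the runtime channel-adaptation idea from item 2, assuming a plain NumPy 1x1 convolution: the full weight bank is stored once, and inference uses only a leading slice of channels chosen at run time. Function names and shapes are illustrative, not from the paper, and no retraining or knowledge distillation is modeled here.

```python
import numpy as np

def conv1x1(x, w):
    """Pointwise (1x1) convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.tensordot(w, x, axes=([1], [0]))

def adaptive_forward(x, w, active_out, active_in):
    """Run the same stored weights with only the first `active_out` output
    channels and `active_in` input channels, trading accuracy for compute
    and energy at inference time without retraining."""
    w_sub = w[:active_out, :active_in]
    return conv1x1(x[:active_in], w_sub)

rng = np.random.default_rng(1)
x = rng.standard_normal((64, 8, 8))       # full 64 input channels
w = rng.standard_normal((128, 64))        # full 1x1 weight bank
y_full = adaptive_forward(x, w, 128, 64)  # full-width path
y_slim = adaptive_forward(x, w, 32, 16)   # slimmed path under a tight power budget
print(y_full.shape, y_slim.shape)         # (128, 8, 8) (32, 8, 8)
```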
  3. Deep convolutional neural networks (DNNs) have demonstrated phenomenal success and are widely used in many computer vision tasks. However, their enormous model size and high computational complexity prohibit wide deployment into resource-limited embedded systems, such as FPGAs and mGPUs. As the two most widely adopted model compression techniques, weight pruning and quantization compress a DNN model by introducing weight sparsity (i.e., forcing a fraction of the weights to zero) and by quantizing weights into limited bit-width values, respectively. Although there are works attempting to combine weight pruning and quantization, we still observe disharmony between them, especially when more aggressive compression schemes (e.g., structured pruning and low bit-width quantization) are used. In this work, taking FPGA as the test computing platform and Processing Elements (PEs) as the basic parallel computing units, we first propose a PE-wise structured pruning scheme that introduces weight sparsification while taking the PE architecture into account. In addition, we integrate it with an optimized weight ternarization approach that quantizes weights into ternary values ({-1, 0, +1}), thus converting the dominant convolution operations in the DNN from multiplication-and-accumulation (MAC) to addition only, and compressing the original model (from 32-bit floating point to 2-bit ternary representation) by at least 16×. We then investigate and solve the coexistence issue between PE-wise structured pruning and ternarization by proposing a Weight Penalty Clipping (WPC) technique with a self-adapting threshold. Our experiments show that the fusion of the proposed techniques achieves a state-of-the-art ∼21× PE-wise structured compression rate with merely 1.74%/0.94% (top-1/top-5) accuracy degradation for ResNet-18 on the ImageNet dataset.
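
To make the ternarization step in item 3 concrete, here is a small NumPy sketch: weights are mapped to {-1, 0, +1} with a simple magnitude threshold (a common heuristic, not the paper's WPC scheme or its self-adapting threshold), after which a dot product needs only additions and subtractions.

```python
import numpy as np

def ternarize(w, delta=None):
    """Quantize weights to {-1, 0, +1}. A common heuristic threshold is
    0.7 * mean(|w|); the paper's WPC / self-adapting threshold is not
    reproduced here."""
    if delta is None:
        delta = 0.7 * np.mean(np.abs(w))
    t = np.zeros_like(w, dtype=np.int8)
    t[w > delta] = 1
    t[w < -delta] = -1
    return t

def ternary_dot(x, t):
    """With ternary weights the MAC reduces to additions and subtractions."""
    return x[t == 1].sum() - x[t == -1].sum()

rng = np.random.default_rng(2)
w = rng.standard_normal(256)
x = rng.standard_normal(256)
t = ternarize(w)
# Addition-only dot product vs. a float reference using the same ternary weights.
assert np.isclose(ternary_dot(x, t), np.dot(x, t.astype(np.float64)))
```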
  4. In this paper, we propose a new dynamic reliability technique using an accuracy-reconfigurable stochastic computing (ARSC) framework for deep learning computing. Unlike conventional stochastic computing, which makes the accuracy/power/energy trade-off at design time, the new ARSC design can adjust the bit-width of the data at run time. Hence, ARSC can mitigate long-term aging effects by slowing the system clock frequency while maintaining inference throughput by reducing the data bit-width, at a small cost in accuracy. We show how to implement the recently proposed counter-based SC multiplication and bit-width reduction on a layer-wise quantization scheme for CNNs with dynamic fixed-point data. We validate an ARSC-based five-layer convolutional neural network design for the MNIST dataset using Vivado HLS with constraints from the Xilinx Zynq-7000 family xc7z045 platform. Experimental results show that the new ARSC DNN can compensate for NBTI-induced aging effects over 10 years with marginal classification accuracy loss while maintaining, or even exceeding, the pre-aging computing throughput. At the same time, the proposed ARSC computing framework also reduces active power consumption due to frequency scaling, which can further improve system reliability through reduced temperature.
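
Item 4's runtime bit-width adjustment can be illustrated with a layer-wise dynamic fixed-point quantizer; the sketch below shows how shrinking the bit-width at run time trades accuracy for cheaper arithmetic. It is a plain NumPy approximation, not the counter-based stochastic-computing hardware described in the paper, and the scaling rule is an assumption for illustration.

```python
import numpy as np

def quantize_dynamic_fixed_point(x, bits):
    """Layer-wise dynamic fixed point: pick a shared fractional length from the
    layer's dynamic range, then round to `bits`-bit signed integers.
    Reducing `bits` at run time lowers arithmetic cost at a small accuracy cost."""
    frac_bits = bits - 1 - int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-12)))
    scale = 2.0 ** frac_bits
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x * scale), -qmax - 1, qmax)
    return q / scale, frac_bits

rng = np.random.default_rng(3)
acts = rng.standard_normal(1024)
for bits in (16, 8, 6):
    approx, fb = quantize_dynamic_fixed_point(acts, bits)
    err = np.max(np.abs(approx - acts))
    print(f"{bits}-bit (fractional bits={fb}): max abs error {err:.4f}")
```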
  5. In this paper, an energy-efficient and high-speed comparator-based processing-in-memory accelerator (CMP-PIM) is proposed to efficiently execute a novel hardware-oriented comparator-based deep neural network called CMPNET. Inspired by the local binary pattern feature extraction method combined with depthwise separable convolution, we first modify the existing Convolutional Neural Network (CNN) algorithm by replacing the computationally intensive multiplications in convolution layers with more efficient and less complex comparison and addition operations. We then propose CMP-PIM, which employs parallel computational memory sub-arrays as its fundamental processing units based on SOT-MRAM. We compare the performance of the CMP-PIM accelerator on different datasets with recent CNN accelerator designs. With comparable inference accuracy on the SVHN dataset, CMP-PIM achieves ∼94× and 3× better energy efficiency than CNN and Local Binary CNN (LBCNN) accelerators, respectively. Besides, it achieves a 4.3× speed-up over the CNN baseline with an identical network configuration.
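
Finally, as a hedged illustration of item 5's comparator-based idea, the sketch below computes a local-binary-pattern-style feature map in which every neighbour is only compared with the patch centre and the outcomes are accumulated by addition; the actual CMPNET operator and its CMP-PIM mapping differ, so treat this purely as an analogy for multiplication-free convolution.

```python
import numpy as np

def comparator_patch(patch):
    """Local-binary-pattern-style descriptor for one k x k patch: each neighbour
    is compared with the centre pixel and the 0/1 outcomes are accumulated
    with additions only -- no multiplication involved."""
    k = patch.shape[0]
    center = patch[k // 2, k // 2]
    bits = (patch >= center).astype(np.int32)
    bits[k // 2, k // 2] = 0          # the centre does not compare with itself
    return bits.sum()                 # addition-only reduction

def comparator_feature_map(x, k=3):
    """Slide the comparator descriptor over a single-channel image."""
    H, W = x.shape
    out = np.zeros((H - k + 1, W - k + 1), dtype=np.int32)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = comparator_patch(x[r:r + k, c:c + k])
    return out

rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(8, 8))
print(comparator_feature_map(img))
```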