Energy-Efficient LSTM Inference Accelerator for Real-Time Causal Prediction

Chen, Zhe; Blair, Hugh T.; Cong, Jason

doi:10.1145/3495006

Ever-growing edge applications often require short processing latency and high energy efficiency to meet strict timing and power budgets. In this work, we propose that a compact long short-term memory (LSTM) model can approximate conventional acausal algorithms with reduced latency and improved efficiency for real-time causal prediction, especially for neural signal processing in closed-loop feedback applications. We design an LSTM inference accelerator that takes advantage of fine-grained parallelism and pipelined feedforward and recurrent updates. We also propose a bit-sparse quantization method that reduces circuit area and power consumption by replacing multipliers with bit-shift operators. We explore different combinations of pruning and quantization methods for energy-efficient LSTM inference on datasets collected from electroencephalogram (EEG) and calcium image processing applications. Evaluation results show that our proposed LSTM inference accelerator achieves an energy efficiency of 1.19 GOPS/mW. The LSTM accelerator with 2-sbit/16-bit sparse quantization and 60% sparsity reduces circuit area and power consumption by 54.1% and 56.3%, respectively, compared with a 16-bit baseline implementation.
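The idea of replacing multipliers with bit-shift operators can be illustrated with a small sketch. The abstract does not specify the quantization scheme, so everything here is an assumption made for illustration: the function names (`bit_sparse_quantize`, `shift_multiply`, `dequantize`), the 8 fractional bits, and the greedy rule of keeping only the `k` most significant set bits of a 16-bit fixed-point weight. The method in the paper may differ.

```python
# Illustrative sketch only, not the paper's method: a weight is stored as a
# sign plus at most k bit positions, so multiplying by it needs only shifts
# and adds instead of a full multiplier.

def bit_sparse_quantize(w, k=2, total_bits=16, frac_bits=8):
    """Quantize w to signed fixed point, then keep its k highest set bits."""
    q = int(round(w * (1 << frac_bits)))
    limit = (1 << (total_bits - 1)) - 1
    q = max(-limit, min(limit, q))       # clip to the signed 16-bit range
    sign = -1 if q < 0 else 1
    mag = abs(q)
    positions = []
    while mag and len(positions) < k:
        p = mag.bit_length() - 1         # index of the highest remaining set bit
        positions.append(p)
        mag -= 1 << p
    return sign, positions               # lower bits are truncated (assumption)

def shift_multiply(x, sign, positions):
    """Multiply integer x by the quantized weight using shifts and adds."""
    acc = 0
    for p in positions:
        acc += x << p
    return sign * acc                    # result is scaled by 2**frac_bits

def dequantize(sign, positions, frac_bits=8):
    """Real value represented by (sign, positions)."""
    return sign * sum(1 << p for p in positions) / (1 << frac_bits)

sign, positions = bit_sparse_quantize(-0.3, k=2)
approx = dequantize(sign, positions)
product = shift_multiply(5, sign, positions)
```

With `k=2`, each weight costs at most two shifts and one add per input. The accuracy lost by dropping the lower bits is the trade-off that the abstract's pruning and quantization exploration evaluates.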

Award ID(s): 1707408

PAR ID: 10337228

Author(s) / Creator(s): Chen, Zhe; Blair, Hugh T.; Cong, Jason

Date Published: 2022-09-30

Journal Name: ACM Transactions on Design Automation of Electronic Systems

Volume: 27

Issue: 5

ISSN: 1084-4309

Page Range / eLocation ID: 1 to 19

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1145/3495006
