Title: RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition
Recurrent neural network (RNN)-based automatic speech recognition has become promising and important on mobile devices such as smartphones. However, previous RNN compression techniques either suffer from hardware performance overhead due to irregularity or from significant accuracy loss due to the regularity preserved for hardware friendliness. In this work, we propose RTMobile, which leverages both a novel block-based pruning approach and compiler optimizations to accelerate RNN inference on mobile devices. RTMobile is the first work to achieve real-time RNN inference on mobile platforms. Experimental results demonstrate that RTMobile significantly outperforms existing RNN hardware acceleration methods in both inference accuracy and inference time. Compared with prior work on FPGA, RTMobile running a GRU on the Adreno 640 embedded GPU improves energy efficiency by 40× while maintaining the same inference time.
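The block-based pruning idea can be illustrated with a minimal sketch (the block size, Frobenius-norm criterion, and sparsity target below are illustrative assumptions, not RTMobile's tuned settings): tile the weight matrix into fixed-size blocks, score each block by magnitude, and zero out the weakest blocks, leaving a structure regular enough for compiler-generated code.

```python
# Minimal sketch of block-based weight pruning (illustrative, not
# RTMobile's exact algorithm): blocks are ranked by Frobenius norm and
# the lowest-scoring fraction is zeroed out.
import numpy as np

def block_prune(W, block=(4, 4), sparsity=0.75):
    """Zero out the `sparsity` fraction of blocks with the smallest norms."""
    rows, cols = W.shape
    bh, bw = block
    assert rows % bh == 0 and cols % bw == 0, "blocks must tile the matrix"
    # View W as a grid of (bh x bw) blocks and score each block.
    grid = W.reshape(rows // bh, bh, cols // bw, bw)
    scores = np.linalg.norm(grid, axis=(1, 3))      # one score per block
    k = int(sparsity * scores.size)                 # number of blocks to drop
    threshold = np.partition(scores.ravel(), k)[k]
    mask = (scores >= threshold)[:, None, :, None]  # broadcast over blocks
    return (grid * mask).reshape(rows, cols)

W = np.random.randn(16, 16).astype(np.float32)
W_pruned = block_prune(W)
print(f"kept {np.count_nonzero(W_pruned) / W.size:.0%} of the weights")
```

Because whole blocks survive or vanish together, the nonzero pattern stays coarse-grained, which is what lets the compiler generate dense inner loops over the surviving blocks.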
Award ID(s):
2034169 1948447 2303820
PAR ID:
10158380
Journal Name:
The 57th Annual Design Automation Conference (DAC 2020)
Page Range / eLocation ID:
1 to 6
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recurrent Neural Networks (RNNs) are becoming increasingly important for time-series applications that require efficient, real-time implementations. The two major types are Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks. Real-time, efficient, and accurate hardware RNN implementations are challenging because of high sensitivity to accumulated imprecision and the need for special activation function implementations. Recently, two works have focused on FPGA implementations of the inference phase of LSTM RNNs with model compression. The first, ESE, uses a weight-pruning-based compressed RNN model but suffers from an irregular network structure after pruning. The second, C-LSTM, mitigates the irregularity limitation by incorporating block-circulant matrices for weight matrix representation in RNNs, thereby achieving simultaneous model compression and acceleration. A key limitation of the prior works is the lack of a systematic design optimization framework spanning the RNN model and hardware implementations, especially when the block size (or compression ratio) must be jointly optimized with RNN type, layer size, etc. In this paper, we adopt the block-circulant matrix-based framework and present the Efficient RNN (E-RNN) framework for FPGA implementations of the Automatic Speech Recognition (ASR) application. The overall goal is to improve performance and energy efficiency under an accuracy requirement. We use the alternating direction method of multipliers (ADMM) technique for more accurate block-circulant training, and present two design explorations providing guidance on block size and on reducing RNN training trials. Based on these two observations, we decompose E-RNN into two phases: Phase I determines the RNN model to reduce computation and storage subject to the accuracy requirement, and Phase II covers hardware implementation given the RNN model, including processing element design/optimization, quantization, and activation implementation. Experimental results on actual FPGA deployments show that E-RNN achieves a maximum energy efficiency improvement of 37.4× compared with ESE, and more than 2× compared with C-LSTM, under the same accuracy.
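The advantage of the block-circulant representation can be sketched in a few lines (the shapes and the dense cross-check below are illustrative, not E-RNN's FPGA datapath): each b×b circulant block is stored as a single length-b vector and applied via FFTs, reducing the per-block matrix-vector cost from O(b²) to O(b log b) and the storage from b² to b values.

```python
# Sketch of a block-circulant matrix-vector product (illustrative).
import numpy as np

def block_circulant_matvec(C, x):
    """C: (p, q, b) first columns of the circulant blocks; x: (q*b,)."""
    p, q, b = C.shape
    X = np.fft.fft(x.reshape(q, b), axis=1)   # FFT of each input segment
    Cf = np.fft.fft(C, axis=2)                # eigenvalues of each block
    Y = (Cf * X[None, :, :]).sum(axis=1)      # accumulate over column blocks
    return np.fft.ifft(Y, axis=1).real.reshape(p * b)

def circulant(c):
    """Dense circulant matrix with first column c (for checking only)."""
    return np.stack([np.roll(c, s) for s in range(len(c))], axis=1)

p, q, b = 2, 3, 8
C = np.random.randn(p, q, b)
x = np.random.randn(q * b)
dense = np.block([[circulant(C[i, j]) for j in range(q)] for i in range(p)])
assert np.allclose(dense @ x, block_circulant_matvec(C, x))
```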
  2. It is appealing but challenging to achieve real-time deep neural network (DNN) inference on mobile devices, because even powerful modern mobile devices are considered "resource-constrained" when executing large-scale DNNs. This necessitates sparse model inference via weight pruning, i.e., DNN weight sparsity, and it is desirable to design a new DNN weight sparsity scheme that facilitates real-time inference on mobile devices while preserving high sparse-model accuracy. This paper designs a novel mobile inference acceleration framework, GRIM, that is General to both convolutional neural networks (CNNs) and recurrent neural networks (RNNs) and that achieves Real-time execution and high accuracy, leveraging fine-grained structured sparse model Inference and compiler optimizations for Mobiles. We start by proposing a new fine-grained structured sparsity scheme through Block-based Column-Row (BCR) pruning. Based on this new fine-grained structured sparsity, our GRIM framework consists of two parts: (a) compiler optimization and code generation for real-time mobile inference; and (b) BCR pruning optimizations for determining pruning hyperparameters and performing weight pruning. We compare GRIM with Alibaba MNN, TVM, TensorFlow-Lite, a sparse implementation based on CSR, PatDNN, and ESE (a representative FPGA inference acceleration framework for RNNs), and achieve up to 14.08× speedup.
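A hedged sketch of the BCR idea follows (the block size and keep ratios are illustrative hyperparameters; GRIM derives its own automatically): within every block, whole columns and whole rows are removed by L2 norm, so the surviving weights remain regular enough for compiler vectorization.

```python
# Illustrative Block-based Column-Row (BCR) pruning (not GRIM's tuned
# procedure): prune weak columns and rows inside each block independently.
import numpy as np

def bcr_prune_block(B, col_keep=0.5, row_keep=0.5):
    """Zero the weakest columns and rows of a single block."""
    col_norms = np.linalg.norm(B, axis=0)
    row_norms = np.linalg.norm(B, axis=1)
    col_cut = np.quantile(col_norms, 1.0 - col_keep)
    row_cut = np.quantile(row_norms, 1.0 - row_keep)
    # Keep an entry only if both its row and its column survive.
    mask = np.outer(row_norms >= row_cut, col_norms >= col_cut)
    return B * mask

def bcr_prune(W, block=16, col_keep=0.5, row_keep=0.5):
    W = W.copy()
    for i in range(0, W.shape[0], block):
        for j in range(0, W.shape[1], block):
            W[i:i + block, j:j + block] = bcr_prune_block(
                W[i:i + block, j:j + block], col_keep, row_keep)
    return W

W_pruned = bcr_prune(np.random.randn(64, 64))
```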
  3. Weight pruning is an effective model compression technique for tackling the challenge of achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to accuracy degradation, difficulty in leveraging hardware acceleration, and/or restriction to certain types of DNN layers. In this article, we propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations that are applicable to any type of DNN layer while achieving high accuracy and hardware inference performance. With the flexibility of applying different pruning schemes to different layers enabled by our compiler optimizations, we further probe the new problem of determining the best-suited pruning scheme for each layer, given the different acceleration and accuracy behavior of the various schemes. Two pruning scheme mapping methods, one search-based and the other rule-based, are proposed to automatically derive the best-suited pruning regularity and block size for each layer of any given DNN. Experimental results demonstrate that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework with up to 2.48× and 1.73× DNN inference acceleration on the CIFAR-10 and ImageNet datasets without accuracy loss.
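To make the search-based mapping concrete, here is a toy sketch (the candidate schemes, the profiling callbacks latency_of and accuracy_proxy, and the accuracy bound are hypothetical placeholders, not the paper's procedure): for each layer, keep the fastest (scheme, block size) candidate whose estimated accuracy stays above a threshold.

```python
# Hypothetical search-based pruning-scheme mapping. latency_of(...) and
# accuracy_proxy(...) stand in for profiling runs the real system would do.
from itertools import product

SCHEMES = ["block", "column", "row", "unstructured"]  # candidate regularities
BLOCK_SIZES = [4, 8, 16]

def map_pruning_schemes(layers, latency_of, accuracy_proxy, min_acc=0.99):
    """Pick, per layer, the fastest candidate that keeps accuracy high enough."""
    mapping = {}
    for layer in layers:
        candidates = [
            (scheme, size)
            for scheme, size in product(SCHEMES, BLOCK_SIZES)
            if accuracy_proxy(layer, scheme, size) >= min_acc
        ]
        mapping[layer] = min(
            candidates, key=lambda c: latency_of(layer, c[0], c[1]))
    return mapping
```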
  4. Existing machine learning inference-serving systems largely rely on hardware scaling by adding more devices or using more powerful accelerators to handle increasing query demands. However, hardware scaling might not be feasible for fixed-size edge clusters or private clouds due to their limited hardware resources. A viable alternative is accuracy scaling, which adapts the accuracy of ML models instead of hardware resources to handle varying query demands. This work studies the design of a high-throughput inference-serving system with accuracy scaling that can meet throughput requirements while maximizing accuracy. To achieve this goal, the work proposes to identify the right amount of accuracy scaling by jointly optimizing three sub-problems: how to select model variants, how to place them on heterogeneous devices, and how to assign query workloads to each device. It also proposes a new adaptive batching algorithm to handle variations in query arrival times and minimize SLO violations. Based on the proposed techniques, we build an inference-serving system called Proteus and empirically evaluate it on real-world and synthetic traces. We show that Proteus reduces accuracy drop by up to 3× and latency timeouts by 2-10× with respect to baseline schemes, while meeting throughput requirements.
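The adaptive batching idea can be sketched as follows (the queue layout, the profiled exec_time_s cost model, and the SLO check are assumptions, not Proteus internals): grow the batch only while the oldest waiting query can still finish within its SLO.

```python
# Toy deadline-aware adaptive batching (illustrative, not Proteus's algorithm).
import time
from collections import deque

def adaptive_batch(queue: deque, slo_s: float, exec_time_s, max_batch: int):
    """Pop a batch whose estimated finish time respects the oldest query's SLO.

    Queue items are (arrival_time, query); exec_time_s(n) is a profiled
    estimate of inference latency at batch size n.
    """
    if not queue:
        return []
    deadline = queue[0][0] + slo_s        # SLO of the oldest waiting query
    batch = [queue.popleft()[1]]
    while queue and len(batch) < max_batch:
        # Stop growing once one more query would push past the deadline.
        if time.monotonic() + exec_time_s(len(batch) + 1) > deadline:
            break
        batch.append(queue.popleft()[1])
    return batch
```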