This content will become publicly available on July 22, 2026

Title: Mobile-3DCNN: An Acceleration Framework for Ultra-Real-Time Execution of Large 3D CNNs on Mobile Devices
It is challenging to deploy 3D Convolutional Neural Networks (3D CNNs) on mobile devices, especially when both real-time execution and high inference accuracy are in demand, because the increasingly large model sizes and complex model structures of 3D CNNs usually require tremendous computation and memory resources. Weight pruning has been proposed to mitigate this challenge. However, existing pruning is either incompatible with modern parallel architectures, resulting in long inference latency, or subject to significant accuracy degradation. This paper proposes Mobile-3DCNN, an end-to-end 3D CNN acceleration framework based on pruning/compilation co-design that consists of two parts: a novel, fine-grained structured pruning enhanced by prune/Winograd adaptive selection (which is mobile-hardware-friendly and achieves high pruning accuracy), and a set of compiler optimization and code generation techniques enabled by this pruning (which fully transform the pruning benefit into real performance gains). The evaluation demonstrates that Mobile-3DCNN outperforms state-of-the-art end-to-end DNN acceleration frameworks that support 3D CNN execution on mobile devices, Alibaba Mobile Neural Networks and PyTorch-Mobile, with speedups of up to 34× and minor accuracy degradation, proving that it is possible to execute high-accuracy, large 3D CNNs on mobile devices in real time (or even ultra-real time).
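The framework's central idea is a per-layer choice between executing a layer with its pruned (sparse) weights or with a dense Winograd-transformed convolution, whichever is cheaper. The sketch below illustrates that selection logic only; the cost model, constants, and function names are hypothetical and not taken from the paper.

```python
# Illustrative sketch (not the authors' code): per-layer selection between
# a pruned (sparse) direct 3D convolution and a dense Winograd convolution,
# based on rough FLOP estimates. All cost constants are placeholders.

def direct_flops(layer, sparsity):
    # Direct 3D convolution cost, discounted by the fraction of weights pruned.
    kd, kh, kw = layer["kernel"]
    od, oh, ow = layer["out_shape"]
    dense = 2 * layer["cin"] * layer["cout"] * kd * kh * kw * od * oh * ow
    return dense * (1.0 - sparsity)

def winograd_flops(layer):
    # Winograd F(2x2, 3x3) cuts spatial multiplies by ~2.25x (16 multiplies
    # per 4 outputs instead of 36); the temporal dim stays direct here.
    return direct_flops(layer, 0.0) / 2.25

def choose_kernel(layer, sparsity):
    """Pick the cheaper execution strategy for one layer."""
    if layer["kernel"][1:] != (3, 3):
        return "sparse-direct"          # this Winograd variant only covers 3x3
    if direct_flops(layer, sparsity) < winograd_flops(layer):
        return "sparse-direct"
    return "winograd"

layer = {"kernel": (3, 3, 3), "cin": 64, "cout": 64, "out_shape": (8, 28, 28)}
print(choose_kernel(layer, sparsity=0.7))  # -> "sparse-direct" at 70% pruning
```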
Award ID(s):
2047516 2428108 2403090 2403088
PAR ID:
10625086
Author(s) / Creator(s):
Publisher / Repository:
ACM
Date Published:
Journal Name:
ACM Transactions on Architecture and Code Optimization
ISSN:
1544-3566
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. It is appealing but challenging to achieve real-time deep neural network (DNN) inference on mobile devices, because even powerful modern mobile devices are considered "resource-constrained" when executing large-scale DNNs. This necessitates sparse model inference via weight pruning, i.e., DNN weight sparsity, and it is desirable to design a new DNN weight-sparsity scheme that facilitates real-time inference on mobile devices while preserving high sparse-model accuracy. This paper designs GRIM, a novel mobile inference acceleration framework that is General to both convolutional neural networks (CNNs) and recurrent neural networks (RNNs) and achieves Real-time execution and high accuracy by leveraging fine-grained structured sparse model Inference and compiler optimizations for Mobiles. We start by proposing a new fine-grained structured sparsity scheme through Block-based Column-Row (BCR) pruning. Based on this new fine-grained structured sparsity, our GRIM framework consists of two parts: (a) the compiler optimization and code generation for real-time mobile inference; and (b) the BCR pruning optimizations for determining pruning hyperparameters and performing weight pruning. We compare GRIM with Alibaba MNN, TVM, TensorFlow-Lite, a CSR-based sparse implementation, PatDNN, and ESE (a representative FPGA inference acceleration framework for RNNs), and achieve speedups of up to 14.08×.
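BCR pruning, as the abstract describes it, removes whole columns and rows inside fixed-size blocks of a weight matrix, keeping the remaining nonzeros in a regular layout that parallel hardware can exploit. A minimal NumPy sketch of that idea follows; the block size and keep ratios are arbitrary illustrative values (GRIM tunes such hyperparameters automatically).

```python
import numpy as np

def bcr_prune(W, block=(16, 16), col_keep=0.5, row_keep=0.5):
    """Zero out the lowest-norm columns and rows inside each block of W."""
    W = W.copy()
    bh, bw = block
    for i in range(0, W.shape[0], bh):
        for j in range(0, W.shape[1], bw):
            blk = W[i:i+bh, j:j+bw]  # view: writes go through to W
            # Rank columns of this block by L2 norm; zero the weakest ones.
            col_norms = np.linalg.norm(blk, axis=0)
            n_drop = int(blk.shape[1] * (1 - col_keep))
            blk[:, np.argsort(col_norms)[:n_drop]] = 0.0
            # Then do the same for rows within the block.
            row_norms = np.linalg.norm(blk, axis=1)
            n_drop = int(blk.shape[0] * (1 - row_keep))
            blk[np.argsort(row_norms)[:n_drop], :] = 0.0
    return W

W = np.random.randn(64, 64)
print(f"sparsity: {np.mean(bcr_prune(W) == 0):.2f}")  # 0.75 with 50%/50% keep
```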
  2. There have been many recent attempts to extend the successes of convolutional neural networks (CNNs) from 2-dimensional (2D) image classification to 3-dimensional (3D) video recognition by exploring 3D CNNs. Considering the emerging growth of the mobile and Internet of Things (IoT) markets, it is essential to investigate the deployment of 3D CNNs on edge devices. Previous works have implemented standard 3D CNNs (C3D) on hardware platforms; however, they have not exploited model compression to accelerate inference. This work proposes a hardware-aware pruning approach that fully adapts to the loop-tiling technique of FPGA design and applies it to a novel 3D network called R(2+1)D. Leveraging the powerful Alternating Direction Method of Multipliers (ADMM), the proposed pruning method achieves high accuracy and significant acceleration of computation on FPGAs simultaneously. With layer-wise pruning rates of up to 10× and negligible accuracy loss, the pruned model is implemented on a Xilinx ZCU102 FPGA board, where it achieves a 2.6× speedup compared with the unpruned version, and a 2.3× speedup and 2.3× power-efficiency improvement compared with a state-of-the-art FPGA implementation of C3D.
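The abstract credits ADMM for reaching high pruning rates with little accuracy loss. ADMM-based pruning generally alternates a gradient step on the task loss with a projection onto the sparsity constraint and a dual update; a generic one-matrix sketch is below (the paper's exact formulation, which targets FPGA loop tiling, is not reproduced here).

```python
import numpy as np

def project_sparse(M, keep_ratio):
    """Euclidean projection onto matrices with a fixed fraction of
    nonzeros: keep only the largest-magnitude entries."""
    flat = np.abs(M).ravel()
    k = max(1, int(keep_ratio * flat.size))
    thresh = np.partition(flat, -k)[-k]
    return np.where(np.abs(M) >= thresh, M, 0.0)

def admm_prune_step(W, Z, U, grad_loss, lr=0.01, rho=1e-3, keep_ratio=0.1):
    """One ADMM iteration: primal step on W, projection step on Z,
    dual update on U. grad_loss(W) is the task-loss gradient."""
    W = W - lr * (grad_loss(W) + rho * (W - Z + U))  # primal (gradient) step
    Z = project_sparse(W + U, keep_ratio)            # projection onto sparsity
    U = U + W - Z                                    # dual variable update
    return W, Z, U

# Toy usage: pull a weight matrix toward a target under 10x pruning.
target = np.random.randn(32, 32)
W = np.random.randn(32, 32); Z = W.copy(); U = np.zeros_like(W)
for _ in range(500):
    W, Z, U = admm_prune_step(W, Z, U, grad_loss=lambda w: w - target)
W = project_sparse(W, 0.1)             # final hard prune
print(f"kept: {np.mean(W != 0):.2f}")  # ~0.10
```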
  3. Weight pruning is an effective model compression technique for achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to accuracy degradation, difficulty in leveraging hardware acceleration, and/or restriction to certain types of DNN layers. In this article, we propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations that are applicable to any type of DNN layer while achieving high accuracy and hardware inference performance. With the flexibility of applying different pruning schemes to different layers enabled by our compiler optimizations, we further probe into the new problem of determining the best-suited pruning scheme, considering the different acceleration and accuracy performance of the various pruning schemes. Two pruning scheme mapping methods, one search-based and the other rule-based, are proposed to automatically derive the best-suited pruning regularity and block size for each layer of any given DNN. Experimental results demonstrate that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework with up to 2.48× and 1.73× DNN inference acceleration on the CIFAR-10 and ImageNet datasets without accuracy loss.
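The rule-based mapping method presumably encodes heuristics linking layer shape to a pruning scheme and block size. The mapper below is purely illustrative: the rules, thresholds, and scheme names are hypothetical stand-ins, not the paper's actual policy.

```python
# Illustrative rule-based mapper (not the paper's actual rules): assign a
# pruning scheme and block size per layer from simple layer-shape heuristics.

def map_pruning_scheme(layer_type, cout, kernel_size):
    """Return (scheme, block_size) for one layer; thresholds are hypothetical."""
    if layer_type == "fc":
        # Fully connected layers tolerate coarse block pruning well.
        return ("block", (16, 16))
    if kernel_size == 1:
        # 1x1 convs behave like matrix multiplies: prune whole filters.
        return ("filter", None)
    if cout >= 256:
        # Wide conv layers: fine-grained blocks preserve accuracy better.
        return ("block", (8, 4))
    return ("pattern", None)  # small 3x3 layers: pattern-based pruning

for layer in [("conv", 64, 3), ("conv", 512, 1), ("fc", 1000, None)]:
    print(layer[0], "->", map_pruning_scheme(*layer))
```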
  4. The invention of the Transformer model structure boosted the performance of Neural Machine Translation (NMT) tasks to an unprecedented level. Much previous work has been done to make the Transformer model more execution-friendly on resource-constrained platforms. This research can be categorized into three key fields: Model Pruning, Transfer Learning, and Efficient Transformer Variants. The family of model pruning methods is popular for its simplicity in practice and promising compression rates, and has achieved great success in the field of convolutional neural networks (CNNs) for many vision tasks. Nonetheless, previous Transformer pruning works did not perform a thorough model analysis and evaluation of each Transformer component on off-the-shelf mobile devices. In this work, we analyze and prune Transformer models at the line-wise granularity and also implement our pruning method on real mobile platforms. We explore the properties of all Transformer components as well as their sparsity features, which are leveraged to guide Transformer model pruning. We name our whole Transformer analysis and pruning pipeline TPrune. In TPrune, we first propose Block-wise Structured Sparsity Learning (BSSL) to analyze Transformer model properties. Then, based on the characteristics derived from BSSL, we apply Structured Hoyer Square (SHS) to derive the final pruned models. Compared with state-of-the-art Transformer pruning methods, TPrune is able to achieve a higher model compression rate with less performance degradation. Experimental results show that our pruned models achieve 1.16×–1.92× speedup on mobile devices with 0%–8% BLEU score degradation compared with the original Transformer model.
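The (unstructured) Hoyer-Square regularizer is the scale-invariant ratio (sum of |w_i|)^2 / (sum of w_i^2); a structured variant replaces individual weights with group (block) norms so that whole blocks are driven toward zero. A PyTorch sketch of such a group-level Hoyer Square follows, with an arbitrary block shape; TPrune's exact SHS formulation may differ.

```python
import torch

def structured_hoyer_square(W, block=(4, 4), eps=1e-8):
    """Group Hoyer-Square regularizer over blocks of W: the ratio
    (sum of block L2 norms)^2 / (sum of squared block L2 norms).
    Minimizing it pushes entire blocks toward zero, scale-invariantly."""
    bh, bw = block
    h, w = W.shape
    # Reshape into a grid of blocks and take one L2 norm per block.
    blocks = W.reshape(h // bh, bh, w // bw, bw).permute(0, 2, 1, 3)
    norms = blocks.reshape(-1, bh * bw).norm(dim=1)
    return norms.sum() ** 2 / (norms.pow(2).sum() + eps)

# Usage as a sparsity-inducing term added to the task loss:
W = torch.randn(32, 32, requires_grad=True)
loss = structured_hoyer_square(W)  # + task_loss in real training
loss.backward()
print(loss.item())
```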
  5. Convolutional neural networks (CNNs) are becoming increasingly deep, wide, and non-linear because of the growing demand for prediction accuracy and analysis quality. Wide and deep CNNs, however, require a large amount of computing resources and processing time. Many previous works have studied model pruning to improve inference performance, but little work has been done on effectively reducing training cost. In this paper, we propose ClickTrain, an efficient and accurate end-to-end training and pruning framework for CNNs. Different from existing pruning-during-training work, ClickTrain provides higher model accuracy and compression ratio via fine-grained, architecture-preserving pruning. By leveraging pattern-based pruning with our proposed accurate weight-importance estimation, dynamic pattern generation and selection, and compiler-assisted computation optimizations, ClickTrain generates highly accurate and fast pruned CNN models for direct deployment without any extra time overhead compared with baseline training. ClickTrain also reduces the end-to-end time cost of the pruning-after-training method by up to 2.3× with comparable accuracy and compression ratio. Moreover, compared with the state-of-the-art pruning-during-training approach, ClickTrain provides significant improvements in both accuracy and compression ratio on the tested CNN models and datasets under similar limited training time.
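Pattern-based pruning constrains each small convolution kernel to one of a few fixed sparsity patterns, chosen per kernel to retain the most weight magnitude. The sketch below uses four arbitrary 4-entry patterns for 3x3 kernels; ClickTrain generates and selects its patterns dynamically during training, which this toy version does not do.

```python
import numpy as np

# Illustrative pattern set (arbitrary examples, not ClickTrain's patterns):
# each 3x3 kernel keeps only the entries of whichever 4-entry pattern
# preserves the most weight magnitude.
PATTERNS = np.array([
    [[1, 1, 0], [1, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 0], [1, 1, 0]],
    [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
], dtype=np.float64)

def pattern_prune(kernels):
    """kernels: (n, 3, 3) array. Mask each kernel with its best-fit pattern."""
    out = np.empty_like(kernels)
    for i, k in enumerate(kernels):
        # Score each pattern by the weight magnitude it would retain.
        scores = [(np.abs(k) * p).sum() for p in PATTERNS]
        out[i] = k * PATTERNS[np.argmax(scores)]
    return out

kernels = np.random.randn(8, 3, 3)
pruned = pattern_prune(kernels)
print(f"per-kernel sparsity: {np.mean(pruned == 0):.2f}")  # 5/9 ~ 0.56
```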