A Hardware Prototype Targeting Distributed Deep Learning for On-device Inference

Farcas, Allen-Jasmin; Li, Guihong; Bhardwaj, Kartikeya; Marculescu, Radu

doi:10.1109/CVPRW50498.2020.00207

Citation Details

A Hardware Prototype Targeting Distributed Deep Learning for On-device Inference

This paper presents a hardware prototype and a framework for a new communication-aware model compression for distributed on-device inference. Our approach relies on Knowledge Distillation (KD) and achieves orders of magnitude compression ratios on a large pre-trained teacher model. The distributed hardware prototype consists of multiple student models deployed on Raspberry-Pi 3 nodes that run Wide ResNet and VGG models on the CIFAR10 dataset for real-time image classification. We observe significant reductions in memory footprint (50×), energy consumption (14×), latency (33×) and an increase in performance (12×) without any significant accuracy loss compared to the initial teacher model. This is an important step towards deploying deep learning models for IoT applications. more »

Award ID(s):: 2007284

PAR ID:: 10298334

Author(s) / Creator(s):: Farcas, Allen-Jasmin; Li, Guihong; Bhardwaj, Kartikeya; Marculescu, Radu

Date Published:: 2020-06-01

Journal Name:: Computer Vision and Pattern Recognition

Page Range / eLocation ID:: 1600 to 1601

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.1109/CVPRW50498.2020.00207

More Like this