Non-uniform DNN Structured Subnets Sampling for Dynamic Inference

Yang, Li; He, Zhezhi; Cao, Yu; Fan, Deliang

doi:10.1109/DAC18072.2020.9218736

Citation Details

With the success of Deep Neural Networks (DNNs), many recent works have focused on developing hardware accelerators for power- and resource-limited systems via model compression techniques such as quantization, pruning, and low-rank approximation. However, almost all existing compressed DNNs are fixed after deployment and lack a run-time adaptive structure that can respond to dynamic hardware resource allocation, power budget, throughput requirements, and workload. As a countermeasure, we propose a DNN sub-network sampling method that generates subnets via non-uniform channel selection, yielding a run-time dynamic DNN structure. Users can thus trade off power, speed, computing load, and accuracy on the fly after deployment, depending on the dynamic requirements or specifications of the given system. We verify the proposed model on both the CIFAR-10 and ImageNet datasets using ResNets, where it outperforms the same subnets trained individually as well as other related works. On ImageNet with ResNet18, our method achieves latency trade-offs among 13.4, 24.6, 41.3, and 62.1 ms on GPU (batch size 128) and 30.5, 38.7, 51, and 65.4 ms on CPU.

Award ID(s): 2005209 1931871

PAR ID: 10295337

Author(s) / Creator(s): Yang, Li; He, Zhezhi; Cao, Yu; Fan, Deliang

Date Published: 2020-07-20

Journal Name: 2020 57th ACM/IEEE Design Automation Conference (DAC)

Page Range / eLocation ID: 1 to 6

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1109/DAC18072.2020.9218736
