Latent Weight-Based Pruning for Small Binary Neural Networks

Chen, Tianen; Anderson, Noah; Kim, Younghyun

doi:10.1145/3566097.3567873

Binary neural networks (BNNs) substitute complex arithmetic operations with simple bit-wise operations. The binarized weights and activations in BNNs can drastically reduce memory requirements and energy consumption, making them attractive for edge ML applications with limited resources. However, the severe memory capacity and energy constraints of low-power edge devices call for further reduction of BNN models beyond binarization. Weight pruning is a proven solution for reducing the size of many neural network (NN) models, but the binary nature of BNN weights makes it difficult to identify insignificant weights to remove. In this paper, we present a pruning method based on latent weights with layer-level pruning sensitivity analysis, which reduces the over-parameterization of BNNs, allowing for accuracy gains while drastically reducing the model size. Our method advocates for a heuristic that distinguishes weights by their latent weights, a real-valued vector used to compute the pseudogradient during backpropagation. We test it using three different convolutional NNs on the MNIST, CIFAR-10, and Imagenette datasets; the results indicate a 33% to 46% reduction in operation count with no accuracy loss, improving upon previous work in accuracy, model size, and total operation count.
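
The abstract does not spell out the pruning rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a weight's importance is the magnitude of its latent weight, that each layer gets its own pruning ratio (standing in for the outcome of a layer-level sensitivity analysis), and that a pruned weight is represented as 0. The function names, layer names, latent values, and ratios are all hypothetical.

```python
def binarize(latent):
    """Sign binarization: latent weight -> +1 or -1 (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in latent]


def prune_mask(latent, ratio):
    """Return a 0/1 keep-mask that drops the `ratio` fraction of weights
    with the smallest |latent| value (assumed importance criterion)."""
    n_prune = int(len(latent) * ratio)
    # Indices sorted by ascending latent magnitude; the first n_prune are dropped.
    order = sorted(range(len(latent)), key=lambda i: abs(latent[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(latent))]


def pruned_binary_weights(latent, ratio):
    """Binarize the latent weights, then zero out the pruned positions."""
    mask = prune_mask(latent, ratio)
    return [b * m for b, m in zip(binarize(latent), mask)]


# Hypothetical latent weights and per-layer pruning ratios (illustrative only).
layers = {
    "conv1": ([0.9, -0.05, 0.4, -0.7, 0.01, 0.3], 0.0),   # treated as sensitive: keep all
    "conv2": ([0.02, -0.8, 0.1, 0.6, -0.03, -0.5], 0.5),  # treated as tolerant: prune half
}
result = {name: pruned_binary_weights(w, r) for name, (w, r) in layers.items()}
print(result)
```

In this sketch the binarized weights alone (+1 or -1) carry no ranking information, which is why the real-valued latent weights are used to decide what to remove. Every pruned position is one fewer bit-wise operation at inference time.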

Award ID(s): 1845469

PAR ID: 10410120

Author(s) / Creator(s): Chen, Tianen; Anderson, Noah; Kim, Younghyun

Date Published: 2023-01-16

Journal Name: ASPDAC '23: Proceedings of the 28th Asia and South Pacific Design Automation Conference

Page Range / eLocation ID: 751-756

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3566097.3567873
