A Greedy Algorithm for Quantizing Neural Networks

Lybrand, Eric; Saab, Rayan

Citation Details

We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks that is general enough to handle both multi-layer perceptrons and convolutional neural networks. Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required. Specifically, we quantize each neuron, or hidden unit, using a greedy path-following algorithm. This simple algorithm is equivalent to running a dynamical system, which we prove is stable for quantizing a single-layer neural network (or, alternatively, for quantizing the first layer of a multi-layer network) when the training data are Gaussian. We show that under these assumptions, the quantization error decays with the width of the layer, i.e., its level of over-parametrization. We provide numerical experiments, on multi-layer networks, to illustrate the performance of our methods on MNIST and CIFAR10 data, as well as for quantizing the VGG16 network using ImageNet data.
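As a rough illustration of the greedy path-following idea described in the abstract, the sketch below quantizes a single neuron one weight at a time, choosing each quantized weight so that the running error on the data stays as small as possible. It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the ternary alphabet {-1, 0, 1}, the function name greedy_quantize_neuron, and the exact form of the update are assumptions made for illustration.

```python
import random


def greedy_quantize_neuron(w, X, alphabet=(-1.0, 0.0, 1.0)):
    """Greedily quantize one neuron's weights (illustrative sketch only).

    w        : list of N real weights for a single neuron
    X        : list of m data samples, each a list of N features
    alphabet : allowed quantized values (ternary alphabet is an assumption)

    Returns (q, u): the quantized weights and the final residual
    u = X w - X q, which is the state of the underlying dynamical system.
    """
    m, n = len(X), len(w)
    u = [0.0] * m  # running residual, starts at zero
    q = []
    for t in range(n):
        col = [X[i][t] for i in range(m)]  # t-th feature across all samples
        best_p, best_err = None, float("inf")
        for p in alphabet:
            # Squared norm of the residual if weight t were quantized to p.
            err = sum((u[i] + (w[t] - p) * col[i]) ** 2 for i in range(m))
            if err < best_err:
                best_p, best_err = p, err
        q.append(best_p)
        # State update: carry the accumulated error forward to the next step.
        u = [u[i] + (w[t] - best_p) * col[i] for i in range(m)]
    return q, u


if __name__ == "__main__":
    random.seed(0)
    m, n = 20, 200  # hypothetical sizes: 20 samples, layer width 200
    X = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    w = [random.uniform(-1.0, 1.0) for _ in range(n)]
    q, u = greedy_quantize_neuron(w, X)
    # Relative error of the quantized neuron's pre-activations on the data.
    Xw = [sum(X[i][t] * w[t] for t in range(n)) for i in range(m)]
    rel = (sum(v * v for v in u) / sum(v * v for v in Xw)) ** 0.5
    print("relative error:", rel)
```

Because each step only compares a handful of candidate values and updates one residual vector, the cost per neuron is linear in the number of weights times the number of samples, and no re-training is involved. Increasing n in the usage example is one way to observe informally how the error behaves as the layer gets wider.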

Award ID(s): 2012546

PAR ID: 10345701

Author(s) / Creator(s): Lybrand, Eric; Saab, Rayan

Date Published: 2021-01-01

Journal Name: Journal of Machine Learning Research

Volume: 22

ISSN: 1533-7928

Page Range / eLocation ID: 1 - 38

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Journal Article: The DOI is not currently available.
