Convolutional neural networks (CNNs) have recently achieved great success, with superior accuracy in various imaging applications such as classification, object detection, and segmentation. However, a highly accurate CNN model requires millions of trainable parameters, and even a slight improvement in performance typically demands significantly more of them, obtained by adding layers and/or increasing the number of filters per layer. In practice, many of these weight parameters turn out to be redundant, so the original, dense model can be replaced by a compressed version obtained by imposing inter- and intra-group sparsity on the layer weights during training. In this paper, we propose a nonconvex family of sparse group lasso that blends nonconvex regularization (e.g., transformed ℓ1, ℓ1 − ℓ2, and ℓ0), which induces sparsity in the individual weights, with ℓ2,1 regularization on the output channels of a layer.
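In schematic form (the notation here is our own shorthand for illustration, not quoted verbatim from the paper), such a penalty on a layer's weight tensor W couples an element-wise sparsity term with a channel-wise group term:

    R(W) = λ · P(W) + μ Σ_g ‖W_g‖2,   with P ∈ { ℓ0, transformed ℓ1, ℓ1 − ℓ2 },

where each group W_g collects the weights of one output channel, so that Σ_g ‖W_g‖2 is the ℓ2,1 penalty and driving ‖W_g‖2 to zero prunes the whole channel. For instance, the transformed ℓ1 penalty acts element-wise as TL1_a(x) = (a + 1)|x| / (a + |x|) for a parameter a > 0.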
We apply variable splitting to the proposed regularization to develop an algorithm whose iterations consist of two steps: gradient descent and thresholding.
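As a concrete illustration of this two-step iteration, here is a minimal NumPy sketch of one layer's update. It is a sketch under assumptions, not the paper's implementation: the paper's thresholding operator depends on the chosen nonconvex penalty, so we substitute the classical (convex) sparse group lasso proximal map, i.e., an ℓ1 soft threshold on individual weights followed by ℓ2 shrinkage of each output channel; all names and parameter values below are hypothetical.

import numpy as np

def soft_threshold(w, t):
    """Proximal map of t * ||w||_1: shrink each weight toward zero."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def group_threshold(w, t):
    """Proximal map of t * (sum of row-wise l2 norms), i.e., an l2,1 penalty."""
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return w * scale

def sparse_group_step(w, grad, lr, lam, mu):
    """One iteration: gradient descent on the loss, then thresholding."""
    w = w - lr * grad                   # step 1: gradient descent
    w = soft_threshold(w, lr * lam)     # step 2a: element-wise sparsity
    return group_threshold(w, lr * mu)  # step 2b: channel-wise sparsity

# Toy problem: fit a target layer whose first four output channels are
# weak; the iteration should prune them entirely and sparsify the rest.
rng = np.random.default_rng(0)
target = rng.normal(size=(8, 16))
target[:4] *= 0.1   # weak channels
target[4:] *= 2.0   # strong channels
w = rng.normal(size=(8, 16))
for _ in range(100):
    grad = w - target  # gradient of the toy loss 0.5 * ||w - target||^2
    w = sparse_group_step(w, grad, lr=0.1, lam=0.5, mu=1.0)
print("zero-weight fraction:", float(np.mean(w == 0.0)))
print("pruned channels:", int(np.sum(np.all(w == 0.0, axis=1))))

In actual network training the gradient would come from backpropagation through the data loss; the toy quadratic above merely shows that such an iteration drives individual weights, and entire output channels, exactly to zero.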
Numerical experiments on various CNN architectures demonstrate the effectiveness of the nonconvex family of sparse group lasso, achieving network sparsification with test accuracy on par with the current state of the art.