Convergence of a Relaxed Variable Splitting Method for Learning Sparse Neural Networks via L1, L0, and transformed-L1 Penalties

Dinh, Thu; Xin, Jack

Sparsification of neural networks is an effective complexity-reduction method for improving efficiency and generalizability. We consider the problem of learning a one-hidden-layer convolutional neural network with ReLU activation via gradient descent under sparsity-promoting penalties. It is known that when the input data is Gaussian distributed, no-overlap networks (without penalties) in regression problems with ground truth can be learned in polynomial time with high probability. We propose a relaxed variable splitting method that integrates thresholding and gradient descent to overcome the non-smoothness of the loss function. Sparsity in the network weights is realized during the optimization (training) process. We prove that under L1, L0, and transformed-L1 penalties, no-overlap networks can be learned with high probability, and that the iterative weights converge to a global limit, which is a transformation of the true weight under a novel thresholding operation. Numerical experiments confirm the theoretical findings and compare the accuracy and sparsity trade-off among the penalties.
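The abstract describes alternating a thresholding step with a gradient descent step. The sketch below illustrates that general idea in NumPy and is not the authors' implementation. It assumes an auxiliary variable `u` coupled to the weight `w` through a quadratic term `(beta / 2) * ||w - u||^2`, and it uses a simple quadratic stand-in loss rather than the network loss studied in the paper. The names `rvsm`, `lam`, `beta`, `lr`, and the toy data are all hypothetical. Only the L1 (soft) and L0 (hard) thresholding maps are shown; the transformed-L1 map is omitted.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal map of t * ||.||_1 (L1 penalty): shrink each entry toward 0 by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def hard_threshold(w, t):
    """Proximal map of t * ||.||_0 (L0 penalty): keep entries with |w_i| > sqrt(2 t)."""
    return np.where(np.abs(w) > np.sqrt(2.0 * t), w, 0.0)

def rvsm(grad_f, w0, lam, beta, lr, n_iter, threshold=soft_threshold):
    """Alternate a thresholding step on u with a gradient step on w.

    Assumed objective (illustrative): f(w) + lam * P(u) + (beta / 2) * ||w - u||^2.
    """
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(n_iter):
        u = threshold(w, lam / beta)               # u-step: closed-form minimizer in u
        w = w - lr * (grad_f(w) + beta * (w - u))  # w-step: gradient descent in w
    return w, threshold(w, lam / beta)

# Toy stand-in loss f(w) = 0.5 * ||w - w_true||^2 (an assumption for illustration only).
w_true = np.array([1.0, 0.0, 0.0, -2.0, 0.0])
grad_f = lambda w: w - w_true

w, u = rvsm(grad_f, w0=np.full(5, 0.05), lam=0.1, beta=1.0, lr=0.5, n_iter=100)
print("dense iterate w:", w)
print("sparse iterate u:", u)
```

In this toy setting the sparse iterate `u` recovers the support of `w_true`, while its nonzero entries are shrunk by the thresholding, which loosely mirrors the abstract's statement that the limit is a thresholded transformation of the true weight.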

Award ID(s): 1854434; 1632935

PAR ID: 10158867

Author(s) / Creator(s): Dinh, Thu; Xin, Jack

Date Published: 2020-09-01

Journal Name: Intelligent Systems Conference (IntelliSys)

Format(s): Medium: X

Sponsoring Org: National Science Foundation

The DOI is not currently available.