-
We study inexact fixed-point proximity algorithms for solving a class of sparse regularization problems involving the ℓ0 norm. Specifically, the ℓ0 model has an objective function that is the sum of a convex fidelity term and a Moreau envelope of the ℓ0 norm regularization term. Such an ℓ0 model is non-convex. Existing exact algorithms for solving these problems require closed-form formulas for the proximity operators of the convex functions involved in the objective function. When such formulas are not available, numerical computation of the proximity operator becomes inevitable, which leads to inexact iteration algorithms. We investigate in this paper how the numerical error at every step of the iteration should be controlled to ensure global convergence of the inexact algorithms. We establish a theoretical result guaranteeing that the sequence generated by the proposed inexact algorithm converges to a local minimizer of the optimization problem. We implement the proposed algorithms for three applications of practical importance in machine learning and image science: regression, classification, and image deblurring. The numerical results demonstrate the convergence of the proposed algorithm and confirm that local minimizers of the ℓ0 models found by the proposed inexact algorithm outperform global minimizers of the corresponding ℓ1 models in terms of approximation accuracy and sparsity of the solutions.
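The following is a minimal sketch, not the authors' fixed-point proximity scheme: it alternates the proximity operator of the ℓ0 term (componentwise hard thresholding) with an approximate proximity step on a least-squares fidelity term, computed by a few inner gradient iterations whose count grows with the outer iteration so that the per-step error shrinks. The least-squares fidelity, the parameter values, and the update order are illustrative assumptions.

```python
# Sketch only: an inexact proximal iteration for a model of the form
#   min_x  F(x) + (l0-type regularization),
# with F(x) = 0.5*||Ax - b||^2 as a stand-in convex fidelity term.
import numpy as np

def hard_threshold(z, lam, beta):
    # proximity operator of (lam/beta)*||.||_0, applied componentwise:
    # keep entries with |z_i| > sqrt(2*lam/beta), zero out the rest
    out = z.copy()
    out[np.abs(z) < np.sqrt(2.0 * lam / beta)] = 0.0
    return out

def inexact_prox_fidelity(A, b, z, beta, n_inner):
    # approximate argmin_x 0.5*||Ax - b||^2 + (beta/2)*||x - z||^2
    # by n_inner gradient steps instead of a closed-form solve
    x = z.copy()
    L = np.linalg.norm(A, 2) ** 2 + beta   # Lipschitz constant of the gradient
    for _ in range(n_inner):
        grad = A.T @ (A @ x - b) + beta * (x - z)
        x -= grad / L
    return x

def inexact_l0_solver(A, b, lam=0.1, beta=1.0, n_outer=200):
    x = np.zeros(A.shape[1])
    for k in range(n_outer):
        y = inexact_prox_fidelity(A, b, x, beta, n_inner=k + 1)  # inexact step, tightened over time
        x = hard_threshold(y, lam, beta)                         # exact l0 proximity step
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100))
    x_true = np.zeros(100)
    x_true[:5] = 1.0
    b = A @ x_true
    x_hat = inexact_l0_solver(A, b)
    print("nonzeros in the computed solution:", np.count_nonzero(np.abs(x_hat) > 1e-6))
```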
-
Regularized learning problems in Banach spaces, which often minimize the sum of a data fidelity term in one Banach norm and a regularization term in another Banach norm, are challenging to solve. We construct a direct sum space based on the Banach spaces of the fidelity term and the regularization term, and recast the objective function as the norm of a quotient space of the direct sum space. We then express the original regularized problem as an optimization problem in the dual space of the direct sum space: finding the maximum of a linear function on a convex polytope, which may be solved by linear programming. A solution of the original problem is then obtained from a solution of the dual problem by using related extremal properties of norming functionals. Numerical experiments demonstrate that the proposed duality approach is effective for solving the regularized learning problems.
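As a hedged illustration of the linear-programming connection (not the paper's quotient-space duality construction), the sketch below treats the special case in which both norms are polyhedral: a sup-norm fidelity term plus an ℓ1 regularization term. The resulting problem is itself a linear program over a convex polytope; the matrix A, data b, and parameter lam are placeholders.

```python
# min_x ||Ax - b||_inf + lam * ||x||_1, rewritten as a linear program.
import numpy as np
from scipy.optimize import linprog

def solve_linf_l1(A, b, lam):
    m, n = A.shape
    # variables: x (free), s (bounds on |x|), t (bound on the sup-norm residual)
    c = np.concatenate([np.zeros(n), lam * np.ones(n), [1.0]])
    Z = np.zeros((m, n))
    I = np.eye(n)
    A_ub = np.block([
        [ A,  Z, -np.ones((m, 1))],     #  Ax - b <= t
        [-A,  Z, -np.ones((m, 1))],     # -(Ax - b) <= t
        [ I, -I,  np.zeros((n, 1))],    #  x <= s
        [-I, -I,  np.zeros((n, 1))],    # -x <= s
    ])
    b_ub = np.concatenate([b, -b, np.zeros(2 * n)])
    bounds = [(None, None)] * n + [(0, None)] * (n + 1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:n]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((30, 60))
    b = A @ np.concatenate([np.ones(3), np.zeros(57)])
    x = solve_linf_l1(A, b, lam=0.05)
    print("nonzero entries:", np.count_nonzero(np.abs(x) > 1e-6))
```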
-
We studied the use of deep neural networks (DNNs) in the numerical solution of the oscillatory Fredholm integral equation of the second kind. It is known that the solution of the equation exhibits certain oscillatory behavior due to the oscillation of the kernel. It was pointed out recently that standard DNNs favor low-frequency functions, and as a result they often produce poor approximations of functions containing high-frequency components. We addressed this issue in this study. We first developed a numerical method for solving the equation with a DNN as the approximate solution, designing a numerical quadrature tailored to computing oscillatory integrals that involve DNNs. We proved that the error of the DNN approximate solution of the equation is bounded by the training loss and the quadrature error. We then proposed a multigrade deep learning (MGDL) model to overcome the spectral bias issue of neural networks. Numerical experiments demonstrate that the MGDL model is effective in extracting multiscale information of the oscillatory solution and in overcoming the spectral bias issue from which a standard DNN model suffers.
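A small sketch of the quadrature ingredient, under assumptions not taken from the paper: the kernel is taken to be exp(iκ|x−t|) on [0,1], the candidate solution u is any callable (a trained DNN would slot in here; a plain cosine stands in), and the rule is generic panel-wise Gauss-Legendre whose resolution scales with κ, rather than the tailored quadrature constructed by the authors.

```python
# Applying the integral operator of a second-kind Fredholm equation
#   u(x) - (Ku)(x) = f(x),  (Ku)(x) = \int_0^1 exp(i*kappa*|x - t|) u(t) dt,
# to a candidate solution u with a kappa-aware composite quadrature.
import numpy as np

def oscillatory_quadrature(kappa, points_per_panel=10):
    # the number of panels grows with kappa so each wavelength is resolved
    n_panels = max(4, int(np.ceil(kappa / (2 * np.pi))) * 4)
    nodes, weights = np.polynomial.legendre.leggauss(points_per_panel)
    t, w = [], []
    edges = np.linspace(0.0, 1.0, n_panels + 1)
    for a, b in zip(edges[:-1], edges[1:]):
        t.append(0.5 * (b - a) * nodes + 0.5 * (a + b))   # map [-1,1] to [a,b]
        w.append(0.5 * (b - a) * weights)
    return np.concatenate(t), np.concatenate(w)

def apply_operator(u, x, kappa):
    # evaluate (Ku)(x) at the points x by quadrature
    t, w = oscillatory_quadrature(kappa)
    kernel = np.exp(1j * kappa * np.abs(x[:, None] - t[None, :]))
    return kernel @ (w * u(t))

if __name__ == "__main__":
    kappa = 200.0
    u = lambda t: np.cos(3 * np.pi * t)     # stand-in for a DNN approximate solution
    x = np.linspace(0.0, 1.0, 5)
    print(u(x) - apply_operator(u, x, kappa))   # values of u - Ku at the test points
```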
-
Sparsity of a learning solution is a desirable feature in machine learning. Certain reproducing kernel Banach spaces (RKBSs) are appropriate hypothesis spaces for sparse learning methods. The goal of this paper is to understand what kind of RKBSs can promote sparsity of learning solutions. We consider two typical learning models in an RKBS: the minimum norm interpolation (MNI) problem and the regularization problem. We first establish an explicit representer theorem for solutions of these problems, which represents the extreme points of the solution set as linear combinations of the extreme points of the subdifferential set of the norm function, which is data-dependent. We then propose sufficient conditions on the RKBS under which the explicit representation of the solutions can be transformed into a sparse kernel representation having fewer terms than the number of observed data. Under the proposed sufficient conditions, we investigate the role of the regularization parameter in the sparsity of the regularized solutions. We further show that two specific RKBSs, the sequence space ℓ1(N) and the measure space, admit sparse representer theorems for both the MNI and regularization models.
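To make the sparse-representation point concrete in the simplest setting, here is a hedged sketch of minimum norm interpolation over a finite kernel dictionary in the ℓ1 norm, recast as a linear program via the standard positive/negative split (not the paper's representer-theorem machinery); the Laplacian kernel, the 6 interpolation points, and the 40 candidate centers are made-up illustration data.

```python
# min ||c||_1  subject to  K c = y: a solution can be taken with no more
# nonzero coefficients than there are interpolation conditions.
import numpy as np
from scipy.optimize import linprog

def l1_minimum_norm_interpolation(K, y):
    m, n = K.shape
    # split c = p - q with p, q >= 0; minimize 1^T p + 1^T q subject to K(p - q) = y
    c_obj = np.ones(2 * n)
    A_eq = np.hstack([K, -K])
    res = linprog(c_obj, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
    pq = res.x
    return pq[:n] - pq[n:]

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    x_obs = np.linspace(0.0, 1.0, 6)        # 6 observed points
    centers = np.linspace(0.0, 1.0, 40)     # 40 candidate kernel centers
    K = np.exp(-np.abs(x_obs[:, None] - centers[None, :]) / 0.2)  # Laplacian kernel
    y = np.sin(2 * np.pi * x_obs)
    c = l1_minimum_norm_interpolation(K, y)
    print("nonzero coefficients:", np.count_nonzero(np.abs(c) > 1e-8), "out of", c.size)
```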
-
We explore the convergence of deep neural networks with the popular ReLU activation function as the depth of the networks tends to infinity. To this end, we introduce the notions of activation domains and activation matrices of a ReLU network. By replacing applications of the ReLU activation function with multiplications by activation matrices on activation domains, we obtain an explicit expression of the ReLU network. We then identify the convergence of the ReLU networks with the convergence of a class of infinite products of matrices, and study sufficient and necessary conditions for convergence of these infinite products. As a result, we establish necessary conditions for ReLU networks to converge: the sequence of weight matrices must converge to the identity matrix and the sequence of bias vectors must converge to zero as the depth of the ReLU networks increases to infinity. Moreover, we obtain sufficient conditions, in terms of the weight matrices and bias vectors at the hidden layers, for pointwise convergence of deep ReLU networks. These results provide mathematical insight into the convergence of deep neural networks. Experiments are conducted to verify the results and to illustrate their potential usefulness in the initialization of deep neural networks.
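The activation-matrix viewpoint admits a direct numerical check. The sketch below (random small widths, weights near the identity and small biases, chosen only to echo the necessary conditions stated above) verifies that replacing each ReLU application by multiplication with a 0/1 diagonal activation matrix reproduces the ordinary forward pass.

```python
# For a fixed input, each ReLU layer acts as multiplication by a diagonal
# activation matrix, so on that input's activation domain the network is an
# explicit composition of affine maps.
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(weights, biases, x):
    for W, b in zip(weights, biases):
        x = relu(W @ x + b)
    return x

def forward_with_activation_matrices(weights, biases, x):
    out = x
    for W, b in zip(weights, biases):
        pre = W @ out + b
        D = np.diag((pre > 0).astype(float))   # activation matrix of this layer
        out = D @ pre                          # equals relu(pre) on this activation domain
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    depth, width = 6, 5
    # weights close to the identity and small biases, in the spirit of the
    # necessary conditions for convergence described above
    weights = [np.eye(width) + 0.1 * rng.standard_normal((width, width)) for _ in range(depth)]
    biases = [0.01 * rng.standard_normal(width) for _ in range(depth)]
    x = rng.standard_normal(width)
    print(np.allclose(forward(weights, biases, x),
                      forward_with_activation_matrices(weights, biases, x)))
```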
-
We consider a regularization problem whose objective function consists of a convex fidelity term and a regularization term determined by the ℓ1 norm composed with a linear transform. Empirical results show that regularization with the ℓ1 norm can promote sparsity of a regularized solution. The goal of this paper is to understand theoretically the effect of the regularization parameter on the sparsity of the regularized solutions. We establish a characterization of the sparsity, under the transform matrix, of the solution. When the objective function is block-separable, or an error bound of the regularized solution to a known function is available, the resulting characterization can be taken as a regularization parameter choice strategy with which the regularization problem has a solution with a sparsity of a certain level. When the objective function is not block-separable, we propose an iterative algorithm that simultaneously determines the regularization parameter and its corresponding solution with a prescribed sparsity level. Moreover, we study choices of the regularization parameter that allow the regularization term to alleviate ill-posedness and promote sparsity of the resulting regularized solution. Numerical experiments demonstrate that the proposed algorithm is effective and efficient, and that the choices of the regularization parameters can balance the sparsity of the regularized solution against its approximation of the minimizer of the fidelity function.
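The sketch below is not the paper's algorithm: for the special case where the linear transform is the identity, it pairs a generic ISTA solver with a bisection search over the regularization parameter until the regularized solution reaches a prescribed number of nonzeros, conveying only the general idea of tuning the parameter to a target sparsity level. A, b, and the target sparsity are placeholder data.

```python
# Adjust lam so that the solution of  min_x 0.5*||Ax - b||^2 + lam*||x||_1
# has (approximately) a prescribed number of nonzero entries.
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(A, b, lam, n_iter=500):
    # standard iterative soft-thresholding for the l1-regularized least squares problem
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
    return x

def lambda_for_sparsity(A, b, target_nnz, lo=1e-6, hi=None, n_bisect=40):
    # for lam >= max|A^T b| the solution is zero, so that bounds the search interval
    hi = hi if hi is not None else np.max(np.abs(A.T @ b))
    for _ in range(n_bisect):
        lam = 0.5 * (lo + hi)
        x = ista(A, b, lam)
        nnz = np.count_nonzero(np.abs(x) > 1e-8)
        if nnz > target_nnz:
            lo = lam     # too many nonzeros: increase regularization
        else:
            hi = lam     # at or below the target: decrease regularization
    return lam, x

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    A = rng.standard_normal((50, 120))
    x_true = np.zeros(120)
    x_true[rng.choice(120, 6, replace=False)] = rng.standard_normal(6)
    b = A @ x_true + 0.01 * rng.standard_normal(50)
    lam, x = lambda_for_sparsity(A, b, target_nnz=6)
    print("lambda:", lam, "nonzeros:", np.count_nonzero(np.abs(x) > 1e-8))
```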