skip to main content

Title: Parameter choices for sparse regularization with the ℓ1 norm *
Abstract We consider a regularization problem whose objective function consists of a convex fidelity term and a regularization term determined by the ℓ 1 norm composed with a linear transform. Empirical results show that the regularization with the ℓ 1 norm can promote sparsity of a regularized solution. The goal of this paper is to understand theoretically the effect of the regularization parameter on the sparsity of the regularized solutions. We establish a characterization of the sparsity under the transform matrix of the solution. When the objective function is block-separable or an error bound of the regularized solution to a known function is available, the resulting characterization can be taken as a regularization parameter choice strategy with which the regularization problem has a solution having a sparsity of a certain level. When the objective function is not block-separable, we propose an iterative algorithm which simultaneously determines the regularization parameter and its corresponding solution with a prescribed sparsity level. Moreover, we study choices of the regularization parameter so that the regularization term can alleviate the ill-posedness and promote sparsity of the resulting regularized solution. Numerical experiments demonstrate that the proposed algorithm is effective and efficient, and the choices of the regularization parameters can balance the sparsity of the regularized solution and its approximation to the minimizer of the fidelity function.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Inverse Problems
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider a minimization problem whose objective function is the sum of a fidelity term, not necessarily convex, and a regularization term defined by a positive regularization parameter [Formula: see text] multiple of the [Formula: see text] norm composed with a linear transform. This problem has wide applications in compressed sensing, sparse machine learning and image reconstruction. The goal of this paper is to understand what choices of the regularization parameter can dictate the level of sparsity under the transform for a global minimizer of the resulting regularized objective function. This is a critical issue but it has been left unaddressed. We address it from a geometric viewpoint with which the sparsity partition of the image space of the transform is introduced. Choices of the regularization parameter are specified to ensure that a global minimizer of the corresponding regularized objective function achieves a prescribed level of sparsity under the transform. Results are obtained for the spacial sparsity case in which the transform is the identity map, a case that covers several applications of practical importance, including machine learning, image/signal processing and medical image reconstruction. 
    more » « less
  2. Abstract The goal of this study is to develop a new computed tomography (CT) image reconstruction method, aiming at improving the quality of the reconstructed images of existing methods while reducing computational costs. Existing CT reconstruction is modeled by pixel-based piecewise constant approximations of the integral equation that describes the CT projection data acquisition process. Using these approximations imposes a bottleneck model error and results in a discrete system of a large size. We propose to develop a content-adaptive unstructured grid (CAUG) based regularized CT reconstruction method to address these issues. Specifically, we design a CAUG of the image domain to sparsely represent the underlying image, and introduce a CAUG-based piecewise linear approximation of the integral equation by employing a collocation method. We further apply a regularization defined on the CAUG for the resulting ill-posed linear system, which may lead to a sparse linear representation for the underlying solution. The regularized CT reconstruction is formulated as a convex optimization problem, whose objective function consists of a weighted least square norm based fidelity term, a regularization term and a constraint term. Here, the corresponding weighted matrix is derived from the simultaneous algebraic reconstruction technique (SART). We then develop a SART-type preconditioned fixed-point proximity algorithm to solve the optimization problem. Convergence analysis is provided for the resulting iterative algorithm. Numerical experiments demonstrate the superiority of the proposed method over several existing methods in terms of both suppressing noise and reducing computational costs. These methods include the SART without regularization and with the quadratic regularization, the traditional total variation (TV) regularized reconstruction method and the TV superiorized conjugate gradient method on the pixel grid. 
    more » « less
  3. We develop a projected Nesterov’s proximal-gradient (PNPG) approach for sparse signal reconstruction that combines adaptive step size with Nesterov’s momentum acceleration. The objective function that we wish to minimize is the sum of a convex differentiable data-fidelity (negative log-likelihood (NLL)) term and a convex regularization term. We apply sparse signal regularization where the signal belongs to a closed convex set within the closure of the domain of the NLL; the convex-set constraint facilitates flexible NLL domains and accurate signal recovery. Signal sparsity is imposed using the ℓ₁-norm penalty on the signal’s linear transform coefficients. The PNPG approach employs a projected Nesterov’s acceleration step with restart and a duality-based inner iteration to compute the proximal mapping. We propose an adaptive step-size selection scheme to obtain a good local majorizing function of the NLL and reduce the time spent backtracking. Thanks to step-size adaptation, PNPG converges faster than the methods that do not adjust to the local curvature of the NLL. We present an integrated derivation of the momentum acceleration and proofs of O(k⁻²) objective function convergence rate and convergence of the iterates, which account for adaptive step size, inexactness of the iterative proximal mapping, and the convex-set constraint. The tuning of PNPG is largely application independent. Tomographic and compressed-sensing reconstruction experiments with Poisson generalized linear and Gaussian linear measurement models demonstrate the performance of the proposed approach. 
    more » « less
  4. We develop a framework for reconstructing images that are sparse in an appropriate transform domain from polychromatic computed tomography (CT) measurements under the blind scenario where the material of the inspected object and incident-energy spectrum are unknown. Assuming that the object that we wish to reconstruct consists of a single material, we obtain a parsimonious measurement-model parameterization by changing the integral variable from photon energy to mass attenuation, which allows us to combine the variations brought by the unknown incident spectrum and mass attenuation into a single unknown mass-attenuation spectrum function; the resulting measurement equation has the Laplace-integral form. The mass-attenuation spectrum is then expanded into basis functions using B splines of order one. We consider a Poisson noise model and establish conditions for biconvexity of the corresponding negative log-likelihood (NLL) function with respect to the density-map and mass-attenuation spectrum parameters. We derive a block-coordinate descent algorithm for constrained minimization of a penalized NLL objective function, where penalty terms ensure nonnegativity of the mass-attenuation spline coefficients and nonnegativity and gradient-map sparsity of the density-map image, imposed using a convex total-variation (TV) norm; the resulting objective function is biconvex. This algorithm alternates between a Nesterov’s proximal-gradient (NPG) step and a limited-memory Broyden-Fletcher-Goldfarb-Shanno with box constraints (L-BFGS-B) iteration for updating the image and mass-attenuation spectrum parameters, respectively. We prove the Kurdyka-Łojasiewicz property of the objective function, which is important for establishing local convergence of block-coordinate descent schemes in biconvex optimization problems. Our framework applies to other NLLs and signal-sparsity penalties, such as lognormal NLL and ℓ₁ norm of 2D discrete wavelet transform (DWT) image coefficients. Numerical experiments with simulated and real X-ray CT data demonstrate the performance of the proposed scheme. 
    more » « less
  5. Abstract

    This paper studies several solution paths of sparse quadratic minimization problems as a function of the weighing parameter of the bi-objective of estimation loss versus solution sparsity. Three such paths are considered: the “$$\ell _0$$0-path” where the discontinuous$$\ell _0$$0-function provides the exact sparsity count; the “$$\ell _1$$1-path” where the$$\ell _1$$1-function provides a convex surrogate of sparsity count; and the “capped$$\ell _1$$1-path” where the nonconvex nondifferentiable capped$$\ell _1$$1-function aims to enhance the$$\ell _1$$1-approximation. Serving different purposes, each of these three formulations is different from each other, both analytically and computationally. Our results deepen the understanding of (old and new) properties of the associated paths, highlight the pros, cons, and tradeoffs of these sparse optimization models, and provide numerical evidence to support the practical superiority of the capped$$\ell _1$$1-path. Our study of the capped$$\ell _1$$1-path is interesting in its own right as the path pertains to computable directionally stationary (= strongly locally minimizing in this context, as opposed to globally optimal) solutions of a parametric nonconvex nondifferentiable optimization problem. Motivated by classical parametric quadratic programming theory and reinforced by modern statistical learning studies, both casting an exponential perspective in fully describing such solution paths, we also aim to address the question of whether some of them can be fully traced in strongly polynomial time in the problem dimensions. A major conclusion of this paper is that a path of directional stationary solutions of the capped$$\ell _1$$1-regularized problem offers interesting theoretical properties and practical compromise between the$$\ell _0$$0-path and the$$\ell _1$$1-path. Indeed, while the$$\ell _0$$0-path is computationally prohibitive and greatly handicapped by the repeated solution of mixed-integer nonlinear programs, the quality of$$\ell _1$$1-path, in terms of the two criteria—loss and sparsity—in the estimation objective, is inferior to the capped$$\ell _1$$1-path; the latter can be obtained efficiently by a combination of a parametric pivoting-like scheme supplemented by an algorithm that takes advantage of the Z-matrix structure of the loss function.

    more » « less