

Title: Inexact Fixed-Point Proximity Algorithm for the $$\ell _0$$ Sparse Regularization Problem
Abstract

We study inexact fixed-point proximity algorithms for solving a class of sparse regularization problems involving the $$\ell _0$$ norm. Specifically, the $$\ell _0$$ model has an objective function that is the sum of a convex fidelity term and a Moreau envelope of the $$\ell _0$$ norm regularization term. Such an $$\ell _0$$ model is non-convex. Existing exact algorithms for solving the problems require the availability of closed-form formulas for the proximity operator of convex functions involved in the objective function. When such formulas are not available, numerical computation of the proximity operator becomes inevitable. This leads to inexact iteration algorithms. We investigate in this paper how the numerical error for every step of the iteration should be controlled to ensure global convergence of the inexact algorithms. We establish a theoretical result that guarantees the sequence generated by the proposed inexact algorithm converges to a local minimizer of the optimization problem. We implement the proposed algorithms for three applications of practical importance in machine learning and image science, which include regression, classification, and image deblurring. The numerical results demonstrate the convergence of the proposed algorithm and confirm that local minimizers of the $$\ell _0$$ models found by the proposed inexact algorithm outperform global minimizers of the corresponding $$\ell _1$$ models, in terms of approximation accuracy and sparsity of the solutions.
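As a rough illustration of the regularization term described in the abstract, the sketch below evaluates the proximity operator and the Moreau envelope of the $$\ell _0$$ norm. The parametrization $$\min _y \{\Vert y\Vert _0 + \Vert y-x\Vert ^2/(2\gamma )\}$$, the parameter name `gamma`, and the function names are assumptions made for this example and are not taken from the paper; the closed forms used (hard thresholding at $$\sqrt{2\gamma }$$ and the componentwise value $$\min (1, x_i^2/(2\gamma ))$$) are the standard ones under that parametrization.

```python
import math

def prox_l0(x, gamma):
    """One minimizer of ||y||_0 + ||y - x||^2 / (2*gamma): hard thresholding.

    Entries with |x_i| > sqrt(2*gamma) are kept, the rest are set to zero.
    (At |x_i| == sqrt(2*gamma) both choices are optimal; zero is chosen here.)
    """
    threshold = math.sqrt(2.0 * gamma)
    return [xi if abs(xi) > threshold else 0.0 for xi in x]

def moreau_env_l0(x, gamma):
    """Moreau envelope of ||.||_0 at x, i.e. the minimum value of the problem above."""
    return sum(min(1.0, xi * xi / (2.0 * gamma)) for xi in x)

def objective(y, x, gamma):
    """Value of ||y||_0 + ||y - x||^2 / (2*gamma) at a given y."""
    l0 = sum(1 for yi in y if yi != 0.0)
    return l0 + sum((yi - xi) ** 2 for yi, xi in zip(y, x)) / (2.0 * gamma)

# Hypothetical data chosen for illustration only.
x = [3.0, 0.5, -2.0, 0.1]
gamma = 0.5                      # threshold sqrt(2*gamma) = 1
y = prox_l0(x, gamma)            # keeps 3.0 and -2.0, zeroes the small entries
print(y)
print(objective(y, x, gamma), moreau_env_l0(x, gamma))  # the two values agree up to rounding
```

The envelope is continuous even though the $$\ell _0$$ norm is not, which is one reason an envelope-based model is easier to handle numerically; the sketch does not implement the inexact fixed-point iteration itself.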

 
Award ID(s):
2208386
NSF-PAR ID:
10521854
Author(s) / Creator(s):
; ;
Publisher / Repository:
Springer Science + Business Media
Date Published:
Journal Name:
Journal of Scientific Computing
Volume:
100
Issue:
2
ISSN:
0885-7474
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    This paper studies several solution paths of sparse quadratic minimization problems as a function of the weighing parameter of the bi-objective of estimation loss versus solution sparsity. Three such paths are considered: the “$$\ell _0$$-path” where the discontinuous $$\ell _0$$-function provides the exact sparsity count; the “$$\ell _1$$-path” where the $$\ell _1$$-function provides a convex surrogate of sparsity count; and the “capped $$\ell _1$$-path” where the nonconvex nondifferentiable capped $$\ell _1$$-function aims to enhance the $$\ell _1$$-approximation. Serving different purposes, each of these three formulations is different from each other, both analytically and computationally. Our results deepen the understanding of (old and new) properties of the associated paths, highlight the pros, cons, and tradeoffs of these sparse optimization models, and provide numerical evidence to support the practical superiority of the capped $$\ell _1$$-path. Our study of the capped $$\ell _1$$-path is interesting in its own right as the path pertains to computable directionally stationary (= strongly locally minimizing in this context, as opposed to globally optimal) solutions of a parametric nonconvex nondifferentiable optimization problem. Motivated by classical parametric quadratic programming theory and reinforced by modern statistical learning studies, both casting an exponential perspective in fully describing such solution paths, we also aim to address the question of whether some of them can be fully traced in strongly polynomial time in the problem dimensions. A major conclusion of this paper is that a path of directional stationary solutions of the capped $$\ell _1$$-regularized problem offers interesting theoretical properties and practical compromise between the $$\ell _0$$-path and the $$\ell _1$$-path.
    Indeed, while the $$\ell _0$$-path is computationally prohibitive and greatly handicapped by the repeated solution of mixed-integer nonlinear programs, the quality of the $$\ell _1$$-path, in terms of the two criteria (loss and sparsity) in the estimation objective, is inferior to the capped $$\ell _1$$-path; the latter can be obtained efficiently by a combination of a parametric pivoting-like scheme supplemented by an algorithm that takes advantage of the Z-matrix structure of the loss function.

     
  2. Abstract

    Sparsity finds applications in diverse areas such as statistics, machine learning, and signal processing. Computations over sparse structures are less complex compared to their dense counterparts and need less storage. This paper proposes a heuristic method for retrieving sparse approximate solutions of optimization problems via minimizing the $$\ell _{p}$$ quasi-norm, where $$0<p<1$$. An iterative two-block algorithm for minimizing the $$\ell _{p}$$ quasi-norm subject to convex constraints is proposed. The proposed algorithm requires solving for the roots of a scalar degree polynomial as opposed to applying a soft thresholding operator in the case of $$\ell _{1}$$ norm minimization. The algorithm’s merit relies on its ability to solve the $$\ell _{p}$$ quasi-norm minimization subject to any convex constraints set. For the specific case of constraints defined by differentiable functions with Lipschitz continuous gradient, a second, faster algorithm is proposed. Using a proximal gradient step, we mitigate the convex projection step and hence enhance the algorithm’s speed while proving its convergence. We present various applications where the proposed algorithm excels, namely, sparse signal reconstruction, system identification, and matrix completion. The results demonstrate the significant gains obtained by the proposed algorithm compared to other $$\ell _{p}$$ quasi-norm based methods presented in previous literature.

     
  3. Abstract

    We study the sparsity of the solutions to systems of linear Diophantine equations with and without non-negativity constraints. The sparsity of a solution vector is the number of its nonzero entries, which is referred to as the $$\ell _0$$-norm of the vector. Our main results are new improved bounds on the minimal $$\ell _0$$-norm of solutions to systems $$A\varvec{x}=\varvec{b}$$, where $$A\in \mathbb {Z}^{m\times n}$$, $${\varvec{b}}\in \mathbb {Z}^m$$ and $$\varvec{x}$$ is either a general integer vector (lattice case) or a non-negative integer vector (semigroup case). In certain cases, we give polynomial time algorithms for computing solutions with $$\ell _0$$-norm satisfying the obtained bounds. We show that our bounds are tight. Our bounds can be seen as functions naturally generalizing the rank of a matrix over $$\mathbb {R}$$, to other subdomains such as $$\mathbb {Z}$$. We show that these new rank-like functions are all NP-hard to compute in general, but polynomial-time computable for fixed number of variables.

     
  4. Abstract

    Ramanujan’s partition congruences modulo $$\ell \in \{5, 7, 11\}$$ assert that $$\begin{aligned} p(\ell n+\delta _{\ell })\equiv 0\pmod {\ell }, \end{aligned}$$ where $$0<\delta _{\ell }<\ell $$ satisfies $$24\delta _{\ell }\equiv 1\pmod {\ell }.$$ By proving Subbarao’s conjecture, Radu showed that there are no such congruences when it comes to parity. There are infinitely many odd (resp. even) partition numbers in every arithmetic progression. For primes $$\ell \ge 5,$$ we give a new proof of the conclusion that there are infinitely many $$m$$ for which $$p(\ell m+\delta _{\ell })$$ is odd. This proof uses a generalization, due to the second author and Ramsey, of a result of Mazur in his classic paper on the Eisenstein ideal. We also refine a classical criterion of Sturm for modular form congruences, which allows us to show that the smallest such $$m$$ satisfies $$m<(\ell ^2-1)/24,$$ representing a significant improvement to the previous bound.

     
  5. Abstract

    The double differential cross sections of the Drell–Yan lepton pair ($$\ell ^+\ell ^-$$, dielectron or dimuon) production are measured as functions of the invariant mass $$m_{\ell \ell }$$, transverse momentum $$p_{\textrm{T}} (\ell \ell )$$, and $$\varphi ^{*}_{\eta }$$. The $$\varphi ^{*}_{\eta }$$ observable, derived from angular measurements of the leptons and highly correlated with $$p_{\textrm{T}} (\ell \ell )$$, is used to probe the low-$$p_{\textrm{T}} (\ell \ell )$$ region in a complementary way. Dilepton masses up to 1$$\,\text {Te\hspace{-.08em}V}$$ are investigated. Additionally, a measurement is performed requiring at least one jet in the final state. To benefit from partial cancellation of the systematic uncertainty, the ratios of the differential cross sections for various $$m_{\ell \ell }$$ ranges to those in the Z mass peak interval are presented. The collected data correspond to an integrated luminosity of 36.3$$\,\text {fb}^{-1}$$ of proton–proton collisions recorded with the CMS detector at the LHC at a centre-of-mass energy of 13$$\,\text {Te\hspace{-.08em}V}$$. Measurements are compared with predictions based on perturbative quantum chromodynamics, including soft-gluon resummation.
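The Ramanujan congruences quoted in item 4 of the list above can be checked numerically for small arguments. The sketch below is only an illustration: the function names and the cutoff of 300 are choices made for this example, and a finite check of this kind is of course not a proof.

```python
def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)] using the standard dynamic program
    that adds one allowed part size at a time."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p

def delta(ell):
    """The unique 0 < d < ell with 24*d congruent to 1 modulo ell."""
    return next(d for d in range(1, ell) if (24 * d) % ell == 1)

N_MAX = 300  # arbitrary cutoff for this illustration
p = partition_numbers(N_MAX)

for ell in (5, 7, 11):
    d = delta(ell)
    # Every p(ell*n + d) within range should be divisible by ell.
    holds = all(p[ell * n + d] % ell == 0 for n in range((N_MAX - d) // ell + 1))
    print(ell, d, holds)
```

For these three primes the residues are $$\delta _5=4$$, $$\delta _7=5$$ and $$\delta _{11}=6$$, and the first values in each progression are $$p(4)=5$$, $$p(5)=7$$ and $$p(6)=11$$.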

     