Abstract Iterative thresholding algorithms seek to optimize a differentiable objective function over a sparsity or rank constraint by alternating between gradient steps that reduce the objective and thresholding steps that enforce the constraint. This work examines the choice of the thresholding operator and asks whether it is possible to achieve stronger guarantees than what is possible with hard thresholding. We develop the notion of relative concavity of a thresholding operator, a quantity that characterizes the worst-case convergence performance of any thresholding operator on the target optimization problem. Surprisingly, we find that commonly used thresholding operators, such as hard thresholding and soft thresholding, are suboptimal in terms of worst-case convergence guarantees. Instead, a general class of thresholding operators, lying between hard thresholding and soft thresholding, is shown to be optimal with the strongest possible convergence guarantee among all thresholding operators. Examples of this general class includes $$\ell _q$$ thresholding with appropriate choices of $$q$$ and a newly defined reciprocal thresholding operator. We also investigate the implications of the improved optimization guarantee in the statistical setting of sparse linear regression and show that this new class of thresholding operators attain the optimal rate for computationally efficient estimators, matching the Lasso.
more »
« less
Jointly low-rank and bisparse recovery: Questions and partial answers
We investigate the problem of recovering jointly [Formula: see text]-rank and [Formula: see text]-bisparse matrices from as few linear measurements as possible, considering arbitrary measurements as well as rank-one measurements. In both cases, we show that [Formula: see text] measurements make the recovery possible in theory, meaning via a nonpractical algorithm. In case of arbitrary measurements, we investigate the possibility of achieving practical recovery via an iterative-hard-thresholding algorithm when [Formula: see text] for some exponent [Formula: see text]. We show that this is feasible for [Formula: see text], and that the proposed analysis cannot cover the case [Formula: see text]. The precise value of the optimal exponent [Formula: see text] is the object of a question, raised but unresolved in this paper, about head projections for the jointly low-rank and bisparse structure. Some related questions are partially answered in passing. For rank-one measurements, we suggest on arcane grounds an iterative-hard-thresholding algorithm modified to exploit the nonstandard restricted isometry property obeyed by this type of measurements.
more »
« less
- PAR ID:
- 10149350
- Date Published:
- Journal Name:
- Analysis and Applications
- Volume:
- 18
- Issue:
- 01
- ISSN:
- 0219-5305
- Page Range / eLocation ID:
- 25 to 48
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We study optimal design problems in which the goal is to choose a set of linear measurements to obtain the most accurate estimate of an unknown vector. We study the [Formula: see text]-optimal design variant where the objective is to minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. We introduce the proportional volume sampling algorithm to obtain nearly optimal bounds in the asymptotic regime when the number [Formula: see text] of measurements made is significantly larger than the dimension [Formula: see text] and obtain the first approximation algorithms whose approximation factor does not degrade with the number of possible measurements when [Formula: see text] is small. The algorithm also gives approximation guarantees for other optimal design objectives such as [Formula: see text]-optimality and the generalized ratio objective, matching or improving the previously best-known results. We further show that bounds similar to ours cannot be obtained for [Formula: see text]-optimal design and that [Formula: see text]-optimal design is NP-hard to approximate within a fixed constant when [Formula: see text].more » « less
-
The matrix sensing problem is an important low-rank optimization problem that has found a wide range of applications, such as matrix completion, phase synchornization/retrieval, robust principal component analysis (PCA), and power system state estimation. In this work, we focus on the general matrix sensing problem with linear measurements that are corrupted by random noise. We investigate the scenario where the search rank r is equal to the true rank [Formula: see text] of the unknown ground truth (the exact parametrized case), as well as the scenario where r is greater than [Formula: see text] (the overparametrized case). We quantify the role of the restricted isometry property (RIP) in shaping the landscape of the nonconvex factorized formulation and assisting with the success of local search algorithms. First, we develop a global guarantee on the maximum distance between an arbitrary local minimizer of the nonconvex problem and the ground truth under the assumption that the RIP constant is smaller than [Formula: see text]. We then present a local guarantee for problems with an arbitrary RIP constant, which states that any local minimizer is either considerably close to the ground truth or far away from it. More importantly, we prove that this noisy, overparametrized problem exhibits the strict saddle property, which leads to the global convergence of perturbed gradient descent algorithm in polynomial time. The results of this work provide a comprehensive understanding of the geometric landscape of the matrix sensing problem in the noisy and overparametrized regime. Funding: This work was supported by grants from the National Science Foundation, Office of Naval Research, Air Force Office of Scientific Research, and Army Research Office.more » « less
-
Fix a weakly minimal (i.e. superstable [Formula: see text]-rank [Formula: see text]) structure [Formula: see text]. Let [Formula: see text] be an expansion by constants for an elementary substructure, and let [Formula: see text] be an arbitrary subset of the universe [Formula: see text]. We show that all formulas in the expansion [Formula: see text] are equivalent to bounded formulas, and so [Formula: see text] is stable (or NIP) if and only if the [Formula: see text]-induced structure [Formula: see text] on [Formula: see text] is stable (or NIP). We then restrict to the case that [Formula: see text] is a pure abelian group with a weakly minimal theory, and [Formula: see text] is mutually algebraic (equivalently, weakly minimal with trivial forking). This setting encompasses most of the recent research on stable expansions of [Formula: see text]. Using various characterizations of mutual algebraicity, we give new examples of stable structures of the form [Formula: see text]. Most notably, we show that if [Formula: see text] is a weakly minimal additive subgroup of the algebraic numbers, [Formula: see text] is enumerated by a homogeneous linear recurrence relation with algebraic coefficients, and no repeated root of the characteristic polynomial of [Formula: see text] is a root of unity, then [Formula: see text] is superstable for any [Formula: see text].more » « less
-
We study the problem of covering barrier points by mobile sensors. Each sensor is represented by a point in the plane with the same covering range [Formula: see text] so that any point within distance [Formula: see text] from the sensor can be covered by the sensor. Given a set [Formula: see text] of [Formula: see text] points (called “barrier points”) and a set [Formula: see text] of [Formula: see text] points (representing the “sensors”) in the plane, the problem is to move the sensors so that each barrier point is covered by at least one sensor and the maximum movement of all sensors is minimized. The problem is NP-hard. In this paper, we consider two line-constrained variations of the problem and present efficient algorithms that improve the previous work. In the first problem, all sensors are given on a line [Formula: see text] and are required to move on [Formula: see text] only while the barrier points can be anywhere in the plane. We propose an [Formula: see text] time algorithm for the problem. We also consider the weighted case where each sensor has a weight; we give an [Formula: see text] time algorithm for this case. In the second problem, all barrier points are on [Formula: see text] while all sensors are in the plane but are required to move onto [Formula: see text] to cover all barrier points. We also solve the weighted case in [Formula: see text] time.more » « less
An official website of the United States government

