skip to main content

Title: Sparse Recovery for Orthogonal Polynomial Transforms
In this paper we consider the following sparse recovery problem. We have query access to a vector ๐ฑ โˆˆ โ„^N such that xฬ‚ = ๐… ๐ฑ is k-sparse (or nearly k-sparse) for some orthogonal transform ๐…. The goal is to output an approximation (in an ๐“โ‚‚ sense) to xฬ‚ in sublinear time. This problem has been well-studied in the special case that ๐… is the Discrete Fourier Transform (DFT), and a long line of work has resulted in sparse Fast Fourier Transforms that run in time O(k โ‹… polylog N). However, for transforms ๐… other than the DFT (or closely related transforms like the Discrete Cosine Transform), the question is much less settled. In this paper we give sublinear-time algorithms - running in time poly(k log(N)) - for solving the sparse recovery problem for orthogonal transforms ๐… that arise from orthogonal polynomials. More precisely, our algorithm works for any ๐… that is an orthogonal polynomial transform derived from Jacobi polynomials. The Jacobi polynomials are a large class of classical orthogonal polynomials (and include Chebyshev and Legendre polynomials as special cases), and show up extensively in applications like numerical analysis and signal processing. One caveat of our work is that we more » require an assumption on the sparsity structure of the sparse vector, although we note that vectors with random support have this property with high probability. Our approach is to give a very general reduction from the k-sparse sparse recovery problem to the 1-sparse sparse recovery problem that holds for any flat orthogonal polynomial transform; then we solve this one-sparse recovery problem for transforms derived from Jacobi polynomials. Frequently, sparse FFT algorithms are described as implementing such a reduction; however, the technical details of such works are quite specific to the Fourier transform and moreover the actual implementations of these algorithms do not use the 1-sparse algorithm as a black box. In this work we give a reduction that works for a broad class of orthogonal polynomial families, and which uses any 1-sparse recovery algorithm as a black box. « less
Authors:
; ; ; ;
Award ID(s):
1763481
Publication Date:
NSF-PAR ID:
10214644
Journal Name:
Leibniz international proceedings in informatics
Volume:
168
Page Range or eLocation-ID:
58:1--58:16
ISSN:
1868-8969
Sponsoring Org:
National Science Foundation
More Like this
  1. Given a random n ร— n symmetric matrix ๐‘พ drawn from the Gaussian orthogonal ensemble (GOE), we consider the problem of certifying an upper bound on the maximum value of the quadratic form ๐’™^โŠค ๐‘พ ๐’™ over all vectors ๐’™ in a constraint set ๐’ฎ โŠ‚ โ„โฟ. For a certain class of normalized constraint sets we show that, conditional on a certain complexity-theoretic conjecture, no polynomial-time algorithm can certify a better upper bound than the largest eigenvalue of ๐‘พ. A notable special case included in our results is the hypercube ๐’ฎ = {ยฑ1/โˆšn}โฟ, which corresponds to the problem of certifying bounds on the Hamiltonian of the Sherrington-Kirkpatrick spin glass model from statistical physics. Our results suggest a striking gap between optimization and certification for this problem. Our proof proceeds in two steps. First, we give a reduction from the detection problem in the negatively-spiked Wishart model to the above certification problem. We then give evidence that this Wishart detection problem is computationally hard below the classical spectral threshold, by showing that no low-degree polynomial can (in expectation) distinguish the spiked and unspiked models. This method for predicting computational thresholds was proposed in a sequence of recent works on the sum-of-squaresmore »hierarchy, and is conjectured to be correct for a large class of problems. Our proof can be seen as constructing a distribution over symmetric matrices that appears computationally indistinguishable from the GOE, yet is supported on matrices whose maximum quadratic form over ๐’™ โˆˆ ๐’ฎ is much larger than that of a GOE matrix.« less
  2. Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier transform, discrete cosine transform, and other structured transformations such as convolutions. All of these transforms can be represented by dense matrix-vector multiplication, yet each has a specialized and highly efficient (subquadratic) algorithm. We ask to what extent hand-crafting these algorithms and implementations is necessary, what structural prior they encode, and how much knowledge is required to automatically learn a fast algorithm for a provided structured transform. Motivated by a characterization of fast matrix-vector multiplication as products of sparse matrices, we introduce a parameterization of divide-and-conquer methods that is capable of representing a large class of transforms. This generic formulation can automatically learn an efficient algorithm for many important transforms; for example, it recovers the O(N logN) Cooley-Tukey FFT algorithm to machine precision, for dimensions N up to 1024. Furthermore, our method can be incorporated as a lightweight replacement of generic matrices in machine learning pipelines to learn efficient and compressible transformations. On a standard task of compressing a single hidden-layer network, our method exceeds the classification accuracy of unconstrained matrices on CIFAR-10 by 3.9 pointsโ€”the first time a structured approach has done soโ€”with 4X faster inference speed andmore »40X fewer parameters.« less
  3. We study the problem of estimating the value of sums of the form Spโ‰œโˆ‘(xip) when one has the ability to sample xiโ‰ฅ0 with probability proportional to its magnitude. When p=2, this problem is equivalent to estimating the selectivity of a self-join query in database systems when one can sample rows randomly. We also study the special case when {xi} is the degree sequence of a graph, which corresponds to counting the number of p-stars in a graph when one has the ability to sample edges randomly. Our algorithm for a (1ยฑฮต)-multiplicative approximation of Sp has query and time complexities O(mloglognฯต2S1/pp). Here, m=โˆ‘xi/2 is the number of edges in the graph, or equivalently, half the number of records in the database table. Similarly, n is the number of vertices in the graph and the number of unique values in the database table. We also provide tight lower bounds (up to polylogarithmic factors) in almost all cases, even when {xi} is a degree sequence and one is allowed to use the structure of the graph to try to get a better estimate. We are not aware of any prior lower bounds on the problem of join selectivity estimation. For the graph problem,more »prior work which assumed the ability to sample only vertices uniformly gave algorithms with matching lower bounds (Gonen et al. in SIAM J Comput 25:1365โ€“1411, 2011). With the ability to sample edges randomly, we show that one can achieve faster algorithms for approximating the number of star subgraphs, bypassing the lower bounds in this prior work. For example, in the regime where Spโ‰คn, and p=2, our upper bound is O~(n/S1/2p), in contrast to their ฮฉ(n/S1/3p) lower bound when no random edge queries are available. In addition, we consider the problem of counting the number of directed paths of length two when the graph is directed. This problem is equivalent to estimating the selectivity of a join query between two distinct tables. We prove that the general version of this problem cannot be solved in sublinear time. However, when the ratio between in-degree and out-degree is boundedโ€”or equivalently, when the ratio between the number of occurrences of values in the two columns being joined is boundedโ€”we give a sublinear time algorithm via a reduction to the undirected case.« less
  4. We consider the problem of designing sublinear time algorithms for estimating the cost of minimum] metric traveling salesman (TSP) tour. Specifically, given access to a n ร— n distance matrix D that specifies pairwise distances between n points, the goal is to estimate the TSP cost by performing only sublinear (in the size of D) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any epsilon > 0, there exists an O^~(n/epsilon^O(1))-time algorithm that returns a (1+epsilon)-approximate estimate of the MST cost. This result immediately implies an O^~(n/epsilon^O(1)) time algorithm to estimate the TSP cost to within a (2 + epsilon) factor for any epsilon > 0. However, no o(n^2)-time algorithms are known to approximate metric TSP to a factor that is strictly better than 2. On the other hand, there were also no known barriers that rule out existence of (1 + epsilon)-approximate estimation algorithms for metric TSP with O^~ (n) time for any fixed epsilon > 0. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. On the algorithmic side, we first consider the graphic TSP problemmore »where the metric D corresponds to shortest path distances in a connected unweighted undirected graph. We show that there exists an O^~(n) time algorithm that estimates the cost of graphic TSP to within a factor of (2 โˆ’ epsilon_0) for some epsilon_0 > 0. This is the first sublinear cost estimation algorithm for graphic TSP that achieves an approximation factor less than 2. We also consider another well-studied special case of metric TSP, namely, (1, 2)-TSP where all distances are either 1 or 2, and give an O^~(n ^ 1.5) time algorithm to estimate optimal cost to within a factor of 1.625. Our estimation algorithms for graphic TSP as well as for (1, 2)-TSP naturally lend themselves to O^~(n) space streaming algorithms that give an 11/6-approximation for graphic TSP and a 1.625-approximation for (1, 2)-TSP. These results motivate the natural question if analogously to metric MST, for any epsilon > 0, (1 + epsilon)-approximate estimates can be obtained for graphic TSP and (1, 2)-TSP using O^~ (n) queries. We answer this question in the negative โ€“ there exists an epsilon_0 > 0, such that any algorithm that estimates the cost of graphic TSP ((1, 2)-TSP) to within a (1 + epsilon_0)-factor, necessarily requires (n^2) queries. This lower bound result highlights a sharp separation between the metric MST and metric TSP problems. Similarly to many classical approximation algorithms for TSP, our sublinear time estimation algorithms utilize subroutines for estimating the size of a maximum matching in the underlying graph. We show that this is not merely an artifact of our approach, and that for any epsilon > 0, any algorithm that estimates the cost of graphic TSP or (1, 2)-TSP to within a (1 + epsilon)-factor, can also be used to estimate the size of a maximum matching in a bipartite graph to within an epsilon n additive error. This connection allows us to translate known lower bounds for matching size estimation in various models to similar lower bounds for metric TSP cost estimation.« less
  5. Multiple known algorithmic paradigms (backtracking, local search and the polynomial method) only yield a 2n(1โˆ’1/O(k)) time algorithm for k-SAT in the worst case. For this reason, it has been hypothesized that the worst-case k-SAT problem cannot be solved in 2n(1โˆ’f(k)/k) time for any unbounded function f. This hypothesis has been called the โ€œSuper-Strong ETHโ€, modeled after the ETH and the Strong ETH. We give two results on the Super-Strong ETH: 1. It has also been hypothesized that k-SAT is hard to solve for randomly chosen instances near the โ€œcritical thresholdโ€, where the clause-to-variable ratio is 2^kln2โˆ’ฮ˜(1). We give a randomized algorithm which refutes the Super-Strong ETH for the case of random k-SAT and planted k-SAT for any clause-to-variable ratio. For example, given any random k-SAT instance F with n variables and m clauses, our algorithm decides satisfiability for F in 2^n(1โˆ’ฮฉ(logk)/k) time, with high probability (over the choice of the formula and the randomness of the algorithm). It turns out that a well-known algorithm from the literature on SAT algorithms does the job: the PPZ algorithm of Paturi, Pudlรกk and Zane [17]. 2. The Unique k-SAT problem is the special case where there is at most one satisfying assignment. Improvingmore »prior reductions, we show that the Super-Strong ETHs for Unique k-SAT and k-SAT are equivalent. More precisely, we show the time complexities of Unique k-SAT and k-SAT are very tightly correlated: if Unique k-SAT is in 2^n(1โˆ’f(k)/k) time for an unbounded f, then k-SAT is in 2^n(1โˆ’f(k)(1โˆ’ฮต)/k) time for every ฮต>0.« less