We design new polynomial-time algorithms for recovering planted cliques in the semirandom graph model introduced by Feige and Kilian (2001). The previous best algorithms for this model succeed if the planted clique has size at least n^{2/3} in a graph with n vertices (Mehta, McKenzie, Trevisan 2019; Charikar, Steinhardt, Valiant 2017). Our algorithms work for planted-clique sizes approaching n^{1/2}, the information-theoretic threshold in the semirandom model (Steinhardt 2017) and a conjectured computational threshold even in the easier fully-random model. This result comes close to resolving open questions posed by Feige (2019) and Steinhardt (2017).
Our algorithms are based on higher constant-degree sum-of-squares relaxations and rely on a new conceptual connection that translates certificates of upper bounds on biclique numbers in unbalanced bipartite Erdős–Rényi random graphs into algorithms for semirandom planted clique. The use of higher constant-degree sum-of-squares is essential in our setting: we prove a lower bound on the basic SDP for certifying bicliques, which shows that the basic SDP cannot succeed for planted cliques of size k = o(n^{2/3}). We also provide some evidence that the information-computation trade-off of our current algorithms may be inherent, by proving an average-case lower bound for unbalanced bicliques in the low-degree-polynomials model.
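As a point of reference, the planted-clique distribution in the easier fully-random model is simple to sample. The sketch below forces a clique on a random k-subset of G(n, 1/2); the function name and adjacency representation are our own, and the semirandom model's adversarial modification of edges outside the planted clique (the step that defeats many fully-random algorithms) is deliberately omitted.

```python
import random

def planted_clique_instance(n, k, seed=0):
    """Sample the fully-random planted-clique model: G(n, 1/2) with a
    clique forced on a uniformly random k-subset of the vertices.
    (The Feige-Kilian semirandom model additionally lets an adversary
    alter edges outside the planted clique; not modeled here.)"""
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k))
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            # clique pairs are always connected; other pairs with prob 1/2
            if (u in clique and v in clique) or rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    return adj, clique
```

The recovery task is to find `clique` given only `adj`; the abstract's algorithms succeed even after adversarial tampering, for k approaching n^{1/2}.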
Algorithmic Thresholds for Refuting Random Polynomial Systems
Consider a system of m polynomial equations {p_i(x) = b_i}_{i ≤ m} of degree D ≥ 2 in an n-dimensional variable x ∈ ℝ^n, where every coefficient of each p_i and each b_i is chosen independently at random from some continuous distribution. We study the basic question of determining the smallest m (the algorithmic threshold) for which efficient algorithms can find refutations (i.e., certificates of unsatisfiability) for such systems. This setting generalizes problems such as refuting random SAT instances, low-rank matrix sensing, and certifying pseudorandomness of Goldreich's candidate generators and their generalizations.
We show that for every d ∈ ℕ, the (n+m)^{O(d)}-time canonical sum-of-squares (SoS) relaxation refutes such a system with high probability whenever m ≥ O(n)·(n/d)^{D−1}. We prove a lower bound in the restricted low-degree polynomial model of computation which suggests that this trade-off between SoS degree and the number of equations is nearly tight for all d. We also confirm the predictions of this lower bound in a limited setting by showing a lower bound on the canonical degree-4 sum-of-squares relaxation for refuting random quadratic polynomials. Together, our results provide evidence for an algorithmic threshold for the problem at m ≳ Õ(n)·n^{(1−δ)(D−1)} for 2^{n^δ}-time algorithms, for all δ.
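To make the trade-off between SoS degree and number of equations concrete, the toy helper below evaluates the bound m ≥ O(n)·(n/d)^{D−1} for given parameters. The leading constant c is a hypothetical placeholder, and the actual statements also suppress logarithmic factors; this is only a numerical illustration of the scaling, not part of the paper's algorithm.

```python
def sos_refutation_threshold(n, D, d, c=1.0):
    """Evaluate c * n * (n/d)^(D-1): the number of random degree-D
    equations beyond which the degree-O(d) SoS relaxation refutes the
    system (leading constant and log factors suppressed)."""
    assert D >= 2 and 1 <= d <= n
    return c * n * (n / d) ** (D - 1)
```

Raising d (a larger, slower relaxation) lowers the number of equations needed, matching the stated time-versus-threshold trade-off: at d = n^δ the threshold scales as n·n^{(1−δ)(D−1)}.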
Award ID(s): 2047933
NSF-PAR ID: 10314967
Date Published:
Journal Name: Proceedings of the annual ACM-SIAM symposium on discrete algorithms
ISSN: 2160-1445
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this


Given a graph and an integer k, Densest k-Subgraph is the algorithmic task of finding the subgraph on k vertices with the maximum number of edges. This is a fundamental problem that has been subject to intense study for decades, with applications spanning a wide variety of fields. The state-of-the-art algorithm is an O(n^{1/4+ϵ})-factor approximation (for any ϵ > 0) due to Bhaskara et al. [STOC '10]. Moreover, the so-called log-density framework predicts that this is optimal, i.e., it is impossible for an efficient algorithm to achieve an O(n^{1/4−ϵ})-factor approximation. In the average case, Densest k-Subgraph is a prototypical noisy inference task which is conjectured to exhibit a statistical-computational gap. In this work, we provide the strongest evidence yet of hardness for Densest k-Subgraph by showing matching lower bounds against the powerful Sum-of-Squares (SoS) algorithm, a meta-algorithm based on convex programming that achieves state-of-the-art algorithmic guarantees for many optimization and inference problems. For k ≤ n^{1/2}, we obtain a degree-n^δ SoS lower bound for the hard regime as predicted by the log-density framework. To show this, we utilize the modern framework for proving SoS lower bounds on average-case problems pioneered by Barak et al. [FOCS '16]. A key issue is that small denser-than-average subgraphs in the input will greatly affect the value of the candidate pseudo-expectation operator around the subgraph. To handle this challenge, we devise a novel matrix factorization scheme based on the positive minimum vertex separator. We then prove an intersection trade-off lemma to show that the error terms when using this separator are indeed small.
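The objective itself is easy to pin down with an exhaustive search, which may help readers parse the approximation guarantees above. The sketch below is exponential in n and purely definitional; the abstract concerns polynomial-time approximation, and all names here are our own.

```python
from itertools import combinations

def densest_k_subgraph(edges, n, k):
    """Exhaustively find the k-vertex subset spanning the most edges.
    Runs in time O(C(n, k) * k^2) -- definitional only; the hardness
    results above concern polynomial-time approximation algorithms."""
    edge_set = {frozenset(e) for e in edges}
    best, best_count = None, -1
    for subset in combinations(range(n), k):
        count = sum(1 for pair in combinations(subset, 2)
                    if frozenset(pair) in edge_set)
        if count > best_count:
            best, best_count = set(subset), count
    return best, best_count
```

For instance, in a triangle with one pendant vertex, the densest 3-subgraph is the triangle itself.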

Clustering is a fundamental primitive in unsupervised learning which gives rise to a rich class of computationally challenging inference tasks. In this work, we focus on the canonical task of clustering d-dimensional Gaussian mixtures with unknown (and possibly degenerate) covariance. Recent works (Ghosh et al. '20; Mao, Wein '21; Davis, Diaz, Wang '21) have established lower bounds against the class of low-degree polynomial methods and the sum-of-squares (SoS) hierarchy for recovering certain hidden structures planted in Gaussian clustering instances. Prior work on many similar inference tasks portends that such lower bounds strongly suggest the presence of an inherent statistical-to-computational gap for clustering, that is, a parameter regime where the clustering task is statistically possible but no polynomial-time algorithm succeeds. One special case of the clustering task we consider is equivalent to the problem of finding a planted hypercube vector in an otherwise random subspace. We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap, despite the aforementioned low-degree and SoS lower bounds. To achieve this, we give an algorithm based on Lenstra–Lenstra–Lovász lattice basis reduction which achieves the statistically optimal sample complexity of d + 1 samples. This result extends the class of problems whose conjectured statistical-to-computational gaps can be "closed" by "brittle" polynomial-time algorithms, highlighting the crucial but subtle role of noise in the onset of statistical-to-computational gaps.
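For intuition, instance generation for the special case mentioned above (a planted ±1 hypercube vector inside an otherwise random subspace) can be sketched as follows. The function and its parameterization are illustrative assumptions on our part; recovering v, e.g. via the LLL-based algorithm of the abstract, is not attempted here.

```python
import random

def planted_hypercube_instance(d, k, num_samples, seed=0):
    """Hide a hypercube vector v in {-1,+1}^d inside the span of v and
    k-1 random Gaussian directions, then return num_samples random
    points from that k-dimensional subspace of R^d."""
    rng = random.Random(seed)
    v = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    basis = [v] + [[rng.gauss(0, 1) for _ in range(d)]
                   for _ in range(k - 1)]
    samples = []
    for _ in range(num_samples):
        coeffs = [rng.gauss(0, 1) for _ in basis]
        samples.append([sum(c * b[i] for c, b in zip(coeffs, basis))
                        for i in range(d)])
    return samples, v
```

The inference task is to recover v (up to sign) from the samples alone; the abstract shows d + 1 samples suffice for a polynomial-time algorithm, despite low-degree and SoS lower bounds.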

A graph spanner is a fundamental graph structure that faithfully preserves the pairwise distances in the input graph up to a small multiplicative stretch. The common objective in the computation of spanners is to achieve the best-known existential size/stretch trade-off efficiently. Classical models and algorithmic analysis of graph spanners essentially assume that the algorithm can read the input graph, construct the desired spanner, and write the answer to the output tape. However, when considering massive graphs containing millions or even billions of nodes, not only the input graph but also the output spanner might be too large for a single processor to store. To tackle this challenge, we initiate the study of local computation algorithms (LCAs) for graph spanners in general graphs, where the algorithm should locally decide whether a given edge (u,v) ∈ E belongs to the output spanner. Such LCAs give the user the 'illusion' that a specific sparse spanner for the graph is maintained, without ever fully computing it. We present the following results: For general n-vertex graphs and r ∈ {2,3}, there exists an LCA for (2r−1)-spanners with Õ(n^{1+1/r}) edges and sublinear probe complexity of Õ(n^{1−1/2r}). These size/stretch trade-offs are best possible (up to polylogarithmic factors). For every k ≥ 1 and n-vertex graph with maximum degree Δ, there exists an LCA for O(k^2)-spanners with Õ(n^{1+1/k}) edges, probe complexity of Õ(Δ^4 n^{2/3}), and random seed of size polylog(n). This improves upon, and extends, the work of [Lenzen-Levi, 2018]. We also complement our results by providing a polynomial lower bound on the probe complexity of LCAs for graph spanners that holds even for the simpler task of computing a sparse connected subgraph with o(m) edges.
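To make the spanner objects concrete, here is the classic greedy (2k−1)-spanner construction (in the style of Althöfer et al.), which keeps an edge only if its endpoints are not already within distance 2k−1 in the spanner built so far. This is a global baseline shown only to illustrate the size/stretch trade-off; it is not the LCA of the abstract, which decides edge membership locally without materializing the whole spanner.

```python
from collections import deque

def bounded_dist(adj, s, t, limit):
    """BFS distance from s to t in adjacency dict `adj`, truncated:
    returns limit + 1 if t is farther than `limit` hops."""
    if s == t:
        return 0
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        if dist[x] >= limit:
            continue
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == t:
                    return dist[y]
                queue.append(y)
    return limit + 1

def greedy_spanner(n, edges, k):
    """Classic greedy (2k-1)-spanner: keep (u, v) only if u and v are
    currently more than 2k-1 apart in the partial spanner.  The result
    has girth > 2k, hence O(n^{1+1/k}) edges."""
    adj = {v: set() for v in range(n)}
    spanner = []
    for u, v in edges:
        if bounded_dist(adj, u, v, 2 * k - 1) > 2 * k - 1:
            adj[u].add(v)
            adj[v].add(u)
            spanner.append((u, v))
    return spanner
```

On the complete graph K4 with k = 2 (stretch 3), the greedy rule keeps only a 3-edge star centered at the first vertex.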

The Gilbert–Varshamov bound nonconstructively establishes the existence of binary codes of distance 1/2−ε/2 and rate Ω(ε^2). In a breakthrough result, Ta-Shma [STOC 2017] constructed the first explicit family of nearly optimal binary codes with distance 1/2−ε/2 and rate Ω(ε^{2+α}), where α → 0 as ε → 0. Moreover, the codes in Ta-Shma's construction are ε-balanced, where the distance between distinct codewords is not only bounded from below by 1/2−ε/2, but also from above by 1/2+ε/2. Polynomial-time decoding algorithms for (a slight modification of) Ta-Shma's codes appeared in [FOCS 2020], and were based on the Sum-of-Squares (SoS) semidefinite programming hierarchy. The running times for these algorithms were of the form N^{O_α(1)} for unique decoding, and N^{O_{ε,α}(1)} for the setting of "gentle list decoding", with large exponents of N even when α is a fixed constant. We derive new algorithms for both these tasks, running in time Õ_ε(N). Our algorithms also apply to the general setting of decoding direct-sum codes. Our algorithms follow from new structural and algorithmic results for collections of k-tuples (ordered hypergraphs) possessing a "structured expansion" property, which we call splittability. This property was previously identified and used in the analysis of SoS-based decoding and constraint-satisfaction algorithms, and is also known to be satisfied by Ta-Shma's code construction. We obtain a new weak regularity decomposition for (possibly sparse) splittable collections W ⊆ [n]^k, similar to the regularity decomposition for dense structures by Frieze and Kannan [FOCS 1996]. These decompositions are also computable in near-linear time Õ(|W|), and form a key component of our algorithmic results.