skip to main content

This content will become publicly available on July 1, 2024

Title: A Near-Cubic Lower Bound for 3-Query Locally Decodable Codes from Semirandom CSP Refutation
A code C ∶ {0,1}k → {0,1}n is a q-locally decodable code (q-LDC) if one can recover any chosen bit bi of the message b ∈ {0,1}k with good confidence by randomly querying the encoding x = C(b) on at most q coordinates. Existing constructions of 2-LDCs achieve n = exp(O(k)), and lower bounds show that this is in fact tight. However, when q = 3, far less is known: the best constructions achieve n = exp(ko(1)), while the best known results only show a quadratic lower bound n ≥ Ω(k2/log(k)) on the blocklength. In this paper, we prove a near-cubic lower bound of n ≥ Ω(k3/log6(k)) on the blocklength of 3-query LDCs. This improves on the best known prior works by a polynomial factor in k. Our proof relies on a new connection between LDCs and refuting constraint satisfaction problems with limited randomness. Our quantitative improvement builds on the new techniques for refuting semirandom instances of CSPs and, in particular, relies on bounding the spectral norm of appropriate Kikuchi matrices.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ACM Symposium on Theory of Computing, STOC
Page Range / eLocation ID:
1438 to 1448
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Locally Decodable Codes (LDCs) are error-correcting codes for which individual message symbols can be quickly recovered despite errors in the codeword. LDCs for Hamming errors have been studied extensively in the past few decades, where a major goal is to understand the amount of redundancy that is necessary and sufficient to decode from large amounts of error, with small query complexity. Despite exciting progress, we still don't have satisfactory answers in several important parameter regimes. For example, in the case of 3-query LDCs, the gap between existing constructions and lower bounds is superpolynomial in the message length. In this work we study LDCs for insertion and deletion errors, called Insdel LDCs. Their study was initiated by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), who gave a reduction from Hamming LDCs to Insdel LDCs with a small blowup in the code parameters. On the other hand, the only known lower bounds for Insdel LDCs come from those for Hamming LDCs, thus there is no separation between them. Here we prove new, strong lower bounds for the existence of Insdel LDCs. In particular, we show that 2-query linear Insdel LDCs do not exist, and give an exponential lower bound for the length of all q-query Insdel LDCs with constant q. For q ≥ 3 our bounds are exponential in the existing lower bounds for Hamming LDCs. Furthermore, our exponential lower bounds continue to hold for adaptive decoders, and even in private-key settings where the encoder and decoder share secret randomness. This exhibits a strict separation between Hamming LDCs and Insdel LDCs. Our strong lower bounds also hold for the related notion of Insdel LCCs (except in the private-key setting), due to an analogue to the Insdel notions of a reduction from Hamming LCCs to LDCs. Our techniques are based on a delicate design and analysis of hard distributions of insertion and deletion errors, which depart significantly from typical techniques used in analyzing Hamming LDCs. 
    more » « less
  2. Dictionaries remain the most well studied class of data structures. A dictionary supports insertions, deletions, membership queries, and usually successor, predecessor, and extract-min. In a RAM, all such operations take O(log n) time on n elements. Dictionaries are often cross-referenced as follows. Consider a set of tuples {〈ai,bi,ci…〉}. A database might include more than one dictionary on such a set, for example, one indexed on the a ‘s, another on the b‘s, and so on. Once again, in a RAM, inserting into a set of L cross-referenced dictionaries takes O(L log n) time, as does deleting. The situation is more interesting in external memory. On a Disk Access Machine (DAM), B-trees achieve O(logB N) I/Os for insertions and deletions on a single dictionary and K-element range queries take optimal O(logB N + K/B) I/Os. These bounds are also achievable by a B-tree on cross-referenced dictionaries, with a slowdown of an L factor on insertion and deletions. In recent years, both the theory and practice of external- memory dictionaries has been revolutionized by write- optimization techniques. A dictionary is write optimized if it is close to a B-tree for query time while beating B-trees on insertions. The best (and optimal) dictionaries achieve a substantially improved insertion and deletion cost of amortized I/Os on a single dictionary while maintaining optimal O(log1+B∊ N + K/B)- I/O range queries. Although write optimization still helps for insertions into cross-referenced dictionaries, its value for deletions would seem to be greatly reduced. A deletion into a cross- referenced dictionary only specifies a key a. It seems to be necessary to look up the associated values b, c … in order to delete them from the other dictionaries. This takes Ω(logB N) I/Os, well above the per-dictionary write-optimization budget of So the total deletion cost is In short, for deletions, write optimization offers an advantage over B-trees in that L multiplies a lower order term, but when L = 2, write optimization seems to offer no asymptotic advantage over B-trees. That is, no known query- optimal solution for pairs of cross-referenced dictionaries seem to beat B-trees for deletions. In this paper, we show a lower bound establishing that a pair of cross-referenced dictionaries that are optimal for range queries and that supports deletions cannot match the write optimization bound available to insert-only dictionaries. This result thus establishes a limit to the applicability of write-optimization techniques on which many new databases and file systems are based. Read More: 
    more » « less
  3. Ta-Shma, Amnon (Ed.)
    Locally Decodable Codes (LDCs) are error-correcting codes C:Σⁿ → Σ^m, encoding messages in Σⁿ to codewords in Σ^m, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length m that is super-polynomial in n, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) show how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly-improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Standard LDCs for Insdel errors were first studied by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), and are further motivated by recent advances in DNA random access bio-technologies. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries (even adaptively), over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a "phase-transition"-type behavior on the codeword length for some constant-query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with almost linear codeword length and constant query complexity, matching the parameters of the Hamming variants. On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs. 
    more » « less
  4. We present new constructions of pseudorandom generators (PRGs) for two of the most widely studied non-uniform circuit classes in complexity theory. Our main result is a construction of the first non-trivial PRG for linear threshold (LTF) circuits of arbitrary constant depth and super-linear size. This PRG fools circuits with depth d∈N and n1+δ wires, where δ=2−O(d) , using seed length O(n1−δ) and with error 2−nδ . This tightly matches the best known lower bounds for this circuit class. As a consequence of our result, all the known hardness for LTF circuits has now effectively been translated into pseudorandomness. This brings the extensive effort in the last decade to construct PRGs and deterministic circuit-analysis algorithms for this class to the point where any subsequent improvement would yield breakthrough lower bounds. Our second contribution is a PRG for De Morgan formulas of size s whose seed length is s1/3+o(1)⋅polylog(1/ϵ) for error ϵ . In particular, our PRG can fool formulas of sub-cubic size s=n3−Ω(1) with an exponentially small error ϵ=exp(−nΩ(1)) . This significantly improves the inverse-polynomial error of the previous state-of-the-art for such formulas by Impagliazzo, Meka, and Zuckerman (FOCS 2012, JACM 2019), and again tightly matches the best currently-known lower bounds for this class. In both settings, a key ingredient in our constructions is a pseudorandom restriction procedure that has tiny failure probability, but simplifies the function to a non-natural “hybrid computational model” that combines several computational models. 
    more » « less
  5. The best known solutions for k-message broadcast in dynamic networks of size n require Ω(nk) rounds. In this paper, we see if these bounds can be improved by smoothed analysis. To do so, we study perhaps the most natural randomized algorithm for disseminating tokens in this setting: at every time step, choose a token to broadcast randomly from the set of tokens you know. We show that with even a small amount of smoothing (i.e., one random edge added per round), this natural strategy solves k-message broadcast in Õ(n+k³) rounds, with high probability, beating the best known bounds for k = o(√n) and matching the Ω(n+k) lower bound for static networks for k = O(n^{1/3}) (ignoring logarithmic factors). In fact, the main result we show is even stronger and more general: given 𝓁-smoothing (i.e., 𝓁 random edges added per round), this simple strategy terminates in O(kn^{2/3}log^{1/3}(n)𝓁^{-1/3}) rounds. We then prove this analysis close to tight with an almost-matching lower bound. To better understand the impact of smoothing on information spreading, we next turn our attention to static networks, proving a tight bound of Õ(k√n) rounds to solve k-message broadcast, which is better than what our strategy can achieve in the dynamic setting. This confirms the intuition that although smoothed analysis reduces the difficulties induced by changing graph structures, it does not eliminate them altogether. Finally, we apply tools developed to support our smoothed analysis to prove an optimal result for k-message broadcast in so-called well-mixed networks in the absence of smoothing. By comparing this result to an existing lower bound for well-mixed networks, we establish a formal separation between oblivious and strongly adaptive adversaries with respect to well-mixed token spreading, partially resolving an open question on the impact of adversary strength on the k-message broadcast problem. 
    more » « less