Locally Decodable Codes (LDCs) are error-correcting codes for which individual message symbols can be quickly recovered despite errors in the codeword. LDCs for Hamming errors have been studied extensively in the past few decades, where a major goal is to understand the amount of redundancy that is necessary and sufficient to decode from large amounts of error with small query complexity. Despite exciting progress, we still lack satisfactory answers in several important parameter regimes. For example, in the case of 3-query LDCs, the gap between existing constructions and lower bounds is superpolynomial in the message length. In this work we study LDCs for insertion and deletion errors, called Insdel LDCs. Their study was initiated by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), who gave a reduction from Hamming LDCs to Insdel LDCs with a small blowup in the code parameters. On the other hand, the only known lower bounds for Insdel LDCs come from those for Hamming LDCs, so there is no known separation between them. Here we prove new, strong lower bounds for the existence of Insdel LDCs. In particular, we show that 2-query linear Insdel LDCs do not exist, and we give an exponential lower bound on the length of all q-query Insdel LDCs with constant q. For q ≥ 3, our bounds are exponential in the existing lower bounds for Hamming LDCs. Furthermore, our exponential lower bounds continue to hold for adaptive decoders, and even in private-key settings where the encoder and decoder share secret randomness. This exhibits a strict separation between Hamming LDCs and Insdel LDCs. Our strong lower bounds also hold for the related notion of Insdel LCCs (except in the private-key setting), via an Insdel analogue of the reduction from Hamming LCCs to Hamming LDCs.
Our techniques are based on a delicate design and analysis of hard distributions of insertion and deletion errors, which depart significantly from typical techniques used in analyzing Hamming LDCs.
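For intuition about why insdel errors are harder than Hamming errors: a single insertion or deletion shifts every subsequent position, so a decoder's queries can land on misaligned coordinates. The following toy Python sketch of a random insdel channel (the helper name `apply_insdel` is hypothetical, not from the paper) illustrates this position-shifting behavior:

```python
import random

def apply_insdel(codeword, num_ops, alphabet="01", rng=random):
    """Apply `num_ops` random insertions/deletions to a string codeword.
    Unlike a Hamming error (a symbol flip in place), each operation here
    shifts every later position of the codeword by one."""
    s = list(codeword)
    for _ in range(num_ops):
        if s and rng.random() < 0.5:
            # deletion: remove a uniformly random position
            del s[rng.randrange(len(s))]
        else:
            # insertion: place a random symbol at a random position
            s.insert(rng.randrange(len(s) + 1), rng.choice(alphabet))
    return "".join(s)
```

Each operation changes the codeword length by one, so after `num_ops` operations the length differs from the original by at most `num_ops`; this misalignment is what the hard error distributions in the paper exploit.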
This content will become publicly available on July 1, 2024
A Near-Cubic Lower Bound for 3-Query Locally Decodable Codes from Semirandom CSP Refutation
A code C : {0,1}^k → {0,1}^n is a q-locally decodable code (q-LDC) if one can recover any chosen bit b_i of the message b ∈ {0,1}^k with good confidence by randomly querying the encoding x = C(b) on at most q coordinates. Existing constructions of 2-LDCs achieve n = exp(O(k)), and lower bounds show that this is in fact tight. However, when q = 3, far less is known: the best constructions achieve n = exp(k^{o(1)}), while the best known results only show a quadratic lower bound n ≥ Ω(k²/log(k)) on the blocklength.
In this paper, we prove a near-cubic lower bound of n ≥ Ω(k³/log⁶(k)) on the blocklength of 3-query LDCs. This improves on the best known prior works by a polynomial factor in k. Our proof relies on a new connection between LDCs and refuting constraint satisfaction problems with limited randomness. Our quantitative improvement builds on the new techniques for refuting semirandom instances of CSPs and, in particular, relies on bounding the spectral norm of appropriate Kikuchi matrices.
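The exp(O(k))-length 2-LDC mentioned above is achieved by the classical Hadamard code. A minimal Python sketch of its encoder and 2-query decoder (function names are illustrative, not from the paper):

```python
import random

def hadamard_encode(b):
    """Encode b in {0,1}^k as the 2^k inner products <b, x> mod 2,
    one per subset x of [k]. Blocklength is exponential in k."""
    k = len(b)
    return [sum(b[j] * ((x >> j) & 1) for j in range(k)) % 2
            for x in range(2 ** k)]

def decode_bit(codeword, k, i, rng=random):
    """Recover message bit i with 2 queries: <b,x> XOR <b,x+e_i> = b_i.
    If a delta fraction of the codeword is corrupted, both queried
    positions are clean with probability >= 1 - 2*delta, so any
    delta < 1/4 leaves a majority vote correct."""
    x = rng.randrange(2 ** k)
    return codeword[x] ^ codeword[x ^ (1 << i)]
```

Because x is uniform, each individual query is uniformly distributed over the codeword, which is exactly the smoothness property that the lower-bound arguments for q-LDCs exploit.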
Award ID(s): 2211971
NSF-PAR ID: 10435079
Date Published:
Journal Name: ACM Symposium on Theory of Computing, STOC
Issue: 2023
Page Range / eLocation ID: 1438 to 1448
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this


Dictionaries remain the most well-studied class of data structures. A dictionary supports insertions, deletions, membership queries, and usually successor, predecessor, and extract-min. In a RAM, all such operations take O(log n) time on n elements. Dictionaries are often cross-referenced as follows. Consider a set of tuples {⟨a_i, b_i, c_i, …⟩}. A database might include more than one dictionary on such a set, for example, one indexed on the a's, another on the b's, and so on. Once again, in a RAM, inserting into a set of L cross-referenced dictionaries takes O(L log n) time, as does deleting. The situation is more interesting in external memory. On a Disk Access Machine (DAM), B-trees achieve O(log_B N) I/Os for insertions and deletions on a single dictionary, and K-element range queries take optimal O(log_B N + K/B) I/Os. These bounds are also achievable by a B-tree on cross-referenced dictionaries, with a slowdown of a factor of L on insertions and deletions. In recent years, both the theory and practice of external-memory dictionaries have been revolutionized by write-optimization techniques. A dictionary is write optimized if it is close to a B-tree for query time while beating B-trees on insertions. The best (and optimal) dictionaries achieve a substantially improved amortized insertion and deletion cost on a single dictionary while maintaining optimal O(log_{1+B^ε} N + K/B)-I/O range queries. Although write optimization still helps for insertions into cross-referenced dictionaries, its value for deletions would seem to be greatly reduced. A deletion into a cross-referenced dictionary only specifies a key a. It seems to be necessary to look up the associated values b, c, … in order to delete them from the other dictionaries.
This lookup takes Ω(log_B N) I/Os, well above the per-dictionary write-optimization budget, and so it dominates the total deletion cost. In short, for deletions, write optimization offers an advantage over B-trees in that L multiplies only a lower-order term, but even when L = 2, write optimization seems to offer no asymptotic advantage over B-trees. That is, no known query-optimal solution for pairs of cross-referenced dictionaries seems to beat B-trees for deletions. In this paper, we show a lower bound establishing that a pair of cross-referenced dictionaries that is optimal for range queries and that supports deletions cannot match the write-optimization bound available to insert-only dictionaries. This result thus establishes a limit to the applicability of the write-optimization techniques on which many new databases and file systems are based. Read More: http://epubs.siam.org/doi/10.1137/1.9781611974782.99
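The deletion obstacle described above can be made concrete with a toy in-memory sketch (class and method names are hypothetical; plain Python dicts stand in for the external-memory dictionaries):

```python
class CrossReferencedPair:
    """Two dictionaries over pairs (a, b): one indexed on a, one on b."""

    def __init__(self):
        self.by_a = {}  # a -> b
        self.by_b = {}  # b -> a

    def insert(self, a, b):
        # An insert carries both keys, so it is a blind write into both
        # indexes; this is where write optimization helps.
        self.by_a[a] = b
        self.by_b[b] = a

    def delete(self, a):
        # A delete specifies only the key a. Before the pair can be removed
        # from by_b, the associated value b must first be looked up; on a
        # DAM this lookup costs Omega(log_B N) I/Os, which is what defeats
        # write optimization for cross-referenced deletions.
        b = self.by_a.pop(a)
        del self.by_b[b]
```

The asymmetry is visible in the code: `insert` never reads either index, while `delete` cannot avoid the read of `by_a`.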

Ta-Shma, Amnon (Ed.) Locally Decodable Codes (LDCs) are error-correcting codes C : Σ^n → Σ^m, encoding messages in Σ^n to codewords in Σ^m, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length m that is super-polynomial in n, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) showed how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost-linear codeword length over the binary alphabet, and used them to obtain significantly improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Standard LDCs for Insdel errors were first studied by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), and are further motivated by recent advances in DNA random-access biotechnologies. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries (even adaptively), over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a "phase-transition"-type behavior on the codeword length for some constant query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with almost-linear codeword length and constant query complexity, matching the parameters of the Hamming variants.
On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the Insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs.

The best known solutions for k-message broadcast in dynamic networks of size n require Ω(nk) rounds. In this paper, we see if these bounds can be improved by smoothed analysis. To do so, we study perhaps the most natural randomized algorithm for disseminating tokens in this setting: at every time step, choose a token to broadcast randomly from the set of tokens you know. We show that with even a small amount of smoothing (i.e., one random edge added per round), this natural strategy solves k-message broadcast in Õ(n+k³) rounds, with high probability, beating the best known bounds for k = o(√n) and matching the Ω(n+k) lower bound for static networks for k = O(n^{1/3}) (ignoring logarithmic factors). In fact, the main result we show is even stronger and more general: given 𝓁-smoothing (i.e., 𝓁 random edges added per round), this simple strategy terminates in O(kn^{2/3}log^{1/3}(n)𝓁^{1/3}) rounds. We then prove this analysis close to tight with an almost-matching lower bound. To better understand the impact of smoothing on information spreading, we next turn our attention to static networks, proving a tight bound of Õ(k√n) rounds to solve k-message broadcast, which is better than what our strategy can achieve in the dynamic setting. This confirms the intuition that although smoothed analysis reduces the difficulties induced by changing graph structures, it does not eliminate them altogether. Finally, we apply tools developed to support our smoothed analysis to prove an optimal result for k-message broadcast in so-called well-mixed networks in the absence of smoothing. By comparing this result to an existing lower bound for well-mixed networks, we establish a formal separation between oblivious and strongly adaptive adversaries with respect to well-mixed token spreading, partially resolving an open question on the impact of adversary strength on the k-message broadcast problem.
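One round of the "broadcast a uniformly random known token" strategy with 𝓁-smoothing can be sketched as follows (a toy simulation under assumed conventions: `adj` is this round's adjacency list, `known` maps each node to the set of tokens it knows; names are illustrative, not from the paper):

```python
import random

def broadcast_round(adj, known, smoothing=1, rng=random):
    """One round: add `smoothing` uniformly random extra edges to this
    round's graph, then every informed node broadcasts one token chosen
    uniformly at random from the tokens it currently knows."""
    n = len(adj)
    edges = [(u, v) for u in range(n) for v in adj[u] if u < v]
    # Smoothed analysis: the adversary picks adj, nature adds random edges.
    edges += [tuple(rng.sample(range(n), 2)) for _ in range(smoothing)]
    # Each informed node commits to one random token before delivery.
    sent = {u: rng.choice(sorted(known[u])) for u in range(n) if known[u]}
    for u, v in edges:
        if u in sent:
            known[v].add(sent[u])
        if v in sent:
            known[u].add(sent[v])
```

In a dynamic network the adversary supplies a fresh `adj` each round; the analysis in the paper bounds how many such rounds are needed before every node's `known` set contains all k tokens.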

We present new constructions of pseudorandom generators (PRGs) for two of the most widely studied non-uniform circuit classes in complexity theory. Our main result is a construction of the first non-trivial PRG for linear threshold (LTF) circuits of arbitrary constant depth and super-linear size. This PRG fools circuits with depth d ∈ N and n^{1+δ} wires, where δ = 2^{−O(d)}, using seed length O(n^{1−δ}) and with error 2^{−n^δ}. This tightly matches the best known lower bounds for this circuit class. As a consequence of our result, all the known hardness for LTF circuits has now effectively been translated into pseudorandomness. This brings the extensive effort of the last decade to construct PRGs and deterministic circuit-analysis algorithms for this class to the point where any subsequent improvement would yield breakthrough lower bounds. Our second contribution is a PRG for De Morgan formulas of size s whose seed length is s^{1/3+o(1)} · polylog(1/ε) for error ε. In particular, our PRG can fool formulas of sub-cubic size s = n^{3−Ω(1)} with an exponentially small error ε = exp(−n^{Ω(1)}). This significantly improves the inverse-polynomial error of the previous state of the art for such formulas by Impagliazzo, Meka, and Zuckerman (FOCS 2012, JACM 2019), and again tightly matches the best currently known lower bounds for this class. In both settings, a key ingredient in our constructions is a pseudorandom restriction procedure that has tiny failure probability, but simplifies the function to a non-natural "hybrid computational model" that combines several computational models.