skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Improved Lower Bounds for 3-Query Matching Vector Codes
A Matching Vector (MV) family modulo a positive integer m ≥ 2 is a pair of ordered lists U = (u_1, ⋯, u_K) and V = (v_1, ⋯, v_K) where u_i, v_j ∈ ℤ_m^n with the following property: for any i ∈ [K], the inner product ⟨u_i, v_i⟩ = 0 mod m, and for any i ≠ j, ⟨u_i, v_j⟩ ≠ 0 mod m. An MV family is called r-restricted if inner products ⟨u_i, v_j⟩, for all i,j, take at most r different values. The r-restricted MV families are extremely important since the only known construction of constant-query subexponential locally decodable codes (LDCs) are based on them. Such LDCs constructed via matching vector families are called matching vector codes. Let MV(m,n) (respectively MV(m, n, r)) denote the largest K such that there exists an MV family (respectively r-restricted MV family) of size K in ℤ_m^n. Such a MV family can be transformed in a black-box manner to a good r-query locally decodable code taking messages of length K to codewords of length N = m^n. For small prime m, an almost tight bound MV(m,n) ≤ O(m^{n/2}) was first shown by Dvir, Gopalan, Yekhanin (FOCS'10, SICOMP'11), while for general m, the same paper established an upper bound of O(m^{n-1+o_m(1)}), with o_m(1) denoting a function that goes to zero when m grows. For any arbitrary constant r ≥ 3 and composite m, the best upper bound till date on MV(m,n,r) is O(m^{n/2}), is due to Bhowmick, Dvir and Lovett (STOC'13, SICOMP'14).In a breakthrough work, Alrabiah, Guruswami, Kothari and Manohar (STOC'23) implicitly improve this bound for 3-restricted families to MV(m, n, 3) ≤ O(m^{n/3}). In this work, we present an upper bound for r = 3 where MV(m,n,3) ≤ m^{n/6 +O(log n)}, and as a result, any 3-query matching vector code must have codeword length of N ≥ K^{6-o(1)}.  more » « less
Award ID(s):
2338730
PAR ID:
10572275
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Meka, Raghu
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
325
ISSN:
1868-8969
ISBN:
978-3-95977-361-4
Page Range / eLocation ID:
325-325
Subject(s) / Keyword(s):
Locally Decodable Codes Matching Vector Families Theory of computation → Error-correcting codes
Format(s):
Medium: X Size: 19 pages; 885935 bytes Other: application/pdf
Size(s):
19 pages 885935 bytes
Right(s):
Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
Sponsoring Org:
National Science Foundation
More Like this
  1. Ta-Shma, Amnon (Ed.)
    Locally Decodable Codes (LDCs) are error-correcting codes C:Σⁿ → Σ^m, encoding messages in Σⁿ to codewords in Σ^m, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length m that is super-polynomial in n, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) show how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly-improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Standard LDCs for Insdel errors were first studied by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), and are further motivated by recent advances in DNA random access bio-technologies. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries (even adaptively), over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a "phase-transition"-type behavior on the codeword length for some constant-query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with almost linear codeword length and constant query complexity, matching the parameters of the Hamming variants. On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs. 
    more » « less
  2. A code C ∶ {0,1}k → {0,1}n is a q-locally decodable code (q-LDC) if one can recover any chosen bit bi of the message b ∈ {0,1}k with good confidence by randomly querying the encoding x = C(b) on at most q coordinates. Existing constructions of 2-LDCs achieve n = exp(O(k)), and lower bounds show that this is in fact tight. However, when q = 3, far less is known: the best constructions achieve n = exp(ko(1)), while the best known results only show a quadratic lower bound n ≥ Ω(k2/log(k)) on the blocklength. In this paper, we prove a near-cubic lower bound of n ≥ Ω(k3/log6(k)) on the blocklength of 3-query LDCs. This improves on the best known prior works by a polynomial factor in k. Our proof relies on a new connection between LDCs and refuting constraint satisfaction problems with limited randomness. Our quantitative improvement builds on the new techniques for refuting semirandom instances of CSPs and, in particular, relies on bounding the spectral norm of appropriate Kikuchi matrices. 
    more » « less
  3. Meka, Raghu (Ed.)
    We prove an Ω(n / k + k) communication lower bound on (k - 1)-round distributional complexity of the k-step pointer chasing problem under uniform input distribution, improving the Ω(n/k - klog n) lower bound due to Yehudayoff (Combinatorics Probability and Computing, 2020). Our lower bound almost matches the upper bound of Õ(n/k + k) communication by Nisan and Wigderson (STOC 91). As part of our approach, we put forth gadgetless lifting, a new framework that lifts lower bounds for a family of restricted protocols into lower bounds for general protocols. A key step in gadgetless lifting is choosing the appropriate definition of restricted protocols. In this paper, our definition of restricted protocols is inspired by the structure-vs-pseudorandomness decomposition by Göös, Pitassi, and Watson (FOCS 17) and Yang and Zhang (STOC 24). Previously, round-communication trade-offs were mainly obtained by round elimination and information complexity. Both methods have some barriers in some situations, and we believe gadgetless lifting could potentially address these barriers. 
    more » « less
  4. Censor-Hillel, Keren; Grandoni, Fabrizio; Ouaknine, Joel; Puppis, Gabriele (Ed.)
    We study the problem of indexing a text T[1..n] to support pattern matching with wildcards. The input of a query is a pattern P[1..m] containing h ∈ [0, k] wildcard (a.k.a. don't care) characters and the output is the set of occurrences of P in T (i.e., starting positions of substrings of T that matches P), where k = o(log n) is fixed at index construction. A classic solution by Cole et al. [STOC 2004] provides an index with space complexity O(n ⋅ (clog n)^k/k!)) and query time O(m+2^h log log n+occ), where c > 1 is a constant, and occ denotes the number of occurrences of P in T. We introduce a new data structure that significantly reduces space usage for highly repetitive texts while maintaining efficient query processing. Its space (in words) and query time are as follows: O(δ log (n/δ)⋅ c^k (1+(log^k (δ log n))/k!)) and O((m+2^h +occ)log n)) The parameter δ, known as substring complexity, is a recently introduced measure of repetitiveness that serves as a unifying and lower-bounding metric for several popular measures, including the number of phrases in the LZ77 factorization (denoted by z) and the number of runs in the Burrows-Wheeler Transform (denoted by r). Moreover, O(δ log (n/δ)) represents the optimal space required to encode the data in terms of n and δ, helping us see how close our space is to the minimum required. In another trade-off, we match the query time of Cole et al.’s index using O(n+δ log (n/δ) ⋅ (clogδ)^{k+ε}/k!) space, where ε > 0 is an arbitrarily small constant. We also demonstrate how these techniques can be applied to a more general indexing problem, where the query pattern includes k-gaps (a gap can be interpreted as a contiguous sequence of wildcard characters). 
    more » « less
  5. Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J; Herman, Grzegorz (Ed.)
    We give the first almost-linear time algorithm for computing the maximal k-edge-connected subgraphs of an undirected unweighted graph for any constant k. More specifically, given an n-vertex m-edge graph G = (V,E) and a number k = log^o(1) n, we can deterministically compute in O(m+n^{1+o(1)}) time the unique vertex partition {V_1,… ,V_z} such that, for every i, V_i induces a k-edge-connected subgraph while every superset V'_i ⊃ V_{i} does not. Previous algorithms with linear time work only when k ≤ 2 [Tarjan SICOMP'72], otherwise they all require Ω(m+n√n) time even when k = 3 [Chechik et al. SODA'17; Forster et al. SODA'20]. Our algorithm also extends to the decremental graph setting; we can deterministically maintain the maximal k-edge-connected subgraphs of a graph undergoing edge deletions in m^{1+o(1)} total update time. Our key idea is a reduction to the dynamic algorithm supporting pairwise k-edge-connectivity queries [Jin and Sun FOCS'20]. 
    more » « less