This content will become publicly available on January 1, 2026

Title: Improved Lower Bounds for 3-Query Matching Vector Codes
A Matching Vector (MV) family modulo a positive integer m ≥ 2 is a pair of ordered lists U = (u_1, ⋯, u_K) and V = (v_1, ⋯, v_K), where u_i, v_j ∈ ℤ_m^n, with the following property: for any i ∈ [K], the inner product ⟨u_i, v_i⟩ = 0 mod m, and for any i ≠ j, ⟨u_i, v_j⟩ ≠ 0 mod m. An MV family is called r-restricted if the inner products ⟨u_i, v_j⟩, over all i, j, take at most r different values. The r-restricted MV families are extremely important since the only known constructions of constant-query subexponential locally decodable codes (LDCs) are based on them. Such LDCs constructed via matching vector families are called matching vector codes. Let MV(m,n) (respectively MV(m, n, r)) denote the largest K such that there exists an MV family (respectively r-restricted MV family) of size K in ℤ_m^n. Such an MV family can be transformed in a black-box manner into a good r-query locally decodable code taking messages of length K to codewords of length N = m^n. For small prime m, an almost tight bound MV(m,n) ≤ O(m^{n/2}) was first shown by Dvir, Gopalan, Yekhanin (FOCS'10, SICOMP'11), while for general m, the same paper established an upper bound of O(m^{n-1+o_m(1)}), with o_m(1) denoting a function that goes to zero as m grows. For any constant r ≥ 3 and composite m, the best upper bound to date on MV(m,n,r), namely O(m^{n/2}), is due to Bhowmick, Dvir and Lovett (STOC'13, SICOMP'14). In a breakthrough work, Alrabiah, Guruswami, Kothari and Manohar (STOC'23) implicitly improve this bound for 3-restricted families to MV(m, n, 3) ≤ O(m^{n/3}). In this work, we present an upper bound for r = 3, namely MV(m,n,3) ≤ m^{n/6 +O(log n)}, and as a result, any 3-query matching vector code must have codeword length N ≥ K^{6-o(1)}.
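The definition of an MV family can be checked mechanically on small instances. The following is a minimal Python sketch, not taken from the paper: the function names `inner`, `is_mv_family` and `count_inner_product_values`, and the toy family with m = 5 and n = 2, are illustrative assumptions chosen only to make the definition concrete.

```python
def inner(u, v, m):
    """Inner product of two vectors over Z_m."""
    return sum(a * b for a, b in zip(u, v)) % m


def is_mv_family(U, V, m):
    """Check the MV property: <u_i, v_i> = 0 mod m and <u_i, v_j> != 0 mod m for i != j."""
    if len(U) != len(V):
        return False
    for i, u in enumerate(U):
        for j, v in enumerate(V):
            value = inner(u, v, m)
            if i == j and value != 0:
                return False
            if i != j and value == 0:
                return False
    return True


def count_inner_product_values(U, V, m):
    """Number of distinct values of <u_i, v_j> over all i, j.

    The family is r-restricted exactly when this count is at most r.
    """
    return len({inner(u, v, m) for u in U for v in V})


# Hypothetical toy family (an assumption for illustration): m = 5, n = 2,
# u_i = (1, i) and v_j = (-j, 1), so that <u_i, v_j> = i - j mod 5.
m = 5
U = [(1, i) for i in range(m)]
V = [((-j) % m, 1) for j in range(m)]

print(is_mv_family(U, V, m))
print(count_inner_product_values(U, V, m))
```

In this toy family the inner product ⟨u_i, v_j⟩ equals i − j mod 5, which vanishes exactly when i = j, so the pair (U, V) is an MV family of size K = 5 in ℤ_5^2. The inner products take all five residues, so the family is 5-restricted but not 3-restricted; the bounds in the abstract concern families where this count is at most 3.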
Award ID(s):
2338730
PAR ID:
10572275
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Meka, Raghu
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
325
ISSN:
1868-8969
ISBN:
978-3-95977-361-4
Page Range / eLocation ID:
325-325
Subject(s) / Keyword(s):
Locally Decodable Codes Matching Vector Families Theory of computation → Error-correcting codes
Format(s):
Medium: X Size: 19 pages; 885935 bytes Other: application/pdf
Size(s):
19 pages 885935 bytes
Right(s):
Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
Sponsoring Org:
National Science Foundation
More Like this
  1. Ta-Shma, Amnon (Ed.)
    Locally Decodable Codes (LDCs) are error-correcting codes C:Σⁿ → Σ^m, encoding messages in Σⁿ to codewords in Σ^m, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length m that is super-polynomial in n, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) show how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly-improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Standard LDCs for Insdel errors were first studied by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), and are further motivated by recent advances in DNA random access bio-technologies. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries (even adaptively), over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a "phase-transition"-type behavior on the codeword length for some constant-query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with almost linear codeword length and constant query complexity, matching the parameters of the Hamming variants. 
On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the Insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs. 
  2. A code C ∶ {0,1}^k → {0,1}^n is a q-locally decodable code (q-LDC) if one can recover any chosen bit b_i of the message b ∈ {0,1}^k with good confidence by randomly querying the encoding x = C(b) on at most q coordinates. Existing constructions of 2-LDCs achieve n = exp(O(k)), and lower bounds show that this is in fact tight. However, when q = 3, far less is known: the best constructions achieve n = exp(k^{o(1)}), while the best known results only show a quadratic lower bound n ≥ Ω(k^2/log(k)) on the blocklength. In this paper, we prove a near-cubic lower bound of n ≥ Ω(k^3/log^6(k)) on the blocklength of 3-query LDCs. This improves on the best known prior works by a polynomial factor in k. Our proof relies on a new connection between LDCs and refuting constraint satisfaction problems with limited randomness. Our quantitative improvement builds on the new techniques for refuting semirandom instances of CSPs and, in particular, relies on bounding the spectral norm of appropriate Kikuchi matrices. 
  3. Censor-Hillel, Keren; Grandoni, Fabrizio; Ouaknine, Joel; Puppis, Gabriele (Ed.)
    We study the problem of indexing a text T[1..n] to support pattern matching with wildcards. The input of a query is a pattern P[1..m] containing h ∈ [0, k] wildcard (a.k.a. don't care) characters, and the output is the set of occurrences of P in T (i.e., starting positions of substrings of T that match P), where k = o(log n) is fixed at index construction. A classic solution by Cole et al. [STOC 2004] provides an index with space complexity O(n ⋅ (c log n)^k/k!) and query time O(m + 2^h log log n + occ), where c > 1 is a constant, and occ denotes the number of occurrences of P in T. We introduce a new data structure that significantly reduces space usage for highly repetitive texts while maintaining efficient query processing. Its space (in words) and query time are, respectively, O(δ log(n/δ) ⋅ c^k (1 + (log^k(δ log n))/k!)) and O((m + 2^h + occ) log n). The parameter δ, known as substring complexity, is a recently introduced measure of repetitiveness that serves as a unifying and lower-bounding metric for several popular measures, including the number of phrases in the LZ77 factorization (denoted by z) and the number of runs in the Burrows-Wheeler Transform (denoted by r). Moreover, O(δ log(n/δ)) represents the optimal space required to encode the data in terms of n and δ, helping us see how close our space is to the minimum required. In another trade-off, we match the query time of Cole et al.’s index using O(n + δ log(n/δ) ⋅ (c log δ)^{k+ε}/k!) space, where ε > 0 is an arbitrarily small constant. We also demonstrate how these techniques can be applied to a more general indexing problem, where the query pattern includes k-gaps (a gap can be interpreted as a contiguous sequence of wildcard characters). 
  4. Gørtz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J; Herman, Grzegorz (Ed.)
    We give the first almost-linear time algorithm for computing the maximal k-edge-connected subgraphs of an undirected unweighted graph for any constant k. More specifically, given an n-vertex m-edge graph G = (V,E) and a number k = log^{o(1)} n, we can deterministically compute in O(m+n^{1+o(1)}) time the unique vertex partition {V_1,… ,V_z} such that, for every i, V_i induces a k-edge-connected subgraph while every superset V'_i ⊃ V_i does not. Previous algorithms with linear time work only when k ≤ 2 [Tarjan SICOMP'72]; otherwise they all require Ω(m+n√n) time even when k = 3 [Chechik et al. SODA'17; Forster et al. SODA'20]. Our algorithm also extends to the decremental graph setting; we can deterministically maintain the maximal k-edge-connected subgraphs of a graph undergoing edge deletions in m^{1+o(1)} total update time. Our key idea is a reduction to the dynamic algorithm supporting pairwise k-edge-connectivity queries [Jin and Sun FOCS'20]. 
  5. We describe a randomized algorithm for producing a near-optimal hierarchical off-diagonal low-rank (HODLR) approximation to an n × n matrix A, accessible only through matrix-vector products with A and A^T. We prove that, for the rank-k HODLR approximation problem, our method achieves a (1 + β)^{log(n)}-optimal approximation in expected Frobenius norm using O(k log(n)/β^3) matrix-vector products. In particular, the algorithm obtains a (1 + ε)-optimal approximation with O(k log^4(n)/ε^3) matrix-vector products, and for any constant c, an n^c-optimal approximation with O(k log(n)) matrix-vector products. Apart from matrix-vector products, the additional computational cost of our method is just O(n poly(log(n), k, β)). We complement the upper bound with a lower bound, which shows that any matrix-vector query algorithm requires at least Ω(k log(n) + k/ε) queries to obtain a (1 + ε)-optimal approximation. Our algorithm can be viewed as a robust version of widely used “peeling” methods for recovering HODLR matrices and is, to the best of our knowledge, the first matrix-vector query algorithm to enjoy theoretical worst-case guarantees for approximation by any hierarchical matrix class. To control the propagation of error between levels of hierarchical approximation, we introduce a new perturbation bound for low-rank approximation, which shows that the widely used Generalized Nyström method enjoys inherent stability when implemented with noisy matrix-vector products. We also introduce a novel randomly perforated matrix sketching method to further control the error in the peeling algorithm. 