Code Sparsification and its Applications

Khanna, Sanjeev; Putterman, Aaron Louie; Sudan, Madhu

We introduce a notion of code sparsification that generalizes the notion of cut sparsification in graphs. For a (linear) code C ⊆ 𝔽nq of dimension k a (1 ± ɛ)-sparsification of size s is given by a weighted set S ⊆ [n] with |S| ≤ s such that for every codeword c ∈ C the projection c|s of c to the set S has (weighted) hamming weight which is a (1 ± ɛ) approximation of the hamming weight of c. We show that for every code there exists a (1 ± ɛ)-sparsification of size s = Õ(k log(q)/ɛ2). This immediately implies known results on graph and hypergraph cut sparsification up to polylogarithmic factors (with a simple unified proof) — the former follows from the well-known fact that cuts in a graph form a linear code over 𝔽2, while the latter is obtained by a simple encoding of hypergraph cuts. Further, by connections between the eigenvalues of the Laplacians of Cayley graphs over to the weights of codewords, we also give the first proof of the existence of spectral Cayley graph sparsifiers over by Cayley graphs, i.e., where we sparsify the set of generators to nearly-optimal size. Additionally, this work can be viewed as a continuation of a line of works on building sparsifiers for constraint satisfaction problems (CSPs); this result shows that there exist near-linear size sparsifiers for CSPs over 𝔽p-valued variables whose unsatisfying assignments can be expressed as the zeros of a linear equation modulo a prime p. As an application we give a full characterization of ternary Boolean CSPs (CSPs where the underlying predicate acts on three Boolean variables) that allow for near-linear size sparsification. This makes progress on a question posed by Kogan and Krauthgamer (ITCS 2015) asking which CSPs allow for near-linear size sparsifiers (in the number of variables). At the heart of our result is a codeword counting bound that we believe is of independent interest. Indeed, extending Karger's cut-counting bound (SODA 1993), we show a novel decomposition theorem of linear codes: we show that every linear code has a (relatively) small subset of coordinates such that after deleting those coordinates, the code on the remaining coordinates has a smooth upper bound on the number of codewords of small weight. Using the deleted coordinates in addition to a (weighted) random sample of the remaining coordinates now allows us to sparsify the whole code. The proof of this decomposition theorem extends Karger's proof (and the contraction method) in a clean way, while enabling the extensions listed above without any additional complexity in the proofs.

More Like this