Linear Sketching over F_2
We initiate a systematic study of linear sketching over F_2. For a given Boolean function treated as f : F_2^n -> F_2, a randomized F_2-sketch is a distribution M over d x n matrices with entries in F_2 such that Mx suffices for computing f(x) with high probability. Such sketches with d << n can be used to design small-space distributed and streaming algorithms. Motivated by these applications, we study a connection between F_2-sketching and a two-player one-way communication game for the corresponding XOR-function. We conjecture that F_2-sketching is optimal for this communication game. Our results confirm this conjecture for multiple important classes of functions: 1) low-degree F_2-polynomials, 2) functions with sparse Fourier spectrum, 3) most symmetric functions, 4) the recursive majority function. These results rely on a new structural theorem showing that F_2-sketching is optimal (up to constant factors) for uniformly distributed inputs. Furthermore, we show that (non-uniform) streaming algorithms that have to process random updates over F_2 can be constructed as F_2-sketches for the uniform distribution. In contrast with the previous work of Li, Nguyen and Woodruff (STOC'14), who show an analogous result for linear sketches over integers in the adversarial setting, our result does not require the stream length to be triply exponential in n and holds for streams of length Õ(n) constructed through uniformly random updates.
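To make the definition concrete, here is a minimal illustration in Python. It is not taken from the paper. It assumes the simple function f(x) = OR(x) (is x nonzero?) and draws M with uniformly random rows, a textbook choice for which Mx = 0 happens with probability 2^-d when x is nonzero. All function names are hypothetical. The sketch also shows why linearity matters: Mx XOR My = M(x XOR y), so a party holding y can evaluate f(x XOR y) from the message Mx alone, and a stream of coordinate flips can be processed by XOR-ing columns of M into a d-bit state.

```python
import random

def random_sketch_matrix(d, n, rng):
    """Draw M as d rows, each a uniformly random n-bit mask (a d x n matrix over F_2)."""
    return [rng.getrandbits(n) for _ in range(d)]

def apply_sketch(rows, x):
    """Compute Mx over F_2; x is an n-bit integer, entry i is <row_i, x> mod 2."""
    return [bin(row & x).count("1") % 2 for row in rows]

def xor_vectors(a, b):
    """Add two sketches coordinate-wise over F_2."""
    return [u ^ v for u, v in zip(a, b)]

def or_from_sketch(sketch):
    """Guess OR(x) from Mx. Always correct for x = 0; errs with probability 2^-d otherwise."""
    return int(any(sketch))

def stream_sketch(rows, updates):
    """Maintain Mx under a stream of updates, each flipping one coordinate i of x."""
    sketch = [0] * len(rows)
    for i in updates:
        column = [(row >> i) & 1 for row in rows]  # column i of M
        sketch = xor_vectors(sketch, column)
    return sketch

# Illustrative parameters (assumed, not from the source): n = 1000 bits, d = 64 rows.
rng = random.Random(0)
n, d = 1000, 64
M = random_sketch_matrix(d, n, rng)
x = rng.getrandbits(n)
y = rng.getrandbits(n)

# One-way communication for the XOR-function f(x XOR y):
# Alice sends Mx (d bits); Bob adds My and decodes f from M(x XOR y).
message = apply_sketch(M, x)
bob_view = xor_vectors(message, apply_sketch(M, y))
assert bob_view == apply_sketch(M, x ^ y)
answer = or_from_sketch(bob_view)  # 1 means "x and y differ" with high probability
```

The only state kept by `stream_sketch` is the d-bit vector, which is the sense in which d << n yields a small-space algorithm, apart from the randomness needed to represent M.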
- Award ID(s): 1733794
- PAR ID: 10108003
- Date Published:
- Journal Name: Leibniz international proceedings in informatics
- Volume: 102
- ISSN: 1868-8969
- Page Range / eLocation ID: 8:1 - 8:37
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Chakrabarti, Amit; Swamy, Chaitanya (Ed.) A Boolean maximum constraint satisfaction problem, Max-CSP(f), is specified by a predicate f:{-1,1}^k → {0,1}. An n-variable instance of Max-CSP(f) consists of a list of constraints, each of which applies f to k distinct literals drawn from the n variables. For k = 2, Chou, Golovnev, and Velusamy [Chou et al., 2020] obtained explicit ratios characterizing the √n-space streaming approximability of every predicate. For k ≥ 3, Chou, Golovnev, Sudan, and Velusamy [Chou et al., 2022] proved a general dichotomy theorem for √n-space sketching algorithms: For every f, there exists α(f) ∈ (0,1] such that for every ε > 0, Max-CSP(f) is (α(f)-ε)-approximable by an O(log n)-space linear sketching algorithm, but (α(f)+ε)-approximation sketching algorithms require Ω(√n) space. In this work, we give closed-form expressions for the sketching approximation ratios of multiple families of symmetric Boolean functions. Letting α'_k = 2^{-(k-1)} (1-k^{-2})^{(k-1)/2}, we show that for odd k ≥ 3, α(kAND) = α'_k, and for even k ≥ 2, α(kAND) = 2α'_{k+1}. Thus, for every k, kAND can be (2-o(1))2^{-k}-approximated by O(log n)-space sketching algorithms; we contrast this with a lower bound of Chou, Golovnev, Sudan, Velingker, and Velusamy [Chou et al., 2022] implying that streaming (2+ε)2^{-k}-approximations require Ω(n) space! We also resolve the ratio for the "at-least-(k-1)-1’s" function for all even k; the "exactly-(k+1)/2-1’s" function for odd k ∈ {3,…,51}; and fifteen other functions. We stress here that for general f, the dichotomy theorem in [Chou et al., 2022] only implies that α(f) can be computed to arbitrary precision in PSPACE, and thus closed-form expressions need not have existed a priori. Our analyses involve identifying and exploiting structural "saddle-point" properties of this dichotomy. Separately, for all threshold functions, we give optimal "bias-based" approximation algorithms generalizing [Chou et al., 2020] while simplifying [Chou et al., 2022]. Finally, we investigate the √n-space streaming lower bounds in [Chou et al., 2022], and show that they are incomplete for 3AND, i.e., they fail to rule out (α(3AND)-ε)-approximations in o(√n) space.
- Meka, Raghu (Ed.) In the d-dimensional turnstile streaming model, a frequency vector 𝐱 = (𝐱(1),…,𝐱(n)) ∈ (ℝ^d)ⁿ is updated entry-wise over a stream. We consider the problem of f-moment estimation, in which one wants to estimate f(𝐱)=∑_{v ∈ [n]}f(𝐱(v)) with a small-space sketch. A function f is tractable if the f-moment can be estimated to within a constant factor using polylog(n) space. The f-moment estimation problem has been intensively studied in the d = 1 case. Flajolet and Martin estimate the F₀-moment (f(x) = 1 (x > 0), incremental stream); Alon, Matias, and Szegedy estimate the L₂-moment (f(x) = x²); Indyk estimates the L_α-moment (f(x) = |x|^α), α ∈ (0,2]. For d ≥ 2, Ganguly, Bansal, and Dube estimate the L_{p,q} hybrid moment (f:ℝ^d → ℝ,f(x) = (∑_{j = 1}^d |x_j|^p)^q), p ∈ (0,2],q ∈ (0,1). For tractability, Bar-Yossef, Jayram, Kumar, and Sivakumar show that f(x) = |x|^α is not tractable for α > 2. Braverman, Chestnut, Woodruff, and Yang characterize the class of tractable one-variable functions except for a class of nearly periodic functions. In this work we present a simple and generic scheme to construct sketches with the novel idea of hashing indices to Lévy processes, from which one can estimate the f-moment f(𝐱) where f is the characteristic exponent of the Lévy process. The fundamental Lévy-Khintchine representation theorem completely characterizes the space of all possible characteristic exponents, which in turn characterizes the set of f-moments that can be estimated by this generic scheme. The new scheme has strong explanatory power. It unifies the construction of many existing sketches (F₀, L₀, L₂, L_α, L_{p,q}, etc.) and it implies the tractability of many nearly periodic functions that were previously unclassified. Furthermore, the scheme can be conveniently generalized to multidimensional cases (d ≥ 2) by considering multidimensional Lévy processes and can be further generalized to estimate heterogeneous moments by projecting different indices with different Lévy processes. We conjecture that the set of tractable functions can be characterized using the Lévy-Khintchine representation theorem via what we call the Fourier-Hahn-Lévy method.
- This work studies the problem of finding a large affine subspace over the field F_2 such that a bounded function's nontrivial Fourier coefficients become small. We show that for any function f from F_2^n to [-1,1] with Fourier degree d, there exists an affine subspace of dimension at least roughly n^{1/d!}k^{-2}, wherein all of f's nontrivial Fourier coefficients become smaller than 2^{-k}. To complement this result, we show the existence of degree d functions with coefficients larger than 2^{-d log n} on any subspace of dimension larger than Omega(dn^{1/(d-1)}). In addition, we give explicit examples of functions with analogous but weaker properties. Along the way, we provide multiple characterizations of the Fourier coefficients of functions restricted to subspaces of F_2^n that may be useful in other contexts. Finally, we highlight applications and connections of our results to parity kill number and affine dispersers/extractors.
- Let f: {0, 1}^n → {0, 1} be a boolean function, and let f_∧(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bit-wise AND. We study the deterministic communication complexity of f_∧ and show that, up to a log n factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f_∧. This comes within a log n factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F_2-degree. Our techniques can also be used to prove (within a log n factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f_∧ is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}^n → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials then the set system of the monomials has a small hitting set, of size poly-logarithmic in its sparsity. We also establish extensions of this result to multi-linear polynomials f: {0, 1}^n → ℝ with a larger range.