We initiate a systematic study of linear sketching over $$\ftwo$$. For a given Boolean function treated as $$f \colon \ftwo^n \to \ftwo$$ a randomized $$\ftwo$$-sketch is a distribution $$\mathcal M$$ over $$d \times n$$ matrices with elements over $$\ftwo$$ such that $$\mathcal Mx$$ suffices for computing $f(x)$ with high probability. Such sketches for $$d \ll n$$ can be used to design small-space distributed and streaming algorithms. Motivated by these applications we study a connection between $$\ftwo$$-sketching and a two-player one-way communication game for the corresponding XOR-function. We conjecture that $$\ftwo$$-sketching is optimal for this communication game. Our results confirm this conjecture for multiple important classes of functions: 1) low-degree $$\ftwo$$-polynomials, 2) functions with sparse Fourier spectrum, 3) most symmetric functions, 4) recursive majority function. These results rely on a new structural theorem that shows that $$\ftwo$$-sketching is optimal (up to constant factors) for uniformly distributed inputs. Furthermore, we show that (non-uniform) streaming algorithms that have to process random updates over $$\ftwo$$ can be constructed as $$\ftwo$$-sketches for the uniform distribution. In contrast with the previous work of Li, Nguyen and Woodruff (STOC'14) who show an analogous result for linear sketches over integers in the adversarial setting our result does not require the stream length to be triply exponential in $$n$$ and holds for streams of length $$\tilde O(n)$$ constructed through uniformly random updates.
more »
« less
Linear Sketching over F_2
We initiate a systematic study of linear sketching over F_2. For a given Boolean function treated as f : F_2^n -> F_2 a randomized F_2-sketch is a distribution M over d x n matrices with elements over F_2 such that Mx suffices for computing f(x) with high probability. Such sketches for d << n can be used to design small-space distributed and streaming algorithms. Motivated by these applications we study a connection between F_2-sketching and a two-player one-way communication game for the corresponding XOR-function. We conjecture that F_2-sketching is optimal for this communication game. Our results confirm this conjecture for multiple important classes of functions: 1) low-degree F_2-polynomials, 2) functions with sparse Fourier spectrum, 3) most symmetric functions, 4) recursive majority function. These results rely on a new structural theorem that shows that F_2-sketching is optimal (up to constant factors) for uniformly distributed inputs. Furthermore, we show that (non-uniform) streaming algorithms that have to process random updates over F_2 can be constructed as F_2-sketches for the uniform distribution. In contrast with the previous work of Li, Nguyen and Woodruff (STOC'14) who show an analogous result for linear sketches over integers in the adversarial setting our result does not require the stream length to be triply exponential in n and holds for streams of length O(n) constructed through uniformly random updates.
more »
« less
- Award ID(s):
- 1733794
- PAR ID:
- 10108003
- Date Published:
- Journal Name:
- Leibniz international proceedings in informatics
- Volume:
- 102
- ISSN:
- 1868-8969
- Page Range / eLocation ID:
- 8:1 - 8:37
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Chakrabarti, Amit; Swamy, Chaitanya (Ed.)A Boolean maximum constraint satisfaction problem, Max-CSP(f), is specified by a predicate f:{-1,1}^k → {0,1}. An n-variable instance of Max-CSP(f) consists of a list of constraints, each of which applies f to k distinct literals drawn from the n variables. For k = 2, Chou, Golovnev, and Velusamy [Chou et al., 2020] obtained explicit ratios characterizing the √ n-space streaming approximability of every predicate. For k ≥ 3, Chou, Golovnev, Sudan, and Velusamy [Chou et al., 2022] proved a general dichotomy theorem for √ n-space sketching algorithms: For every f, there exists α(f) ∈ (0,1] such that for every ε > 0, Max-CSP(f) is (α(f)-ε)-approximable by an O(log n)-space linear sketching algorithm, but (α(f)+ε)-approximation sketching algorithms require Ω(√n) space. In this work, we give closed-form expressions for the sketching approximation ratios of multiple families of symmetric Boolean functions. Letting α'_k = 2^{-(k-1)} (1-k^{-2})^{(k-1)/2}, we show that for odd k ≥ 3, α(kAND) = α'_k, and for even k ≥ 2, α(kAND) = 2α'_{k+1}. Thus, for every k, kAND can be (2-o(1))2^{-k}-approximated by O(log n)-space sketching algorithms; we contrast this with a lower bound of Chou, Golovnev, Sudan, Velingker, and Velusamy [Chou et al., 2022] implying that streaming (2+ε)2^{-k}-approximations require Ω(n) space! We also resolve the ratio for the "at-least-(k-1)-1’s" function for all even k; the "exactly-(k+1)/2-1’s" function for odd k ∈ {3,…,51}; and fifteen other functions. We stress here that for general f, the dichotomy theorem in [Chou et al., 2022] only implies that α(f) can be computed to arbitrary precision in PSPACE, and thus closed-form expressions need not have existed a priori. Our analyses involve identifying and exploiting structural "saddle-point" properties of this dichotomy. Separately, for all threshold functions, we give optimal "bias-based" approximation algorithms generalizing [Chou et al., 2020] while simplifying [Chou et al., 2022]. Finally, we investigate the √ n-space streaming lower bounds in [Chou et al., 2022], and show that they are incomplete for 3AND, i.e., they fail to rule out (α(3AND})-ε)-approximations in o(√ n) space.more » « less
-
Let f: {0, 1}n → {0, 1} be a boolean function, and let f∧(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bit-wise AND. We study the deterministic communication complexity of f∧ and show that, up to a logn factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f∧. This comes within a logn factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F2-degree. Our techniques can also be used to prove (within a logn factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f∧ is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}n → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials then the set system of the monomials has a small hitting set, of size poly-logarithmic in its sparsity. We also establish extensions of this result to multi-linear polynomials f: {0, 1}n → with a larger range.more » « less
-
This work studies the problem of finding a large affine subspace over the field F_2 such that a bounded function's nontrivial Fourier coefficients become small. We show that for any function f from F_2^n to [-1,1] with Fourier degree d, there exists an affine subspace of dimension at least roughly n^{1/d!}k^{-2}), wherein all of f's nontrivial Fourier coefficients become smaller than 2^{-k}. To complement this result, we show the existence of degree d functions with coefficients larger than 2^{-d log n} on any subspace of dimension larger than Omega(dn^{1/(d-1)}). In addition, we give explicit examples of functions with analogous but weaker properties. Along the way, we provide multiple characterizations of the Fourier coefficients of functions restricted to subspaces of $$\F_2^n$$ that may be useful in other contexts. Finally, we highlight applications and connections of our results to parity kill number and affine dispersers/extractors.more » « less
-
Let L be a language that can be decided in linear space and let ϵ>0 be any constant. Let A be the exponential hardness assumption that for every n, membership in L for inputs of length n cannot be decided by circuits of size smaller than 2ϵn. We prove that for every function f:{0,1}∗→{0,1}, computable by a randomized logspace algorithm R, there exists a deterministic logspace algorithm D (attempting to compute f), such that on every input x of length n, the algorithm D outputs one of the following:1)The correct value f(x).2)The string: “I am unable to compute f(x) because the hardness assumption A is false”, followed by a (provenly correct) circuit of size smaller than 2ϵn′ for membership in L for inputs of length n′, for some n′=Θ(logn); that is, a circuit that refutes A. Moreover, D is explicitly constructed, given R.We note that previous works on the hardness-versus-randomness paradigm give derandomized algorithms that rely blindly on the hardness assumption. If the hardness assumption is false, the algorithms may output incorrect values, and thus a user cannot trust that an output given by the algorithm is correct. Instead, our algorithm D verifies the computation so that it never outputs an incorrect value. Thus, if D outputs a value for f(x), that value is certified to be correct. Moreover, if D does not output a value for f(x), it alerts that the hardness assumption was found to be false, and refutes the assumption.Our next result is a universal derandomizer for BPL (the class of problems solvable by bounded-error randomized logspace algorithms) 1 : We give a deterministic algorithm U that takes as an input a randomized logspace algorithm R and an input x and simulates the computation of R on x, deteriministically. Under the widely believed assumption BPL=L, the space ...more » « less