skip to main content

Title: Approximating the Distance to Monotonicity of Boolean Functions
We design a nonadaptive algorithm that, given a Boolean function f: {0, 1}^n → {0, 1} which is α-far from monotone, makes poly(n, 1/α) queries and returns an estimate that, with high probability, is an O-tilde(\sqrt{n})-approximation to the distance of f to monotonicity. Furthermore, we show that for any constant k > 0, approximating the distance to monotonicity up to n^(1/2−k)-factor requires 2^{n^k} nonadaptive queries, thereby ruling out a poly(n, 1/α)-query nonadaptive algorithm for such approximations. This answers a question of Seshadhri (Property Testing Review, 2014) for the case of nonadaptive algorithms. Approximating the distance to a property is closely related to tolerantly testing that property. Our lower bound stands in contrast to standard (non-tolerant) testing of monotonicity that can be done nonadaptively with O-tilde(n/ε^2) queries. We obtain our lower bound by proving an analogous bound for erasure-resilient testers. An α-erasure-resilient tester for a desired property gets oracle access to a function that has at most an α fraction of values erased. The tester has to accept (with probability at least 2/3) if the erasures can be filled in to ensure that the resulting function has the property and to reject (with probability at least 2/3) if every completion of erasures more » results in a function that is ε-far from having the property. Our method yields the same lower bounds for unateness and being a k-junta. These lower bounds improve exponentially on the existing lower bounds for these properties. « less
Authors:
; ;
Award ID(s):
1909612
Publication Date:
NSF-PAR ID:
10185525
Journal Name:
Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020
Page Range or eLocation-ID:
1995--2009
Sponsoring Org:
National Science Foundation
More Like this
  1. The approximate degree of a Boolean function f is the least degree of a real polynomial that approximates f pointwise to error at most 1/3. The approximate degree of f is known to be a lower bound on the quantum query complexity of f (Beals et al., FOCS 1998 and J. ACM 2001). We find tight or nearly tight bounds on the approximate degree and quantum query complexities of several basic functions. Specifically, we show the following. k-Distinctness: For any constant k, the approximate degree and quantum query complexity of the k-distinctness function is Ω(n3/4−1/(2k)). This is nearly tight for large k, as Belovs (FOCS 2012) has shown that for any constant k, the approximate degree and quantum query complexity of k-distinctness is O(n3/4−1/(2k+2−4)). Image size testing: The approximate degree and quantum query complexity of testing the size of the image of a function [n]→[n] is Ω~(n1/2). This proves a conjecture of Ambainis et al. (SODA 2016), and it implies tight lower bounds on the approximate degree and quantum query complexity of the following natural problems. k-Junta testing: A tight Ω~(k1/2) lower bound for k-junta testing, answering the main open question of Ambainis et al. (SODA 2016). Statistical distance frommore »uniform: A tight Ω~(n1/2) lower bound for approximating the statistical distance of a distribution from uniform, answering the main question left open by Bravyi et al. (STACS 2010 and IEEE Trans. Inf. Theory 2011). Shannon entropy: A tight Ω~(n1/2) lower bound for approximating Shannon entropy up to a certain additive constant, answering a question of Li and Wu (2017). Surjectivity: The approximate degree of the surjectivity function is Ω~(n3/4). The best prior lower bound was Ω(n2/3). Our result matches an upper bound of O~(n3/4) due to Sherstov (STOC 2018), which we reprove using different techniques. The quantum query complexity of this function is known to be Θ(n) (Beame and Machmouchi, Quantum Inf. Comput. 2012 and Sherstov, FOCS 2015). Our upper bound for surjectivity introduces new techniques for approximating Boolean functions by low-degree polynomials. Our lower bounds are proved by significantly refining techniques recently introduced by Bun and Thaler (FOCS 2017).« less
  2. The noise sensitivity of a Boolean function f: {0,1}^n - > {0,1} is one of its fundamental properties. For noise parameter delta, the noise sensitivity is denoted as NS_{delta}[f]. This quantity is defined as follows: First, pick x = (x_1,...,x_n) uniformly at random from {0,1}^n, then pick z by flipping each x_i independently with probability delta. NS_{delta}[f] is defined to equal Pr [f(x) != f(z)]. Much of the existing literature on noise sensitivity explores the following two directions: (1) Showing that functions with low noise-sensitivity are structured in certain ways. (2) Mathematically showing that certain classes of functions have low noise sensitivity. Combined, these two research directions show that certain classes of functions have low noise sensitivity and therefore have useful structure. The fundamental importance of noise sensitivity, together with this wealth of structural results, motivates the algorithmic question of approximating NS_{delta}[f] given an oracle access to the function f. We show that the standard sampling approach is essentially optimal for general Boolean functions. Therefore, we focus on estimating the noise sensitivity of monotone functions, which form an important subclass of Boolean functions, since many functions of interest are either monotone or can be simply transformed into a monotone functionmore »(for example the class of unate functions consists of all the functions that can be made monotone by reorienting some of their coordinates [O'Donnell, 2014]). Specifically, we study the algorithmic problem of approximating NS_{delta}[f] for monotone f, given the promise that NS_{delta}[f] >= 1/n^{C} for constant C, and for delta in the range 1/n <= delta <= 1/2. For such f and delta, we give a randomized algorithm performing O((min(1,sqrt{n} delta log^{1.5} n))/(NS_{delta}[f]) poly (1/epsilon)) queries and approximating NS_{delta}[f] to within a multiplicative factor of (1 +/- epsilon). Given the same constraints on f and delta, we also prove a lower bound of Omega((min(1,sqrt{n} delta))/(NS_{delta}[f] * n^{xi})) on the query complexity of any algorithm that approximates NS_{delta}[f] to within any constant factor, where xi can be any positive constant. Thus, our algorithm's query complexity is close to optimal in terms of its dependence on n. We introduce a novel descending-ascending view of noise sensitivity, and use it as a central tool for the analysis of our algorithm. To prove lower bounds on query complexity, we develop a technique that reduces computational questions about query complexity to combinatorial questions about the existence of "thin" functions with certain properties. The existence of such "thin" functions is proved using the probabilistic method. These techniques also yield new lower bounds on the query complexity of approximating other fundamental properties of Boolean functions: the total influence and the bias.« less
  3. We study the problem of testing identity against a given distribution with a focus on the high confidence regime. More precisely, given samples from an unknown distribution p over n elements, an explicitly given distribution q, and parameters 0< epsilon, delta < 1, we wish to distinguish, with probability at least 1-delta, whether the distributions are identical versus epsilon-far in total variation distance. Most prior work focused on the case that delta = Omega(1), for which the sample complexity of identity testing is known to be Theta(sqrt{n}/epsilon^2). Given such an algorithm, one can achieve arbitrarily small values of delta via black-box amplification, which multiplies the required number of samples by Theta(log(1/delta)). We show that black-box amplification is suboptimal for any delta = o(1), and give a new identity tester that achieves the optimal sample complexity. Our new upper and lower bounds show that the optimal sample complexity of identity testing is Theta((1/epsilon^2) (sqrt{n log(1/delta)} + log(1/delta))) for any n, epsilon, and delta. For the special case of uniformity testing, where the given distribution is the uniform distribution U_n over the domain, our new tester is surprisingly simple: to test whether p = U_n versus d_{TV} (p, U_n) >= epsilon, wemore »simply threshold d_{TV}({p^}, U_n), where {p^} is the empirical probability distribution. The fact that this simple "plug-in" estimator is sample-optimal is surprising, even in the constant delta case. Indeed, it was believed that such a tester would not attain sublinear sample complexity even for constant values of epsilon and delta. An important contribution of this work lies in the analysis techniques that we introduce in this context. First, we exploit an underlying strong convexity property to bound from below the expectation gap in the completeness and soundness cases. Second, we give a new, fast method for obtaining provably correct empirical estimates of the true worst-case failure probability for a broad class of uniformity testing statistics over all possible input distributions - including all previously studied statistics for this problem. We believe that our novel analysis techniques will be useful for other distribution testing problems as well.« less
  4. We study the classic set cover problem from the perspective of sub-linear algorithms. Given access to a collection of m sets over n elements in the query model, we show that sub-linear algorithms derived from existing techniques have almost tight query complexities. On one hand, first we show an adaptation of the streaming algorithm presented in [17] to the sub-linear query model, that returns an α-approximate cover using Õ(m(n/k)^1/(α–1) + nk) queries to the input, where k denotes the value of a minimum set cover. We then complement this upper bound by proving that for lower values of k, the required number of queries is , even for estimating the optimal cover size. Moreover, we prove that even checking whether a given collection of sets covers all the elements would require Ω(nk) queries. These two lower bounds provide strong evidence that the upper bound is almost tight for certain values of the parameter k. On the other hand, we show that this bound is not optimal for larger values of the parameter k, as there exists a (1 + ε)-approximation algorithm with Õ(mn/kε^2) queries. We show that this bound is essentially tight for sufficiently small constant ε, by establishing amore »lower bound of query complexity. Our lower-bound results follow by carefully designing two distributions of instances that are hard to distinguish. In particular, our first lower bound involves a probabilistic construction of a certain set system with a minimum set cover of size αk, with the key property that a small number of “almost uniformly distributed” modifications can reduce the minimum set cover size down to k. Thus, these modifications are not detectable unless a large number of queries are asked. We believe that our probabilistic construction technique might find applications to lower bounds for other combinatorial optimization problems.« less
  5. A Boolean {\em $k$-monotone} function defined over a finite poset domain ${\cal D}$ alternates between the values $0$ and $1$ at most $k$ times on any ascending chain in ${\cal D}$. Therefore, $k$-monotone functions are natural generalizations of the classical {\em monotone} functions, which are the {\em $1$-monotone} functions. Motivated by the recent interest in $k$-monotone functions in the context of circuit complexity and learning theory, and by the central role that monotonicity testing plays in the context of property testing, we initiate a systematic study of $k$-monotone functions, in the property testing model. In this model, the goal is to distinguish functions that are $k$-monotone (or are close to being $k$-monotone) from functions that are far from being $k$-monotone. Our results include the following: \begin{enumerate} \item We demonstrate a separation between testing $k$-monotonicity and testing monotonicity, on the hypercube domain $\{0,1\}^d$, for $k\geq 3$; \item We demonstrate a separation between testing and learning on $\{0,1\}^d$, for $k=\omega(\log d)$: testing $k$-monotonicity can be performed with $2^{O(\sqrt d \cdot \log d\cdot \log{1/\eps})}$ queries, while learning $k$-monotone functions requires $2^{\Omega(k\cdot \sqrt d\cdot{1/\eps})}$ queries (Blais et al. (RANDOM 2015)). \item We present a tolerant test for functions $f\colon[n]^d\to \{0,1\}$ with complexity independent ofmore »$n$, which makes progress on a problem left open by Berman et al. (STOC 2014). \end{enumerate} Our techniques exploit the testing-by-learning paradigm, use novel applications of Fourier analysis on the grid $[n]^d$, and draw connections to distribution testing techniques.« less