We study the problem of testing identity against a given distribution with a focus on the high confidence regime. More precisely, given samples from an unknown distribution p over n elements, an explicitly given distribution q, and parameters 0< epsilon, delta < 1, we wish to distinguish, with probability at least 1-delta, whether the distributions are identical versus epsilon-far in total variation distance. Most prior work focused on the case that delta = Omega(1), for which the sample complexity of identity testing is known to be Theta(sqrt{n}/epsilon^2). Given such an algorithm, one can achieve arbitrarily small values of delta viamore »
Optimal Stopping Rules for Sequential Hypothesis Testing
Suppose that we are given sample access to an unknown distribution p over n elements and an explicit distribution q over the same n elements. We would like to reject the null hypothesis“p=q” after seeing as few samples as possible, whenp6=q, while we never want to reject the null, when p=q. Well-known results show thatΘ(√n/2)samples are necessary and sufficient for distinguishing whether p equals q versus p is-far from q in total variation distance. However,this requires the distinguishing radiusto be fixed prior to deciding how many samples to request.Our goal is instead to design sequential hypothesis testers, i.e. online algorithms that request i.i.d.samples from p and stop as soon as they can confidently reject the hypothesis p=q, without being given a lower bound on the distance between p and q, whenp6=q. In particular, we want to minimize the number of samples requested by our tests as a function of the distance between p and q, and if p=q we want the algorithm, with high probability, to never reject the null. Our work is motivated by and addresses the practical challenge of sequential A/B testing in Statistics.We show that, when n= 2, any sequential hypothesis test must seeΩ(1dtv(p,q)2log log1dtv(p,q))samples, with high more »
- Award ID(s):
- 1650733
- Publication Date:
- NSF-PAR ID:
- 10086307
- Journal Name:
- 25th Annual European Symposium on Algorithms (ESA)
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We show that the square Hellinger distance between two Bayesian networks on the same directed graph, G, is subadditive with respect to the neighborhoods of G. Namely, if P and Q are the probability distributions defined by two Bayesian networks on the same DAG, our inequality states that the square Hellinger distance, H2(P,Q), between P and Q is upper bounded by the sum, ∑vH2(P{v}∪Πv,Q{v}∪Πv), of the square Hellinger distances between the marginals of P and Q on every node v and its parents Πv in the DAG. Importantly, our bound does not involve the conditionals but the marginals of Pmore »
-
We design a nonadaptive algorithm that, given a Boolean function f: {0, 1}^n → {0, 1} which is α-far from monotone, makes poly(n, 1/α) queries and returns an estimate that, with high probability, is an O-tilde(\sqrt{n})-approximation to the distance of f to monotonicity. Furthermore, we show that for any constant k > 0, approximating the distance to monotonicity up to n^(1/2−k)-factor requires 2^{n^k} nonadaptive queries, thereby ruling out a poly(n, 1/α)-query nonadaptive algorithm for such approximations. This answers a question of Seshadhri (Property Testing Review, 2014) for the case of nonadaptive algorithms. Approximating the distance to a property is closelymore »
-
There has been significant study on the sample complexity of testing properties of distributions over large domains. For many properties, it is known that the sample complexity can be substantially smaller than the domain size. For example, over a domain of size n, distinguishing the uniform distribution from distributions that are far from uniform in ℓ1-distance uses only O(n−−√) samples. However, the picture is very different in the presence of arbitrary noise, even when the amount of noise is quite small. In this case, one must distinguish if samples are coming from a distribution that is ϵ-close to uniform frommore »
-
There has been significant study on the sample complexity of testing properties of distributions over large domains. For many properties, it is known that the sample complexity can be substantially smaller than the domain size. For example, over a domain of size n, distinguishing the uniform distribution from distributions that are far from uniform in ℓ1-distance uses only O(n−−√) samples. However, the picture is very different in the presence of arbitrary noise, even when the amount of noise is quite small. In this case, one must distinguish if samples are coming from a distribution that is ϵ-close to uniform frommore »