Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

Free, publiclyaccessible full text available July 21, 2025

Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the problem of computing the TV distance of two product distributions over the domain {0,1}^n. In particular, we establish the following results.1. The problem of exactly computing the TV distance of two product distributions is #Pcomplete. This is in stark contrast with other distance measures such as KL, Chisquare, and Hellinger which tensorize over the marginals leading to efficient algorithms.2. There is a fully polynomialtime deterministic approximation scheme (FPTAS) for computing the TV distance of two product distributions P and Q where Q is the uniform distribution. This result is extended to the case where Q has a constant number of distinct marginals. In contrast, we show that when P and Q are Bayes net distributions the relative approximation of their TV distance is NPhard.

null (Ed.)We study the problems of identity and closeness testing of ndimensional product distributions. Prior works of Canonne et al. (2017) and Daskalakis and Pan (2017) have established tight sample complexity bounds for nontolerant testing over a binary alphabet: given two product distributions P and Q over a binary alphabet, distinguish between the cases P = Q and dTV(P;Q) > epsilon . We build on this prior work to give a more comprehensive map of the complexity of testing of product distributions by investigating tolerant testing with respect to several natural distance measures and over an arbitrary alphabet. Our study gives a finegrained understanding of how the sample complexity of tolerant testing varies with the distance measures for product distributions. In addition, we also extend one of our upper bounds on product distributions to boundeddegree Bayes nets.more » « less

Larochelle, Hugo ; Ranzato, Marc'Aurelio ; Hadsell, Raia ; Balcan, MariaFlorina ; Lin, HsuanTien (Ed.)

Pass, Rafael Pass ; Pietrzak, Krzysztof Pietrzak (Ed.)We investigate the complexity of problems that admit perfect zeroknowledge interactive protocols and establish new unconditional upper bounds and oracle separation results. We establish our results by investigating certain {\em distribution testing problems}: computational problems over highdimensional distributions represented by succinct Boolean circuits. A relatively lessinvestigated complexity class $\SBP$ emerged as significant in this study. The main results we establish are: 1. A unconditional inclusion that NIPZK is a subset of CoSBP. 2. Construction of a relativized world in which there is a distribution testing problem that lies in NIPZK but not in SBP, thus giving a relativized separation of NIPZK (and hence PZK) from SBP. 3. Construction of a relativized world in which there is a distribution testing problem that lies in PZK but not in CoSBP, thus giving a relativized separation of PZK from CoSBP.. Results (1) and (3) imply an oracle separating PZK from NIPZK. Our results refine the landscape of perfect zeroknowledge classes in relation to traditional complexity classes.more » « less

We study the problem of efficiently estimating the effect of an intervention on a single variable using observational samples. Our goal is to give algorithms with polynomial time and sample complexity in a nonparametric setting. Tian and Pearl (AAAI ’02) have exactly characterized the class of causal graphs for which causal effects of atomic interventions can be identified from observational data. We make their result quantitative. Suppose 𝒫 is a causal model on a set V of n observable variables with respect to a given causal graph G, and let do(x) be an identifiable intervention on a variable X. We show that assuming that G has bounded indegree and bounded ccomponents (k) and that the observational distribution satisfies a strong positivity condition: (i) [Evaluation] There is an algorithm that outputs with probability 2/3 an evaluator for a distribution P^ that satisfies TV(P(V  do(x)), P^(V)) < eps using m=O (n/eps^2) samples from P and O(mn) time. The evaluator can return in O(n) time the probability P^(v) for any assignment v to V. (ii) [Sampling] There is an algorithm that outputs with probability 2/3 a sampler for a distribution P^ that satisfies TV(P(V  do(x)), P^(V)) < eps using m=O (n/eps^2) samples from P and O(mn) time. The sampler returns an iid sample from P^ with probability 1 in O(n) time. We extend our techniques to estimate P(Y  do(x)) for a subset Y of variables of interest. We also show lower bounds for the sample complexity, demonstrating that our sample complexity has optimal dependence on the parameters n and eps, as well as if k=1 on the strong positivity parameter.more » « less