Title: Identification of mixtures of discrete product distributions in near-optimal sample and time complexity
We consider the problem of identifying, from statistics, a distribution of discrete random variables $$X_1, \ldots, X_n$$ that is a mixture of $$k$$ product distributions. The best previous sample complexity for $$n \in O(k)$$ was $$(1/\zeta)^{O(k^2 \log k)}$$ (under a mild separation assumption parameterized by $$\zeta$$). The best known lower bound was $$\exp(\Omega(k))$$. It is known that $$n\geq 2k-1$$ is necessary and sufficient for identification. We show, for any $$n\geq 2k-1$$, how to achieve sample complexity and run-time complexity $$(1/\zeta)^{O(k)}$$. We also extend the known lower bound of $$e^{\Omega(k)}$$ to match our upper bound across a broad range of $$\zeta$$. Our results are obtained by combining (a) a classic method for robust tensor decomposition and (b) a novel way of bounding the condition number of key matrices called Hadamard extensions, by studying their action only on flattened rank-1 tensors.
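To make the model in the abstract concrete, here is a minimal sketch of a mixture of $$k$$ product distributions, restricted to binary variables for simplicity (the abstract covers general discrete variables). The function names and the parameter values are illustrative assumptions, not taken from the paper. The sketch shows the kind of statistic available for identification: for binary variables, the joint moment of a subset $$S$$ of the variables equals $$\sum_j w_j \prod_{i \in S} p_{j,i}$$, where $$w_j$$ is the weight of component $$j$$ and $$p_{j,i} = \Pr[X_i = 1 \mid \text{component } j]$$.

```python
import random

def joint_pmf(weights, probs, x):
    """P(X = x) for a mixture of product distributions on {0,1}^n.

    weights[j] is the mixing weight of component j;
    probs[j][i] is P(X_i = 1 | component j).
    """
    total = 0.0
    for w, p in zip(weights, probs):
        term = w
        for xi, pi in zip(x, p):
            term *= pi if xi == 1 else 1.0 - pi
        total += term
    return total

def subset_moment(weights, probs, subset):
    """E[prod_{i in subset} X_i] = sum_j w_j * prod_{i in subset} p_{j,i}."""
    total = 0.0
    for w, p in zip(weights, probs):
        term = w
        for i in subset:
            term *= p[i]
        total += term
    return total

def sample(weights, probs, rng):
    """Draw one sample: pick a component, then draw each X_i independently."""
    j = rng.choices(range(len(weights)), weights=weights)[0]
    return tuple(1 if rng.random() < pi else 0 for pi in probs[j])

# Hypothetical parameters: k = 2 components and n = 3 = 2k - 1 variables.
WEIGHTS = [0.4, 0.6]
PROBS = [[0.1, 0.8, 0.5], [0.7, 0.2, 0.9]]
```

An identification algorithm only sees draws like `sample(WEIGHTS, PROBS, random.Random(0))` and must recover the weights and the per-component probabilities from empirical estimates of such moments; the sketch does not attempt that recovery step.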
Award ID(s):
2321079
PAR ID:
10608820
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Agrawal, Shipra; Roth, Aaron
Publisher / Repository:
Proceedings of Machine Learning Research
Date Published:
Volume:
247
ISSN:
2640-3498
Page Range / eLocation ID:
2071-2091
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We prove two new results about the inability of low-degree polynomials to uniformly approximate constant-depth circuits, even to slightly-better-than-trivial error. First, we prove a tight Omega~(n^{1/2}) lower bound on the threshold degree of the SURJECTIVITY function on n variables. This matches the best known threshold degree bound for any AC^0 function, previously exhibited by a much more complicated circuit of larger depth (Sherstov, FOCS 2015). Our result also extends to a 2^{Omega~(n^{1/2})} lower bound on the sign-rank of an AC^0 function, improving on the previous best bound of 2^{Omega(n^{2/5})} (Bun and Thaler, ICALP 2016). Second, for any delta>0, we exhibit a function f : {-1,1}^n -> {-1,1} that is computed by a circuit of depth O(1/delta) and is hard to approximate by polynomials in the following sense: f cannot be uniformly approximated to error epsilon=1-2^{-Omega(n^{1-delta})}, even by polynomials of degree n^{1-delta}. Our recent prior work (Bun and Thaler, FOCS 2017) proved a similar lower bound, but which held only for error epsilon=1/3. Our result implies 2^{Omega(n^{1-delta})} lower bounds on the complexity of AC^0 under a variety of basic measures such as discrepancy, margin complexity, and threshold weight. This nearly matches the trivial upper bound of 2^{O(n)} that holds for every function. The previous best lower bound on AC^0 for these measures was 2^{Omega(n^{1/2})} (Sherstov, FOCS 2015). Additional applications in learning theory, communication complexity, and cryptography are described. 
  2. We prove an Omega(n^{1−1/k} log k /2^k) lower bound on the k-party number-in-hand communication complexity of collision-finding. This implies a 2^{n^{1−o(1)}} lower bound on the size of tree-like cutting-planes proofs of the bit pigeonhole principle, a compact and natural propositional encoding of the pigeonhole principle, improving on the best previous lower bound of 2^{Omega(sqrt{n})}. 
  3. The approximate degree of a Boolean function f is the least degree of a real polynomial that approximates f pointwise to error at most 1/3. The approximate degree of f is known to be a lower bound on the quantum query complexity of f (Beals et al., FOCS 1998 and J. ACM 2001). We find tight or nearly tight bounds on the approximate degree and quantum query complexities of several basic functions. Specifically, we show the following. k-Distinctness: For any constant k, the approximate degree and quantum query complexity of the k-distinctness function is Ω(n^{3/4−1/(2k)}). This is nearly tight for large k, as Belovs (FOCS 2012) has shown that for any constant k, the approximate degree and quantum query complexity of k-distinctness is O(n^{3/4−1/(2^{k+2}−4)}). Image size testing: The approximate degree and quantum query complexity of testing the size of the image of a function [n]→[n] is Ω~(n^{1/2}). This proves a conjecture of Ambainis et al. (SODA 2016), and it implies tight lower bounds on the approximate degree and quantum query complexity of the following natural problems. k-Junta testing: A tight Ω~(k^{1/2}) lower bound for k-junta testing, answering the main open question of Ambainis et al. (SODA 2016). Statistical distance from uniform: A tight Ω~(n^{1/2}) lower bound for approximating the statistical distance of a distribution from uniform, answering the main question left open by Bravyi et al. (STACS 2010 and IEEE Trans. Inf. Theory 2011). Shannon entropy: A tight Ω~(n^{1/2}) lower bound for approximating Shannon entropy up to a certain additive constant, answering a question of Li and Wu (2017). Surjectivity: The approximate degree of the surjectivity function is Ω~(n^{3/4}). The best prior lower bound was Ω(n^{2/3}). Our result matches an upper bound of O~(n^{3/4}) due to Sherstov (STOC 2018), which we reprove using different techniques. The quantum query complexity of this function is known to be Θ(n) (Beame and Machmouchi, Quantum Inf. Comput. 2012 and Sherstov, FOCS 2015).
Our upper bound for surjectivity introduces new techniques for approximating Boolean functions by low-degree polynomials. Our lower bounds are proved by significantly refining techniques recently introduced by Bun and Thaler (FOCS 2017). 
  4. Dirac proved that each $$n$$-vertex $$2$$-connected graph with minimum degree $$k$$ contains a cycle of length at least $$\min\{2k, n\}$$. We obtain analogous results for Berge cycles in hypergraphs. Recently, the authors proved an exact lower bound on the minimum degree ensuring a Berge cycle of length at least $$\min\{2k, n\}$$ in $$n$$-vertex $$r$$-uniform $$2$$-connected hypergraphs when $$k \geq r+2$$. In this paper we address the case $$k \leq r+1$$, in which the bounds behave differently. We prove that each $$n$$-vertex $$r$$-uniform $$2$$-connected hypergraph $$H$$ with minimum degree $$k$$ contains a Berge cycle of length at least $$\min\{2k,n,|E(H)|\}$$. If $$|E(H)|\geq n$$, this bound coincides with the bound in Dirac's Theorem for $$2$$-connected graphs.
  5. Gowers, Tim (Ed.)
    We prove that for $$d\geq 0$$ and $$k\geq 2$$, for any subset $$A$$ of a discrete cube $$\{0,1\}^d$$, the $$k$$-higher energy of $$A$$ (i.e., the number of $$2k$$-tuples $$(a_1,a_2,\dots,a_{2k})$$ in $$A^{2k}$$ with $$a_1-a_2=a_3-a_4=\dots=a_{2k-1}-a_{2k}$$) is at most $$|A|^{\log_{2}(2^k+2)}$$, and $$\log_{2}(2^k+2)$$ is the best possible exponent. We also show that if $$d\geq 0$$ and $$2\leq k\leq 10$$, for any subset $$A$$ of a discrete cube $$\{0,1\}^d$$, the $$k$$-additive energy of $$A$$ (i.e., the number of $$2k$$-tuples $$(a_1,a_2,\dots,a_{2k})$$ in $$A^{2k}$$ with $$a_1+a_2+\dots+a_k=a_{k+1}+a_{k+2}+\dots+a_{2k}$$) is at most $$|A|^{\log_2{ \binom{2k}{k}}}$$, and $$\log_2{ \binom{2k}{k}}$$ is the best possible exponent. We discuss the analogous problems for the sets $$\{0,1,\dots,n\}^d$$ for $$n\geq2$$.
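The $$k$$-additive energy defined in item 5 can be checked by brute force on small examples. The sketch below is only an illustration of the definition, with hypothetical function names, and is not the authors' proof technique. It counts $$2k$$-tuples by grouping the $$k$$-fold sums (taken coordinate-wise over the integers) and summing the squared multiplicities. On the full cube $$A=\{0,1\}^d$$ each coordinate contributes a factor $$\binom{2k}{k}$$, so the energy equals $$\binom{2k}{k}^d = |A|^{\log_2 \binom{2k}{k}}$$ and the stated bound is attained.

```python
from itertools import product
from math import comb, log2

def additive_energy(points, k):
    """Count 2k-tuples in points^(2k) with a_1+...+a_k = a_{k+1}+...+a_{2k}.

    Sums are coordinate-wise over the integers. Grouping the k-fold sums and
    summing squared multiplicities gives the same count as enumerating all
    2k-tuples directly.
    """
    counts = {}
    for tup in product(points, repeat=k):
        s = tuple(map(sum, zip(*tup)))
        counts[s] = counts.get(s, 0) + 1
    return sum(c * c for c in counts.values())

def cube(d):
    """All points of the discrete cube {0,1}^d as tuples."""
    return list(product((0, 1), repeat=d))

def energy_bound(size, k):
    """The upper bound |A|^{log2 C(2k, k)} quoted in the abstract."""
    return size ** log2(comb(2 * k, k))
```

For instance, `additive_energy(cube(3), 2)` can be compared against `energy_bound(8, 2)`; a proper subset of the cube should give a value no larger than the bound for its own size.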