We study the properties of output distributions of noisy random circuits. We obtain upper and lower bounds on the expected distance of the output distribution from the “useless” uniform distribution. These bounds are tight with respect to the dependence on circuit depth. Our proof techniques also allow us to make statements about the presence or absence of anticoncentration for both noisy and noiseless circuits. We uncover a number of interesting consequences for hardness proofs of sampling schemes that aim to show a quantum computational advantage over classical computation. Specifically, we discuss recent barrier results for depth-agnostic and/or noise-agnostic proof techniques. We show that in certain depth regimes, noise-agnostic proof techniques might still work in order to prove an often-conjectured claim in the literature on quantum computational advantage, contrary to what has been thought prior to this work.
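To make the headline quantity concrete, here is a minimal numpy sketch (our illustration, not the paper's construction) that Monte-Carlo-simulates a small brickwork random circuit with single-qubit depolarizing-style noise and estimates the total variation distance of its output distribution from uniform. The qubit count, noise rate, gate set, and all function names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),      # X
          np.array([[0, -1j], [1j, 0]], dtype=complex),   # Y
          np.array([[1, 0], [0, -1]], dtype=complex)]     # Z

def haar_1q():
    # Haar-random single-qubit unitary via QR of a complex Gaussian matrix.
    z = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))

def apply_1q(psi, u, q, n):
    # Contract a 2x2 gate into qubit q of an n-qubit statevector.
    psi = np.moveaxis(psi.reshape([2] * n), q, 0)
    psi = np.tensordot(u, psi, axes=(1, 0))
    return np.moveaxis(psi, 0, q).reshape(-1)

def apply_cz(psi, a, b, n):
    # Flip the sign of amplitudes where both qubits a and b are 1.
    idx = np.arange(psi.size)
    both = (((idx >> (n - 1 - a)) & (idx >> (n - 1 - b))) & 1).astype(bool)
    out = psi.copy()
    out[both] *= -1
    return out

def noisy_dist(gates, n, p, trials):
    # Average |amplitude|^2 over Pauli-noise trajectories for ONE fixed circuit.
    dist = np.zeros(2 ** n)
    for _ in range(trials):
        psi = np.zeros(2 ** n, dtype=complex)
        psi[0] = 1.0
        for layer, row in enumerate(gates):
            for q in range(n):
                psi = apply_1q(psi, row[q], q, n)
            for a in range(layer % 2, n - 1, 2):   # brickwork CZ layer
                psi = apply_cz(psi, a, a + 1, n)
            for q in range(n):                     # depolarizing-style noise
                if rng.random() < p:
                    psi = apply_1q(psi, PAULIS[rng.integers(3)], q, n)
        dist += np.abs(psi) ** 2
    return dist / trials

n, p, trials = 4, 0.05, 300
for depth in (2, 8, 32):
    gates = [[haar_1q() for _ in range(n)] for _ in range(depth)]
    dist = noisy_dist(gates, n, p, trials)
    tv = 0.5 * np.abs(dist - 1 / dist.size).sum()
    print(f"depth {depth:3d}: estimated TV distance from uniform = {tv:.3f}")
```

With these toy parameters the estimated distance shrinks as depth grows, which is the qualitative behavior the depth-dependent bounds quantify.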
New sampling lower bounds via the separator. In: Leibniz International Proceedings in Informatics (LIPIcs), 38th Computational Complexity Conference (CCC 2023).
Suppose that a target distribution can be approximately sampled by a low-depth decision tree, or more generally by an efficient cell-probe algorithm. It is shown to be possible to restrict the input to the sampler so that its output distribution is still not too far from the target distribution, and at the same time many output coordinates are almost pairwise independent.
This new tool is then used to obtain several new sampling lower bounds and separations, including a separation between AC0 and low-depth decision trees, and a hierarchy theorem for sampling. It is also used to obtain a new proof of the Patrascu-Viola data-structure lower bound for Rank, thereby unifying sampling and data-structure lower bounds.
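As an illustration of the "almost pairwise independent coordinates" property (a toy sketch under assumed parameters, not the paper's restriction argument), the following Python snippet builds a shallow decision-forest-style sampler over a shared random input and empirically measures how far its output coordinates are from pairwise independence:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

def pairwise_dependence(samples):
    """Max over coordinate pairs (i, j) of the statistical distance between
    the empirical joint on (i, j) and the product of the marginals;
    0 means exactly pairwise-independent output coordinates."""
    m, n = samples.shape
    worst = 0.0
    for i, j in itertools.combinations(range(n), 2):
        joint = np.bincount(2 * samples[:, i] + samples[:, j],
                            minlength=4).reshape(2, 2) / m
        prod = np.outer(joint.sum(axis=1), joint.sum(axis=0))
        worst = max(worst, 0.5 * np.abs(joint - prod).sum())
    return worst

# Toy sampler: each output bit is the parity of a few probed cells of a
# shared random input, mimicking a depth-limited decision forest / cell probe.
def forest_sampler(m, n_out, n_in, depth):
    x = rng.integers(0, 2, size=(m, n_in))
    probes = rng.integers(0, n_in, size=(n_out, depth))  # cells each bit reads
    return np.bitwise_xor.reduce(x[:, probes], axis=2)

samples = forest_sampler(m=20000, n_out=8, n_in=64, depth=3)
print(f"max pairwise dependence ~ {pairwise_dependence(samples):.3f}")
```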
- Award ID(s): 2114116
- PAR ID: 10504085
- Editor(s): Ta-Shma, Amnon
- Publisher / Repository: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
- Date Published:
- Subject(s) / Keyword(s): Sampling; data structures; lower bounds; cell probe; decision forest; AC0; rank; predecessor; Theory of computation → Circuit complexity
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
This paper proves the first super-logarithmic lower bounds on the cell-probe complexity of dynamic boolean (a.k.a. decision) data structure problems, a long-standing milestone in data structure lower bounds. We introduce a new approach and use it to prove an Ω(log^1.5 n) lower bound on the operational time of a wide range of boolean data structure problems, most notably on the query time of dynamic range counting over F_2. Proving an ω(lg n) lower bound for this problem was explicitly posed as one of five important open problems in the late Mihai Pǎtraşcu's obituary. This result also implies the first ω(lg n) lower bound for the classical 2D range counting problem, one of the most fundamental data structure problems in computational geometry and spatial databases. We derive similar lower bounds for boolean versions of dynamic polynomial evaluation and 2D rectangle stabbing, and for the (non-boolean) problems of range selection and range median. Our technical centerpiece is a new way of "weakly" simulating dynamic data structures using efficient one-way communication protocols with a small advantage over random guessing. This simulation involves a surprising excursion to low-degree (Chebyshev) polynomials, which may be of independent interest, and offers an entirely new algorithmic angle on the "cell sampling" method of Panigrahy et al.
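The Chebyshev ingredient can be previewed numerically. The sketch below illustrates the standard extremal-growth property (not the paper's simulation itself): the degree-k Chebyshev polynomial T_k stays bounded by 1 on [-1, 1] yet blows up just outside the interval, which is what makes low-degree polynomials useful for amplifying a small advantage over random guessing. The advantage scale delta is an assumed toy value.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# T_k is bounded by 1 on [-1, 1] yet grows roughly like exp(k * sqrt(2*delta))
# at 1 + delta -- the extremal growth behind advantage-amplification arguments.
delta = 0.01   # a "small advantage" over random guessing (assumed scale)
for k in (5, 10, 20, 40):
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0                                  # the k-th Chebyshev poly T_k
    inside = np.max(np.abs(C.chebval(np.linspace(-1, 1, 2001), coeffs)))
    outside = C.chebval(1 + delta, coeffs)
    print(f"k={k:3d}: max|T_k| on [-1,1] = {inside:.3f}, "
          f"T_k(1+delta) = {outside:9.1f}  (~exp({k * np.sqrt(2 * delta):.1f}))")
```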
-
We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (possibly with negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of k ReLUs minimizing the squared error (for k > 1) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We also obtain lower bounds on the running time in terms of the desired additive error ϵ. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis on the hardness of approximating the well-known Densest k-Subgraph problem in subexponential time (the two hypotheses are used separately to prove different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best-fitting ReLU must run in time exponential in (1/ϵ)^2. Together with previous work on improperly learning a ReLU (Goel et al., COLT'17), this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights, giving new (worst-case) upper bounds on the running time needed to learn such networks in both the realizable and agnostic settings. Our upper bounds on the running time essentially match our lower bounds in terms of the dependence on ϵ.
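For concreteness, here is a minimal numpy sketch of the training problem the hardness results address: fitting a weighted sum of k ReLUs to labeled data under the square loss. The gradient-descent heuristic, instance sizes, and all names are our illustrative assumptions; the point of the paper is precisely that no efficient algorithm solves this problem in general.

```python
import numpy as np

rng = np.random.default_rng(2)

def relu(z):
    return np.maximum(z, 0.0)

def net(X, W, a):
    # Depth-2 network: f(x) = sum_i a_i * ReLU(w_i . x)
    return relu(X @ W.T) @ a

def square_loss(X, y, W, a):
    return np.mean((net(X, W, a) - y) ** 2)

# Realizable toy instance: labels come from an unknown depth-2 ReLU network.
d, k, m = 5, 3, 2000
W_true = rng.standard_normal((k, d))
a_true = rng.standard_normal(k)
X = rng.standard_normal((m, d))
y = net(X, W_true, a_true)

# Plain gradient descent on the (nonconvex) square loss -- a heuristic with
# no general guarantee, per the hardness results above.
W = rng.standard_normal((k, d)) * 0.5
a = rng.standard_normal(k) * 0.5
lr = 0.05
for step in range(3000):
    pre = X @ W.T                       # m x k preactivations
    h = relu(pre)                       # hidden-layer activations
    r = (h @ a - y) / m                 # scaled residual
    grad_a = 2 * h.T @ r
    grad_W = 2 * ((r[:, None] * (pre > 0)) * a).T @ X
    a -= lr * grad_a
    W -= lr * grad_W
print(f"final square loss: {square_loss(X, y, W, a):.5f}")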
-
Given an algorithmic predictor that is accurate on some source population of strategic human decision subjects, will it remain accurate if the population responds to it? In our setting, an agent or user corresponds to a sample (X, Y) drawn from a distribution and faces a model h and its classification result h(X). Agents can modify X to adapt to h, which incurs a distribution shift on (X, Y). Our formulation is motivated by applications where deployed machine learning models are applied to human agents and thus ultimately face responsive, interactive data distributions. We formalize the transferability of a model by studying how the performance of a model trained on the available source distribution (data) translates to performance on the induced domain. We provide upper bounds on the performance gap due to the induced domain shift, as well as lower bounds on the trade-offs a classifier must suffer on either the source training distribution or the induced target distribution. We further provide instantiated analyses for two popular domain adaptation settings: covariate shift and target shift.
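A toy simulation of the induced domain shift (our sketch, with assumed thresholds and manipulation costs, not the paper's model in full generality): agents just below a deployed threshold classifier manipulate their feature at bounded cost, and the classifier's accuracy on the induced distribution drops relative to the source distribution.

```python
import numpy as np

rng = np.random.default_rng(3)

# Source population: one feature x, label y = 1[x + noise > 0].
m = 50000
x = rng.standard_normal(m)
y = (x + 0.3 * rng.standard_normal(m) > 0).astype(int)

theta = 0.0     # deployed threshold classifier h(x) = 1[x >= theta]
budget = 0.5    # maximum feature manipulation an agent can afford (assumed)

# Best response: agents within `budget` below the threshold move up to it;
# true labels stay fixed, so only the feature distribution shifts.
x_shift = np.where((x < theta) & (x >= theta - budget), theta, x)

acc_src = np.mean((x >= theta).astype(int) == y)
acc_ind = np.mean((x_shift >= theta).astype(int) == y)
print(f"accuracy on source domain:  {acc_src:.3f}")
print(f"accuracy on induced domain: {acc_ind:.3f}  (gap = {acc_src - acc_ind:.3f})")
```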
-
We study the problem of learning Censored Markov Random Fields (abbreviated CMRFs), which are Markov Random Fields where some of the nodes are censored (i.e., not observed). We assume the CMRF is high temperature but, crucially, make no assumption about its structure. This makes structure learning impossible. Nevertheless, we introduce a new definition, which we call learning to sample, that circumvents this obstacle. We give an algorithm that can learn to sample from a distribution within ϵn earthmover distance of the target distribution for any ϵ > 0. We obtain stronger results when we additionally assume high girth, as well as computational lower bounds showing that these are essentially optimal.
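To make the guarantee's scale concrete, the following sketch (an illustration under assumed inputs, not the paper's algorithm) computes the exact earthmover distance between two distributions on {0,1}^n under the Hamming ground metric by solving the transportation linear program with scipy; since n is the largest possible distance, ϵn is a vanishing fraction of the diameter.

```python
import numpy as np
from scipy.optimize import linprog

def earthmover_hamming(p, q, n):
    """Exact earthmover distance between distributions p, q on {0,1}^n under
    the Hamming ground metric, via the transportation LP:
    min sum_{x,y} c(x,y) f(x,y)  s.t.  row sums = p, column sums = q, f >= 0."""
    N = 2 ** n
    cost = np.array([[bin(x ^ y).count("1") for y in range(N)]
                     for x in range(N)], dtype=float)
    A_eq, b_eq = [], []
    for x in range(N):                    # row marginals must equal p
        row = np.zeros((N, N))
        row[x, :] = 1.0
        A_eq.append(row.ravel())
        b_eq.append(p[x])
    for y in range(N):                    # column marginals must equal q
        col = np.zeros((N, N))
        col[:, y] = 1.0
        A_eq.append(col.ravel())
        b_eq.append(q[y])
    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun

n = 3
rng = np.random.default_rng(4)
p = rng.dirichlet(np.ones(2 ** n))
q = rng.dirichlet(np.ones(2 ** n))
print(f"EMD(p, q) = {earthmover_hamming(p, q, n):.4f}  (diameter is n = {n})")
```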