skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Low-Depth Arithmetic Circuit Lower Bounds: Bypassing Set-Multilinearization
A recent breakthrough work of Limaye, Srinivasan and Tavenas [Nutan Limaye et al., 2021] proved superpolynomial lower bounds for low-depth arithmetic circuits via a "hardness escalation" approach: they proved lower bounds for low-depth set-multilinear circuits and then lifted the bounds to low-depth general circuits. In this work, we prove superpolynomial lower bounds for low-depth circuits by bypassing the hardness escalation, i.e., the set-multilinearization, step. As set-multilinearization comes with an exponential blow-up in circuit size, our direct proof opens up the possibility of proving an exponential lower bound for low-depth homogeneous circuits by evading a crucial bottleneck. Our bounds hold for the iterated matrix multiplication and the Nisan-Wigderson design polynomials. We also define a subclass of unrestricted depth homogeneous formulas which we call unique parse tree (UPT) formulas, and prove superpolynomial lower bounds for these. This significantly generalizes the superpolynomial lower bounds for regular formulas [Neeraj Kayal et al., 2014; Hervé Fournier et al., 2015].  more » « less
Award ID(s):
2152413
PAR ID:
10484523
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Etessami, Kousha; Feige, Uriel; Puppis, Gabriele
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Journal Name:
50th International Colloquium on Automata, Languages, and Programming
Subject(s) / Keyword(s):
arithmetic circuits low-depth circuits lower bounds shifted partials Theory of computation → Algebraic complexity theory
Format(s):
Medium: X
Location:
Paderborn, Germany
Sponsoring Org:
National Science Foundation
More Like this
  1. Santhanam, Rahul (Ed.)
    The recent breakthrough of Limaye, Srinivasan and Tavenas [Limaye et al., 2022] (LST) gave the first super-polynomial lower bounds against low-depth algebraic circuits, for any field of zero (or sufficiently large) characteristic. It was an open question to extend this result to small-characteristic ([Limaye et al., 2022; Govindasamy et al., 2022; Fournier et al., 2023]), which in particular is relevant for an approach to prove superpolynomial AC⁰[p]-Frege lower bounds ([Govindasamy et al., 2022]). In this work, we prove super-polynomial algebraic circuit lower bounds against low-depth algebraic circuits over any field, with the same parameters as LST (or even matching the improved parameters of Bhargav, Dutta, and Saxena [Bhargav et al., 2022]). We give two proofs. The first is logical, showing that even though the proof of LST naively fails in small characteristic, the proof is sufficiently algebraic that generic transfer results imply the result over characteristic zero implies the result over all fields. Motivated by this indirect proof, we then proceed to give a second constructive proof, replacing the field-dependent set-multilinearization result of LST with a set-multilinearization that works over any field, by using the Binet-Minc identity [Minc, 1979]. 
    more » « less
  2. We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of k ReLUs minimizing the squared error (for k>1) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error ϵ. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well known Densest k-Subgraph problem in subexponential time (these hypotheses are used separately in proving different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best fitting ReLU must run in time exponential in (1/epsilon)^2. Together with a previous work regarding improperly learning a ReLU (Goel et al., COLT'17), this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights giving new (worst-case) upper bounds on the running time needed to learn such networks both in the realizable and agnostic settings. Our upper bounds on the running time essentially matches our lower bounds in terms of the dependency on epsilon. 
    more » « less
  3. Guruswami, Venkatesan (Ed.)
    A fundamental problem in circuit complexity is to find explicit functions that require large depth to compute. When considering the natural DeMorgan basis of {OR,AND}, where negations incur no cost, the best known depth lower bounds for an explicit function in NP have the form (3-o(1))log₂ n, established by Håstad (building on others) in the early 1990s. We make progress on the problem of improving this factor of 3, in two different ways: - We consider an "algorithmic method" approach to proving stronger depth lower bounds for non-uniform circuits in the DeMorgan basis. We show that slightly faster algorithms (than what is known) for counting the number of satisfying assignments on subcubic-size DeMorgan formulas would imply supercubic-size DeMorgan formula lower bounds, implying that the depth must be at least (3+ε)log₂ n for some ε > 0. For example, if #SAT on formulas of size n^{2+2ε} can be solved in 2^{n - n^{1-ε}log^k n} time for some ε > 0 and a sufficiently large constant k, then there is a function computable in 2^{O(n)} time with a SAT oracle which does not have n^{3+ε}-size formulas. In fact, the #SAT algorithm only has to work on formulas that are a conjunction of n^{1-ε} subformulas, each of which is n^{1+3ε} size, in order to obtain the supercubic lower bound. As a proof of concept, we show that our new algorithms-to-lower-bounds connection can be applied to prove new lower bounds for "hybrid" DeMorgan formula models which compute interesting functions at their leaves. - Turning to the {NAND} basis, we establish a greater-than-(3 log₂ n) depth lower bound against uniform circuits solving the SAT problem, using an extension of the "indirect diagonalization" method for NAND formulas. Note that circuits over the NAND basis are a special case of circuits over the DeMorgan basis; however, hard functions such as Andreev’s function (known to require depth (3-o(1))log₂ n in the DeMorgan basis) can still be computed with NAND circuits of depth (3+o(1))log₂ n. Our results imply that SAT requires polylogtime-uniform NAND circuits of depth at least 3.603 log₂ n. 
    more » « less
  4. We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model. No general SQ lower bounds were known for learning ReLU networks of any depth in this setting: previous SQ lower bounds held only for adversarial noise models (agnostic learning) or restricted models such as correlational SQ. Prior work hinted at the impossibility of our result: Vempala and Wilmes showed that general SQ lower bounds cannot apply to any real-valued family of functions that satisfies a simple non-degeneracy condition. To circumvent their result, we refine a lifting procedure due to Daniely and Vardi that reduces Boolean PAC learning problems to Gaussian ones. We show how to extend their technique to other learning models and, in many well-studied cases, obtain a more efficient reduction. As such, we also prove new cryptographic hardness results for PAC learning two-hidden-layer ReLU networks, as well as new lower bounds for learning constant-depth ReLU networks from label queries. 
    more » « less
  5. The classical Beardwood-Halton-Hammersly theorem (1959) asserts the existence of an asymptotic formula of the form $$\beta \sqrt n$$ for the minimum length of a Traveling Salesperson Tour throuh $$n$$ random points in the unit square, and in the decades since it was proved, the existence of such formulas has been shown for other such \emph{Euclidean functionals} on random points in the unit square as well. Despite more than 50 years of attention, however, it remained unknown whether the minimum length TSP through $$n$$ random points in $[0,1]^2$ was asymptotically distinct from its natural lower bounds, such as the minimum length spanning tree, the minimum length 2-factor, or, as raised by Goemans and Bertsimas, from its linear programming relaxation. We prove that the TSP on random points in Euclidean space is indeed asymptotically distinct from these and other natural lower bounds, and show that this separation implies that branch-and-bound algorithms based on these natural lower bounds must take nearly exponential ($$e^{\tilde \Omega(n)}$$) time to solve the TSP to optimality, even in average case. This is the first average-case superpolynomial lower bound for these branch-and-bound algorithms (a lower bound as strong as $$e^{\tilde \Omega (n)}$$ was not even been known in worst-case analysis). 
    more » « less