skip to main content


Title: Gardner formula for Ising perceptron models at small densities
We consider the Ising perceptron model with N spins and M = N*alpha patterns, with a general activation function U that is bounded above. For U bounded away from zero, or U a one-sided threshold function, it was shown by Talagrand (2000, 2011) that for small densities alpha, the free energy of the model converges in the large-N limit to the replica symmetric formula conjectured in the physics literature (Krauth–Mezard 1989, see also Gardner–Derrida 1988). We give a new proof of this result, which covers the more general class of all functions U that are bounded above and satisfy a certain variance bound. The proof uses the (first and second) moment method conditional on the approximate message passing iterates of the model. In order to deduce our main theorem, we also prove a new concentration result for the perceptron model in the case where U is not bounded away from zero.  more » « less
Award ID(s):
2031883
NSF-PAR ID:
10344272
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Conference on Learning Theory
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Let Ω ⊂ ℝ n + 1 {\Omega\subset\mathbb{R}^{n+1}} , n ≥ 2 {n\geq 2} , be a 1-sided non-tangentially accessible domain (also known as uniform domain), that is, Ω satisfies the interior Corkscrew and Harnack chain conditions, which are respectively scale-invariant/quantitative versions of openness and path-connectedness. Let us assume also that Ω satisfies the so-called capacity density condition, a quantitative version of the fact that all boundary points are Wiener regular. Consider two real-valued (non-necessarily symmetric) uniformly elliptic operators L 0 ⁢ u = - div ⁡ ( A 0 ⁢ ∇ ⁡ u )   and   L ⁢ u = - div ⁡ ( A ⁢ ∇ ⁡ u ) L_{0}u=-\operatorname{div}(A_{0}\nabla u)\quad\text{and}\quad Lu=-%\operatorname{div}(A\nabla u) in Ω, and write ω L 0 {\omega_{L_{0}}} and ω L {\omega_{L}} for the respective associated elliptic measures. The goal of this article and its companion[M. Akman, S. Hofmann, J. M. Martell and T. Toro,Perturbation of elliptic operators in 1-sided NTA domains satisfying the capacity density condition,preprint 2021, https://arxiv.org/abs/1901.08261v3 ]is to find sufficient conditions guaranteeing that ω L {\omega_{L}} satisfies an A ∞ {A_{\infty}} -condition or a RH q {\operatorname{RH}_{q}} -condition with respect to ω L 0 {\omega_{L_{0}}} . In this paper, we are interested in obtaininga square function and non-tangential estimates for solutions of operators as before. We establish that bounded weak null-solutions satisfy Carleson measure estimates, with respect to the associated elliptic measure. We also show that for every weak null-solution, the associated square function can be controlled by the non-tangential maximal function in any Lebesgue space with respect to the associated elliptic measure. These results extend previous work ofDahlberg, Jerison and Kenig and are fundamental for the proof of the perturbation results in the paper cited above. 
    more » « less
  2. Extremal combinatorics often deals with problems of maximizing a specific quantity related to substructures in large discrete structures. The first question of this kind that comes to one's mind is perhaps determining the maximum possible number of induced subgraphs isomorphic to a fixed graph $H$ in an $n$-vertex graph. The asymptotic behavior of this number is captured by the limit of the ratio of the maximum number of induced subgraphs isomorphic to $H$ and the number of all subgraphs with the same number vertices as $H$; this quantity is known as the _inducibility_ of $H$. More generally, one can define the inducibility of a family of graphs in the analogous way.Among all graphs with $k$ vertices, the only two graphs with inducibility equal to one are the empty graph and the complete graph. However, how large can the inducibility of other graphs with $k$ vertices be? Fix $k$, consider a graph with $n$ vertices join each pair of vertices independently by an edge with probability $\binom{k}{2}^{-1}$. The expected number of $k$-vertex induced subgraphs with exactly one edge is $e^{-1}+o(1)$. So, the inducibility of large graphs with a single edge is at least $e^{-1}+o(1)$. This article establishes that this bound is the best possible in the following stronger form, which proves a conjecture of Alon, Hefetz, Krivelevich and Tyomkyn: the inducibility of the family of $k$-vertex graphs with exactly $l$ edges where $0 more » « less
  3. Real-time decision making in IoT applications relies upon space-efficient evaluation of queries over streaming data. To model the uncertainty in the classification of data being processed, we consider the model of probabilistic strings --- sequences of discrete probability distributions over a finite set of events, and initiate the study of space complexity of streaming computation for different classes of queries over such probabilistic strings. We first consider the problem of computing the probability that a word, sampled from the distribution defined by the probabilistic string read so far, is accepted by a given deterministic finite automaton. We show that this regular pattern matching problem can be solved using space that is only poly-logarithmic in the string length (and polynomial in the size of the DFA) if we are allowed a multiplicative approximation error. Then we show how to generalize this result to quantitative queries specified by additive cost register automata --- these are automata that map strings to numerical values using finite control and registers that get updated using linear transformations. Finally, we consider the case when updates in such an automaton involve tests, and in particular, when there is a counter variable that can be either incremented or decremented but decrements only apply when the counter value is non-zero. In this case, the desired answer depends on the probability distribution over the set of possible counter values that can range from 0 to n for a string of length n. Under a mild assumption, namely probabilities of the individual events are bounded away from 0 and 1, we show that there is an algorithm that can compute all n entries of this probability distribution vector to within additive 1/poly(n) error using space that is only Õ(n). In establishing these results, we introduce several new technical ideas that may prove useful for designing space-efficient algorithms for other query models over probabilistic strings. 
    more » « less
  4. null (Ed.)
    We establish a relation between the "large r" asymptotics of the Turaev-Viro invariants $TV_r $and the Gromov norm of 3-manifolds. We show that for any orientable, compact 3-manifold $M$, with (possibly empty) toroidal boundary, $log|TVr(M)|$ is bounded above by a function linear in $r$ and whose slope is a positive universal constant times the Gromov norm of $M$. The proof combines TQFT techniques, geometric decomposition theory of 3-manifolds and analytical estimates of $6j$-symbols. We obtain topological criteria that can be used to check whether the growth is actually exponential; that is one has $log|TVr(M)|\geq B r$, for some $B>0$. We use these criteria to construct infinite families of hyperbolic 3-manifolds whose $SO(3)$- Turaev-Viro invariants grow exponentially. These constructions are essential for the results of article [3] where we make progress on a conjecture of Andersen, Masbaum and Ueno about the geometric properties of surface mapping class groups detected by the quantum representations. We also study the behavior of the Turaev-Viro invariants under cutting and gluing of 3-manifolds along tori. In particular, we show that, like the Gromov norm, the values of the invariants do not increase under Dehn filling and we give applications of this result on the question of the extent to which relations between the invariants TVr and hyperbolic volume are preserved under Dehn filling. Finally we give constructions of 3-manifolds, both with zero and non-zero Gromov norm, for which the Turaev-Viro invariants determine the Gromov norm. 
    more » « less
  5. Abstract

    In a Merlin–Arthur proof system, the proof verifier (Arthur) accepts valid proofs (from Merlin) with probability 1, and rejects invalid proofs with probability arbitrarily close to 1. The running time of such a system is defined to be the length of Merlin’s proof plus the running time of Arthur. We provide new Merlin–Arthur proof systems for some key problems in fine-grained complexity. In several cases our proof systems have optimal running time. Our main results include:

    Certifying that a list ofnintegers has no 3-SUM solution can be done in Merlin–Arthur time$$\tilde{O}(n)$$O~(n). Previously, Carmosino et al. [ITCS 2016] showed that the problem has a nondeterministic algorithm running in$$\tilde{O}(n^{1.5})$$O~(n1.5)time (that is, there is a proof system with proofs of length$$\tilde{O}(n^{1.5})$$O~(n1.5)and a deterministic verifier running in$$\tilde{O}(n^{1.5})$$O~(n1.5)time).

    Counting the number ofk-cliques with total edge weight equal to zero in ann-node graph can be done in Merlin–Arthur time$${\tilde{O}}(n^{\lceil k/2\rceil })$$O~(nk/2)(where$$k\ge 3$$k3). For oddk, this bound can be further improved for sparse graphs: for example, counting the number of zero-weight triangles in anm-edge graph can be done in Merlin–Arthur time$${\tilde{O}}(m)$$O~(m). Previous Merlin–Arthur protocols by Williams [CCC’16] and Björklund and Kaski [PODC’16] could only countk-cliques in unweighted graphs, and had worse running times for smallk.

    Computing the All-Pairs Shortest Distances matrix for ann-node graph can be done in Merlin–Arthur time$$\tilde{O}(n^2)$$O~(n2). Note this is optimal, as the matrix can have$$\Omega (n^2)$$Ω(n2)nonzero entries in general. Previously, Carmosino et al. [ITCS 2016] showed that this problem has an$$\tilde{O}(n^{2.94})$$O~(n2.94)nondeterministic time algorithm.

    Certifying that ann-variablek-CNF is unsatisfiable can be done in Merlin–Arthur time$$2^{n/2 - n/O(k)}$$2n/2-n/O(k). We also observe an algebrization barrier for the previous$$2^{n/2}\cdot \textrm{poly}(n)$$2n/2·poly(n)-time Merlin–Arthur protocol of R. Williams [CCC’16] for$$\#$$#SAT: in particular, his protocol algebrizes, and we observe there is no algebrizing protocol fork-UNSAT running in$$2^{n/2}/n^{\omega (1)}$$2n/2/nω(1)time. Therefore we have to exploit non-algebrizing properties to obtain our new protocol.

    Certifying a Quantified Boolean Formula is true can be done in Merlin–Arthur time$$2^{4n/5}\cdot \textrm{poly}(n)$$24n/5·poly(n). Previously, the only nontrivial result known along these lines was an Arthur–Merlin–Arthur protocol (where Merlin’s proof depends on some of Arthur’s coins) running in$$2^{2n/3}\cdot \textrm{poly}(n)$$22n/3·poly(n)time.

    Due to the centrality of these problems in fine-grained complexity, our results have consequences for many other problems of interest. For example, our work implies that certifying there is no Subset Sum solution tonintegers can be done in Merlin–Arthur time$$2^{n/3}\cdot \textrm{poly}(n)$$2n/3·poly(n), improving on the previous best protocol by Nederlof [IPL 2017] which took$$2^{0.49991n}\cdot \textrm{poly}(n)$$20.49991n·poly(n)time.

     
    more » « less