Title: Diffusions interacting through a random matrix: universality via stochastic Taylor expansion
Abstract: Consider $(X_i(t))$ solving a system of $N$ stochastic differential equations interacting through a random matrix $\mathbf{J} = (J_{ij})$ with independent (not necessarily identically distributed) random coefficients. We show that the trajectories of averaged observables of $(X_i(t))$, initialized from some $\mu$ independent of $\mathbf{J}$, are universal, i.e., only depend on the choice of the distribution of $\mathbf{J}$ through its first and second moments (assuming, e.g., sub-exponential tails). We take a general combinatorial approach to proving universality for dynamical systems with random coefficients, combining a stochastic Taylor expansion with a moment matching-type argument. Concrete settings for which our results imply universality include aging in the spherical SK spin glass, and Langevin dynamics and gradient flows for symmetric and asymmetric Hopfield networks.
Award ID(s):
1954337
PAR ID:
10349917
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Probability Theory and Related Fields
Volume:
180
Issue:
3-4
ISSN:
0178-8051
Page Range / eLocation ID:
1057 to 1097
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
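The universality statement in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's setting: the linear drift, the coupling constant `0.5`, the system size, the step size, and the observable $\frac{1}{N}\sum_i X_i(t)^2$ are all assumptions chosen for illustration. It runs an Euler-Maruyama discretization twice with the same initial condition and the same Brownian increments, once with Gaussian entries $J_{ij}$ and once with Rademacher ($\pm 1$) entries, which share mean 0 and variance 1.

```python
import math
import random

def simulate(entry_sampler, n=30, steps=200, dt=0.01, seed=0):
    """Euler-Maruyama for the assumed system
        dX_i = (0.5 / sqrt(n) * sum_j J_ij X_j - X_i) dt + dB_i.
    Returns the trajectory of the averaged observable (1/n) * sum_i X_i(t)^2."""
    rng_matrix = random.Random(seed)        # randomness of J only
    rng_dynamics = random.Random(seed + 1)  # initial condition and Brownian noise
    # Independent entries; only their first two moments (0 and 1) are matched.
    J = [[entry_sampler(rng_matrix) for _ in range(n)] for _ in range(n)]
    # Initial condition drawn independently of J.
    x = [rng_dynamics.gauss(0.0, 1.0) for _ in range(n)]
    scale = 0.5 / math.sqrt(n)
    trajectory = []
    for _ in range(steps):
        trajectory.append(sum(v * v for v in x) / n)
        drift = [scale * sum(j * xj for j, xj in zip(row, x)) - xi
                 for row, xi in zip(J, x)]
        x = [xi + d * dt + math.sqrt(dt) * rng_dynamics.gauss(0.0, 1.0)
             for xi, d in zip(x, drift)]
    return trajectory

def gaussian(rng):
    return rng.gauss(0.0, 1.0)

def rademacher(rng):
    return rng.choice((-1.0, 1.0))

traj_gauss = simulate(gaussian)
traj_rad = simulate(rademacher)
```

Comparing `traj_gauss` and `traj_rad` entry by entry for increasing `n` gives an informal feel for the claim; the sketch makes no attempt to reproduce the paper's quantitative bounds.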
More Like this
  1. The cumulative empirical spectral measure (CESM) $\Phi[\mathbf{A}] : \mathbb{R} \to [0,1]$ of an $n\times n$ symmetric matrix $\mathbf{A}$ is defined as the fraction of eigenvalues of $\mathbf{A}$ less than or equal to a given threshold, i.e., $\Phi[\mathbf{A}](x) := \sum_{i=1}^{n} \frac{1}{n} \mathbb{1}[ \lambda_i[\mathbf{A}]\leq x]$. Spectral sums $\operatorname{tr}(f[\mathbf{A}])$ can be computed as the Riemann–Stieltjes integral of $f$ against $\Phi[\mathbf{A}]$, so the task of estimating CESM arises frequently in a number of applications, including machine learning. We present an error analysis for stochastic Lanczos quadrature (SLQ). We show that SLQ obtains an approximation to the CESM within a Wasserstein distance of $t \: | \lambda_{\text{max}}[\mathbf{A}] - \lambda_{\text{min}}[\mathbf{A}] |$ with probability at least $1-\eta$, by applying the Lanczos algorithm for $\lceil 12 t^{-1} + \frac{1}{2} \rceil$ iterations to $\lceil 4 ( n+2 )^{-1}t^{-2} \ln(2n\eta^{-1}) \rceil$ vectors sampled independently and uniformly from the unit sphere. We additionally provide (matrix-dependent) a posteriori error bounds for the Wasserstein and Kolmogorov–Smirnov distances between the output of this algorithm and the true CESM. The quality of our bounds is demonstrated using numerical experiments.
  2. Abstract When $k$ and $s$ are natural numbers and ${\mathbf h}\in {\mathbb Z}^k$, denote by $J_{s,k}(X;\,{\mathbf h})$ the number of integral solutions of the system $$ \sum_{i=1}^s(x_i^j-y_i^j)=h_j\quad (1\leqslant j\leqslant k), $$ with $1\leqslant x_i,y_i\leqslant X$. When $s\lt k(k+1)/2$ and $(h_1,\ldots ,h_{k-1})\ne {\mathbf 0}$, Brandes and Hughes have shown that $J_{s,k}(X;\,{\mathbf h})=o(X^s)$. In this paper we improve on quantitative aspects of this result, and, subject to an extension of the main conjecture in Vinogradov’s mean value theorem, we obtain an asymptotic formula for $J_{s,k}(X;\,{\mathbf h})$ in the critical case $s=k(k+1)/2$. The latter requires minor arc estimates going beyond square-root cancellation.
  3. Abstract For $p\geq 1$ and $(g_{ij})_{1\leq i,j\leq n}$ being a matrix of i.i.d. standard Gaussian entries, we study the $n$-limit of the $\ell _p$-Gaussian–Grothendieck problem defined as $$ \max\Bigl\{\sum_{i,j=1}^n g_{ij}x_ix_j: x\in \mathbb{R}^n,\sum_{i=1}^n |x_i|^p=1\Bigr\}. $$ The case $p=2$ corresponds to the top eigenvalue of the Gaussian orthogonal ensemble; when $p=\infty $, the maximum value is essentially the ground state energy of the Sherrington–Kirkpatrick mean-field spin glass model and its limit can be expressed by the famous Parisi formula. In the present work, we focus on the cases $1\leq p<2$ and $2<p<\infty$. For the former, we compute the limit of the $\ell _p$-Gaussian–Grothendieck problem and investigate the structure of the set of all near optimizers along with stability estimates. In the latter case, we show that this problem admits a Parisi-type variational representation and the corresponding optimizer is weakly delocalized in the sense that its entries vanish uniformly in a polynomial order of $n^{-1}$.
  4. Abstract We propose a new approach to deriving quantitative mean field approximations for any probability measure $P$ on $\mathbb {R}^{n}$ with density proportional to $e^{f(x)}$, for $f$ strongly concave. We bound the mean field approximation for the log partition function $\log \int e^{f(x)}dx$ in terms of $\sum _{i \neq j}\mathbb {E}_{Q^{*}}|\partial _{ij}f|^{2}$, for a semi-explicit probability measure $Q^{*}$ characterized as the unique mean field optimizer, or equivalently as the minimizer of the relative entropy $H(\cdot \,|\,P)$ over product measures. This notably does not involve metric-entropy or gradient-complexity concepts which are common in prior work on nonlinear large deviations. Three implications are discussed, in the contexts of continuous Gibbs measures on large graphs, high-dimensional Bayesian linear regression, and the construction of decentralized near-optimizers in high-dimensional stochastic control problems. Our arguments are based primarily on functional inequalities and the notion of displacement convexity from optimal transport.
  5. Over the last two decades, a significant line of work in theoretical algorithms has made progress in solving linear systems of the form $\mathbf{L}\mathbf{x} = \mathbf{b}$, where $\mathbf{L}$ is the Laplacian matrix of a weighted graph with weights $w(i,j)>0$ on the edges. The solution $\mathbf{x}$ of the linear system can be interpreted as the potentials of an electrical flow in which the resistance on edge $(i,j)$ is $1/w(i,j)$. Kelner, Orecchia, Sidford, and Zhu [KOSZ13] give a combinatorial, near-linear time algorithm that maintains the Kirchhoff Current Law, and gradually enforces the Kirchhoff Potential Law by updating flows around cycles (cycle toggling). In this paper, we consider a dual version of the algorithm that maintains the Kirchhoff Potential Law, and gradually enforces the Kirchhoff Current Law by cut toggling: each iteration updates all potentials on one side of a fundamental cut of a spanning tree by the same amount. We prove that this dual algorithm also runs in a near-linear number of iterations. We show, however, that if we abstract cut toggling as a natural data structure problem, this problem can be reduced to the online vector-matrix-vector problem (OMv), which has been conjectured to be difficult for dynamic algorithms [HKNS15]. The conjecture implies that the data structure does not have an $O(n^{1-\epsilon})$ time algorithm for any $\epsilon > 0$, and thus a straightforward implementation of the cut-toggling algorithm requires essentially linear time per iteration. To circumvent the lower bound, we batch update steps, and perform them simultaneously instead of sequentially. An appropriate choice of batching leads to an $\widetilde{O}(m^{1.5})$ time cut-toggling algorithm for solving Laplacian systems. Furthermore, we show that if we sparsify the graph and call our algorithm recursively on the Laplacian system implied by batching and sparsifying, we can reduce the running time to $O(m^{1 + \epsilon})$ for any $\epsilon > 0$. Thus, the dual cut-toggling algorithm can achieve (almost) the same running time as its primal cycle-toggling counterpart.
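The CESM definition quoted in item 1 of the list above can be made concrete with a small sketch. It assumes the eigenvalues are already known (the function names `cesm` and `spectral_sum` are hypothetical, and this is the exact definition, not the stochastic Lanczos quadrature estimator analyzed in that paper). Because $\Phi[\mathbf{A}]$ is a step function with jumps of size (multiplicity)$/n$ at the eigenvalues, the Riemann–Stieltjes integral of $f$ against it equals $\frac{1}{n}\sum_i f(\lambda_i)$, so multiplying by $n$ recovers $\operatorname{tr}(f[\mathbf{A}])$.

```python
from bisect import bisect_right

def cesm(eigenvalues, x):
    """Phi[A](x): fraction of eigenvalues that are <= x."""
    eigs = sorted(eigenvalues)
    return bisect_right(eigs, x) / len(eigs)

def spectral_sum(eigenvalues, f):
    """tr(f[A]) computed as n times the Riemann-Stieltjes integral of f
    against the CESM, i.e. n * sum over distinct eigenvalues of
    f(lambda) * (jump of the CESM at lambda)."""
    n = len(eigenvalues)
    integral = 0.0
    previous = 0.0
    for lam in sorted(set(eigenvalues)):
        current = cesm(eigenvalues, lam)
        integral += f(lam) * (current - previous)  # jump = multiplicity / n
        previous = current
    return n * integral

# Hypothetical example: eigenvalues of diag(1, 2, 2, 4).
example_eigs = [1.0, 2.0, 2.0, 4.0]
fraction_at_two = cesm(example_eigs, 2.0)
trace_of_square = spectral_sum(example_eigs, lambda t: t * t)
```

For a large matrix the eigenvalues are not available cheaply, which is the situation the approximation studied in item 1 addresses.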