skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Award ID contains: 1712730

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Lee, James R (Ed.)
    A conjecture of Hopkins (2018) posits that for certain high-dimensional hypothesis testing problems, no polynomial-time algorithm can outperform so-called "simple statistics", which are low-degree polynomials in the data. This conjecture formalizes the beliefs surrounding a line of recent work that seeks to understand statistical-versus-computational tradeoffs via the low-degree likelihood ratio. In this work, we refute the conjecture of Hopkins. However, our counterexample crucially exploits the specifics of the noise operator used in the conjecture, and we point out a simple way to modify the conjecture to rule out our counterexample. We also give an example illustrating that (even after the above modification), the symmetry assumption in the conjecture is necessary. These results do not undermine the low-degree framework for computational lower bounds, but rather aim to better understand what class of problems it is applicable to. 
    more » « less
  2. null (Ed.)
    We consider the problem of finding nearly optimal solutions of optimization problems with random objective functions. Such problems arise widely in the theory of random graphs, theoretical computer science, and statistical physics. Two concrete problems we consider are (a) optimizing the Hamiltonian of a spherical or Ising p-spin glass model, and (b) finding a large independent set in a sparse Erdos-Renyi graph. Two families of algorithms are considered: (a) low-degree polynomials of the input-a general framework that captures methods such as approximate message passing and local algorithms on sparse graphs, among others; and (b) the Langevin dynamics algorithm, a canonical Monte Carlo analogue of the gradient descent algorithm (applicable only for the spherical p-spin glass Hamiltonian). We show that neither family of algorithms can produce nearly optimal solutions with high probability. Our proof uses the fact that both models are known to exhibit a variant of the overlap gap property (OGP) of near-optimal solutions. Specifically, for both models, every two solutions whose objective values are above a certain threshold are either close or far from each other. The crux of our proof is the stability of both algorithms: a small perturbation of the input induces a small perturbation of the output. By an interpolation argument, such a stable algorithm cannot overcome the OGP barrier. The stability of the Langevin dynamics is an immediate consequence of the well-posedness of stochastic differential equations. The stability of low-degree polynomials is established using concepts from Gaussian and Boolean Fourier analysis, including noise sensitivity, hypercontractivity, and total influence. 
    more » « less
  3. Abernethy, Jacob; Agarwal, Shivani (Ed.)
    We study a variant of the sparse PCA (principal component analysis) problem in the “hard” regime, where the inference task is possible yet no polynomial-time algorithm is known to exist. Prior work, based on the low-degree likelihood ratio, has conjectured a precise expression for the best possible (sub-exponential) runtime throughout the hard regime. Following instead a statistical physics inspired point of view, we show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem. These free energy wells imply hitting time lower bounds that corroborate the low-degree conjecture: we show that a class of natural MCMC (Markov chain Monte Carlo) methods (with worst-case initialization) cannot solve sparse PCA with less than the conjectured runtime. These lower bounds apply to a wide range of values for two tuning parameters: temperature and sparsity misparametrization. Finally, we prove that the Overlap Gap Property (OGP), a structural property that implies failure of certain local search algorithms, holds in a significant part of the hard regime. 
    more » « less
  4. Given a random n × n symmetric matrix 𝑾 drawn from the Gaussian orthogonal ensemble (GOE), we consider the problem of certifying an upper bound on the maximum value of the quadratic form 𝒙^⊤ 𝑾 𝒙 over all vectors 𝒙 in a constraint set 𝒮 ⊂ ℝⁿ. For a certain class of normalized constraint sets we show that, conditional on a certain complexity-theoretic conjecture, no polynomial-time algorithm can certify a better upper bound than the largest eigenvalue of 𝑾. A notable special case included in our results is the hypercube 𝒮 = {±1/√n}ⁿ, which corresponds to the problem of certifying bounds on the Hamiltonian of the Sherrington-Kirkpatrick spin glass model from statistical physics. Our results suggest a striking gap between optimization and certification for this problem. Our proof proceeds in two steps. First, we give a reduction from the detection problem in the negatively-spiked Wishart model to the above certification problem. We then give evidence that this Wishart detection problem is computationally hard below the classical spectral threshold, by showing that no low-degree polynomial can (in expectation) distinguish the spiked and unspiked models. This method for predicting computational thresholds was proposed in a sequence of recent works on the sum-of-squares hierarchy, and is conjectured to be correct for a large class of problems. Our proof can be seen as constructing a distribution over symmetric matrices that appears computationally indistinguishable from the GOE, yet is supported on matrices whose maximum quadratic form over 𝒙 ∈ 𝒮 is much larger than that of a GOE matrix. 
    more » « less
  5. For the tensor PCA (principal component analysis) problem, we propose a new hierarchy of increasingly powerful algorithms with increasing runtime. Our hierarchy is analogous to the sumof-squares (SOS) hierarchy but is instead inspired by statistical physics and related algorithms such as belief propagation and AMP (approximate message passing). Our level-ℓ algorithm can be thought of as a linearized message-passing algorithm that keeps track of ℓ-wise dependencies among the hidden variables. Specifically, our algorithms are spectral methods based on the Kikuchi Hessian, which generalizes the well-studied Bethe Hessian to the higher-order Kikuchi free energies. It is known that AMP, the flagship algorithm of statistical physics, has substantially worse performance than SOS for tensor PCA. In this work we ‘redeem’ the statistical physics approach by showing that our hierarchy gives a polynomial-time algorithm matching the performance of SOS. Our hierarchy also yields a continuum of subexponential-time algorithms, and we prove that these achieve the same (conjecturally optimal) tradeoff between runtime and statistical power as SOS. Our proofs are much simpler than prior work, and also apply to the related problem of refuting random k-XOR formulas. The results we present here apply to tensor PCA for tensors of all orders, and to k-XOR when k is even. Our methods suggest a new avenue for systematically obtaining optimal algorithms for Bayesian inference problems, and our results constitute a step toward unifying the statistical physics and sum-of-squares approaches to algorithm design. 
    more » « less
  6. Anandkumar, Animashree (Ed.)
    Neural networks provide a rich class of high-dimensional, non-convex optimization problems. Despite their non-convexity, gradient-descent methods often successfully optimize these models. This has motivated a recent spur in research attempting to characterize properties of their loss surface that may explain such success. In this paper, we address this phenomenon by studying a key topological property of the loss: the presence or absence of spurious valleys, defined as connected components of sub-level sets that do not include a global minimum. Focusing on a class of two-layer neural networks defined by smooth (but generally non-linear) activation functions, we identify a notion of intrinsic dimension and show that it provides necessary and sufficient conditions for the absence of spurious valleys. More concretely, finite intrinsic dimension guarantees that for sufficiently overparametrised models no spurious valleys exist, independently of the data distribution. Conversely, infinite intrinsic dimension implies that spurious valleys do exist for certain data distributions, independently of model overparametrisation. Besides these positive and negative results, we show that, although spurious valleys may exist in general, they are confined to low risk levels and avoided with high probability on overparametrised models. 
    more » « less
  7. null (Ed.)
    Equiangular tight frames (ETFs) may be used to construct examples of feasible points for semidefinite programs arising in sum-of-squares (SOS) optimization. We show how generalizing the calculations in a recent work of the authors’ that explored this connection also yields new bounds on the sparsity of (both real and complex) ETFs. One corollary shows that Steiner ETFs corresponding to finite projective planes are optimally sparse in the sense of achieving tightness in a matrix inequality controlling overlaps between sparsity patterns of distinct rows of the synthesis matrix. We also formulate several natural open problems concerning further generalizations of our technique. 
    more » « less