skip to main content

Title: No Statistical-Computational Gap in Spiked Matrix Models with Generative Network Priors
We provide a non-asymptotic analysis of the spiked Wishart and Wigner matrix models with a generative neural network prior. Spiked random matrices have the form of a rank-one signal plus noise and have been used as models for high dimensional Principal Component Analysis (PCA), community detection and synchronization over groups. Depending on the prior imposed on the spike, these models can display a statistical-computational gap between the information theoretically optimal reconstruction error that can be achieved with unbounded computational resources and the sub-optimal performances of currently known polynomial time algorithms. These gaps are believed to be fundamental, as in the emblematic case of Sparse PCA. In stark contrast to such cases, we show that there is no statistical-computational gap under a generative network prior, in which the spike lies on the range of a generative neural network. Specifically, we analyze a gradient descent method for minimizing a nonlinear least squares objective over the range of an expansive-Gaussian neural network and show that it can recover in polynomial time an estimate of the underlying spike with a rate-optimal sample complexity and dependence on the noise level.
Authors:
; ;
Award ID(s):
1848087
Publication Date:
NSF-PAR ID:
10252108
Journal Name:
Entropy
Volume:
23
Issue:
1
Page Range or eLocation-ID:
115
ISSN:
1099-4300
Sponsoring Org:
National Science Foundation
More Like this
  1. Clustering is a fundamental primitive in unsupervised learning which gives rise to a rich class of computationally-challenging inference tasks. In this work, we focus on the canonical task of clustering d-dimensional Gaussian mixtures with unknown (and possibly degenerate) covariance. Recent works (Ghosh et al. ’20; Mao, Wein ’21; Davis, Diaz, Wang ’21) have established lower bounds against the class of low-degree polynomial methods and the sum-of-squares (SoS) hierarchy for recovering certain hidden structures planted in Gaussian clustering instances. Prior work on many similar inference tasks portends that such lower bounds strongly suggest the presence of an inherent statistical-to-computational gap for clustering, that is, a parameter regime where the clustering task is statistically possible but no polynomial-time algorithm succeeds. One special case of the clustering task we consider is equivalent to the problem of finding a planted hypercube vector in an otherwise random subspace. We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap, despite the aforementioned low-degree and SoS lower bounds. To achieve this, we give an algorithm based on Lenstra–Lenstra–Lovász lattice basis reduction which achieves the statistically-optimal sample complexity of d + 1 samples. This result extends the class of problems whose conjectured statistical-to-computationalmore »gaps can be “closed” by “brittle” polynomial-time algorithms, highlighting the crucial but subtle role of noise in the onset of statistical-to-computational gaps.« less
  2. Given a random n × n symmetric matrix 𝑾 drawn from the Gaussian orthogonal ensemble (GOE), we consider the problem of certifying an upper bound on the maximum value of the quadratic form 𝒙^⊤ 𝑾 𝒙 over all vectors 𝒙 in a constraint set 𝒮 ⊂ ℝⁿ. For a certain class of normalized constraint sets we show that, conditional on a certain complexity-theoretic conjecture, no polynomial-time algorithm can certify a better upper bound than the largest eigenvalue of 𝑾. A notable special case included in our results is the hypercube 𝒮 = {±1/√n}ⁿ, which corresponds to the problem of certifying bounds on the Hamiltonian of the Sherrington-Kirkpatrick spin glass model from statistical physics. Our results suggest a striking gap between optimization and certification for this problem. Our proof proceeds in two steps. First, we give a reduction from the detection problem in the negatively-spiked Wishart model to the above certification problem. We then give evidence that this Wishart detection problem is computationally hard below the classical spectral threshold, by showing that no low-degree polynomial can (in expectation) distinguish the spiked and unspiked models. This method for predicting computational thresholds was proposed in a sequence of recent works on the sum-of-squaresmore »hierarchy, and is conjectured to be correct for a large class of problems. Our proof can be seen as constructing a distribution over symmetric matrices that appears computationally indistinguishable from the GOE, yet is supported on matrices whose maximum quadratic form over 𝒙 ∈ 𝒮 is much larger than that of a GOE matrix.« less
  3. Abernethy, Jacob ; Agarwal, Shivani (Ed.)
    We study a variant of the sparse PCA (principal component analysis) problem in the “hard” regime, where the inference task is possible yet no polynomial-time algorithm is known to exist. Prior work, based on the low-degree likelihood ratio, has conjectured a precise expression for the best possible (sub-exponential) runtime throughout the hard regime. Following instead a statistical physics inspired point of view, we show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem. These free energy wells imply hitting time lower bounds that corroborate the low-degree conjecture: we show that a class of natural MCMC (Markov chain Monte Carlo) methods (with worst-case initialization) cannot solve sparse PCA with less than the conjectured runtime. These lower bounds apply to a wide range of values for two tuning parameters: temperature and sparsity misparametrization. Finally, we prove that the Overlap Gap Property (OGP), a structural property that implies failure of certain local search algorithms, holds in a significant part of the hard regime.
  4. Table of Contents: Foreword by the CI 2016 Workshop Chairs …………………………………vi Foreword by the CI 2016 Steering Committee ..…………………………..…..viii List of Organizing Committee ………………………….……....x List of Registered Participants .………………………….……..xi Acknowledgement of Sponsors ……………………………..…xiv Hackathon and Workshop Agenda .………………………………..xv Hackathon Summary .………………………….…..xviii Invited talks - abstracts and links to presentations ………………………………..xxi Proceedings: 34 short research papers ……………………………….. 1-135 Papers 1. BAYESIAN MODELS FOR CLIMATE RECONSTRUCTION FROM POLLEN RECORDS ..................................... 1 Lasse Holmström, Liisa Ilvonen, Heikki Seppä, Siim Veski 2. ON INFORMATION CRITERIA FOR DYNAMIC SPATIO-TEMPORAL CLUSTERING ..................................... 5 Ethan D. Schaeffer, Jeremy M. Testa, Yulia R. Gel, Vyacheslav Lyubchich 3. DETECTING MULTIVARIATE BIOSPHERE EXTREMES ..................................... 9 Yanira Guanche García, Erik Rodner, Milan Flach, Sebastian Sippel, Miguel Mahecha, Joachim Denzler 4. SPATIO-TEMPORAL GENERATIVE MODELS FOR RAINFALL OVER INDIA ..................................... 13 Adway Mitra 5. A NONPARAMETRIC COPULA BASED BIAS CORRECTION METHOD FOR STATISTICAL DOWNSCALING ..................................... 17 Yi Li, Adam Ding, Jennifer Dy 6. DETECTING AND PREDICTING BEAUTIFUL SUNSETS USING SOCIAL MEDIA DATA ..................................... 21 Emma Pierson 7. OCEANTEA: EXPLORING OCEAN-DERIVED CLIMATE DATA USING MICROSERVICES ..................................... 25 Arne N. Johanson, Sascha Flögel, Wolf-Christian Dullo, Wilhelm Hasselbring 8. IMPROVED ANALYSIS OF EARTH SYSTEM MODELS AND OBSERVATIONS USING SIMPLE CLIMATE MODELS ..................................... 29 Balu Nadiga, Nathanmore »Urban 9. SYNERGY AND ANALOGY BETWEEN 15 YEARS OF MICROWAVE SST AND ALONG-TRACK SSH ..................................... 33 Pierre Tandeo, Aitor Atencia, Cristina Gonzalez-Haro 10. PREDICTING EXECUTION TIME OF CLIMATE-DRIVEN ECOLOGICAL FORECASTING MODELS ..................................... 37 Scott Farley and John W. Williams 11. SPATIOTEMPORAL ANALYSIS OF SEASONAL PRECIPITATION OVER US USING CO-CLUSTERING ..................................... 41 Mohammad Gorji–Sefidmazgi, Clayton T. Morrison 12. PREDICTION OF EXTREME RAINFALL USING HYBRID CONVOLUTIONAL-LONG SHORT TERM MEMORY NETWORKS ..................................... 45 Sulagna Gope, Sudeshna Sarkar, Pabitra Mitra 13. SPATIOTEMPORAL PATTERN EXTRACTION WITH DATA-DRIVEN KOOPMAN OPERATORS FOR CONVECTIVELY COUPLED EQUATORIAL WAVES ..................................... 49 Joanna Slawinska, Dimitrios Giannakis 14. COVARIANCE STRUCTURE ANALYSIS OF CLIMATE MODEL OUTPUT ..................................... 53 Chintan Dalal, Doug Nychka, Claudia Tebaldi 15. SIMPLE AND EFFICIENT TENSOR REGRESSION FOR SPATIOTEMPORAL FORECASTING ..................................... 57 Rose Yu, Yan Liu 16. TRACKING OF TROPICAL INTRASEASONAL CONVECTIVE ANOMALIES ..................................... 61 Bohar Singh, James L. Kinter 17. ANALYSIS OF AMAZON DROUGHTS USING SUPERVISED KERNEL PRINCIPAL COMPONENT ANALYSIS ..................................... 65 Carlos H. R. Lima, Amir AghaKouchak 18. A BAYESIAN PREDICTIVE ANALYSIS OF DAILY PRECIPITATION DATA ..................................... 69 Sai K. Popuri, Nagaraj K. Neerchal, Amita Mehta 19. INCORPORATING PRIOR KNOWLEDGE IN SPATIO-TEMPORAL NEURAL NETWORK FOR CLIMATIC DATA ..................................... 73 Arthur Pajot, Ali Ziat, Ludovic Denoyer, Patrick Gallinari 20. DIMENSIONALITY-REDUCTION OF CLIMATE DATA USING DEEP AUTOENCODERS ..................................... 77 Juan A. Saenz, Nicholas Lubbers, Nathan M. Urban 21. MAPPING PLANTATION IN INDONESIA ..................................... 81 Xiaowei Jia, Ankush Khandelwal, James Gerber, Kimberly Carlson, Paul West, Vipin Kumar 22. FROM CLIMATE DATA TO A WEIGHTED NETWORK BETWEEN FUNCTIONAL DOMAINS ..................................... 85 Ilias Fountalis, Annalisa Bracco, Bistra Dilkina, Constantine Dovrolis 23. EMPLOYING SOFTWARE ENGINEERING PRINCIPLES TO ENHANCE MANAGEMENT OF CLIMATOLOGICAL DATASETS FOR CORAL REEF ANALYSIS ..................................... 89 Mark Jenne, M.M. Dalkilic, Claudia Johnson 24. Profiler Guided Manual Optimization for Accelerating Cholesky Decomposition on R Environment ..................................... 93 V.B. Ramakrishnaiah, R.P. Kumar, J. Paige, D. Hammerling, D. Nychka 25. GLOBAL MONITORING OF SURFACE WATER EXTENT DYNAMICS USING SATELLITE DATA ..................................... 97 Anuj Karpatne, Ankush Khandelwal and Vipin Kumar 26. TOWARD QUANTIFYING TROPICAL CYCLONE RISK USING DIAGNOSTIC INDICES .................................... 101 Erica M. Staehling and Ryan E. Truchelut 27. OPTIMAL TROPICAL CYCLONE INTENSITY ESTIMATES WITH UNCERTAINTY FROM BEST TRACK DATA .................................... 105 Suz Tolwinski-Ward 28. EXTREME WEATHER PATTERN DETECTION USING DEEP CONVOLUTIONAL NEURAL NETWORK .................................... 109 Yunjie Liu, Evan Racah, Prabhat, Amir Khosrowshahi, David Lavers, Kenneth Kunkel, Michael Wehner, William Collins 29. INFORMATION TRANSFER ACROSS TEMPORAL SCALES IN ATMOSPHERIC DYNAMICS .................................... 113 Nikola Jajcay and Milan Paluš 30. Identifying precipitation regimes in China using model-based clustering of spatial functional data .................................... 117 Haozhe Zhang, Zhengyuan Zhu, Shuiqing Yin 31. RELATIONAL RECURRENT NEURAL NETWORKS FOR SPATIOTEMPORAL INTERPOLATION FROM MULTI-RESOLUTION CLIMATE DATA .................................... 121 Guangyu Li, Yan Liu 32. OBJECTIVE SELECTION OF ENSEMBLE BOUNDARY CONDITIONS FOR CLIMATE DOWNSCALING .................................... 124 Andrew Rhines, Naomi Goldenson 33. LONG-LEAD PREDICTION OF EXTREME PRECIPITATION CLUSTER VIA A SPATIO-TEMPORAL CONVOLUTIONAL NEURAL NETWORK .................................... 128 Yong Zhuang, Wei Ding 34. MULTIPLE INSTANCE LEARNING FOR BURNED AREA MAPPING USING MULTI –TEMPORAL REFLECTANCE DATA .................................... 132 Guruprasad Nayak, Varun Mithal, Vipin Kumar« less
  5. Raynal, Ann M. ; Ranney, Kenneth I. (Ed.)
    Most research in technologies for the Deaf community have focused on translation using either video or wearable devices. Sensor-augmented gloves have been reported to yield higher gesture recognition rates than camera-based systems; however, they cannot capture information expressed through head and body movement. Gloves are also intrusive and inhibit users in their pursuit of normal daily life, while cameras can raise concerns over privacy and are ineffective in the dark. In contrast, RF sensors are non-contact, non-invasive and do not reveal private information even if hacked. Although RF sensors are unable to measure facial expressions or hand shapes, which would be required for complete translation, this paper aims to exploit near real-time ASL recognition using RF sensors for the design of smart Deaf spaces. In this way, we hope to enable the Deaf community to benefit from advances in technologies that could generate tangible improvements in their quality of life. More specifically, this paper investigates near real-time implementation of machine learning and deep learning architectures for the purpose of sequential ASL signing recognition. We utilize a 60 GHz RF sensor which transmits a frequency modulation continuous wave (FMWC waveform). RF sensors can acquire a unique source of information that ismore »inaccessible to optical or wearable devices: namely, a visual representation of the kinematic patterns of motion via the micro-Doppler signature. Micro-Doppler refers to frequency modulations that appear about the central Doppler shift, which are caused by rotational or vibrational motions that deviate from principle translational motion. In prior work, we showed that fractal complexity computed from RF data could be used to discriminate signing from daily activities and that RF data could reveal linguistic properties, such as coarticulation. We have also shown that machine learning can be used to discriminate with 99% accuracy the signing of native Deaf ASL users from that of copysigning (or imitation signing) by hearing individuals. Therefore, imitation signing data is not effective for directly training deep models. But, adversarial learning can be used to transform imitation signing to resemble native signing, or, alternatively, physics-aware generative models can be used to synthesize ASL micro-Doppler signatures for training deep neural networks. With such approaches, we have achieved over 90% recognition accuracy of 20 ASL signs. In natural environments, however, near real-time implementations of classification algorithms are required, as well as an ability to process data streams in a continuous and sequential fashion. In this work, we focus on extensions of our prior work towards this aim, and compare the efficacy of various approaches for embedding deep neural networks (DNNs) on platforms such as a Raspberry Pi or Jetson board. We examine methods for optimizing the size and computational complexity of DNNs for embedded micro-Doppler analysis, methods for network compression, and their resulting sequential ASL recognition performance.« less