The goal of compressed sensing is to estimate a high-dimensional vector from an underdetermined system of noisy linear equations. In analogy to classical compressed sensing, here we assume a generative model as a prior, that is, we assume the vector is represented by a deep generative model G : R^k → R^n. Classical recovery approaches such as empirical risk minimization (ERM) are guaranteed to succeed when the measurement matrix is sub-Gaussian. However, when the measurement matrix and measurements are heavy-tailed or have outliers, recovery may fail dramatically. In this paper we propose an algorithm inspired by the Median-of-Means (MOM). Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers. Theoretically, our results show our novel MOM-based algorithm enjoys the same sample complexity guarantees as ERM under sub-Gaussian assumptions. Our experiments validate both aspects of our claims: other algorithms are indeed fragile and fail under heavy-tailed and/or corrupted data, while our approach exhibits the predicted robustness.
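The scalar Median-of-Means estimator that inspires the approach above is simple to sketch. This is the basic MOM estimator, not the paper's MOM-based recovery algorithm for generative priors; the bucket count and data below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def median_of_means(x, num_buckets):
    """Median-of-Means: randomly split samples into buckets, average each
    bucket, and return the median of the bucket means."""
    x = rng.permutation(x)                  # random bucket assignment
    buckets = np.array_split(x, num_buckets)
    return np.median([b.mean() for b in buckets])

# Heavy-tailed sample (standard Cauchy) contaminated with a few gross outliers.
x = np.concatenate([rng.standard_cauchy(1000), np.full(10, 1e6)])
print(median_of_means(x, 31))   # stays near the true center 0
print(x.mean())                 # the empirical mean is destroyed by the outliers
```

Because at most a few buckets can be corrupted by the outliers, the median over bucket means is barely affected, while the plain empirical mean is pulled to roughly 10^4.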
Constant-Expansion Suffices for Compressed Sensing with Generative Priors.
Generative neural networks have been empirically found very promising in providing effective structural priors for compressed sensing, since they can be trained to span low-dimensional data manifolds in high-dimensional signal spaces. Despite the non-convexity of the resulting optimization problem, it has also been shown theoretically that, for neural networks with random Gaussian weights, a signal in the range of the network can be efficiently, approximately recovered from a few noisy measurements. However, a major bottleneck of these theoretical guarantees is a network expansivity condition: that each layer of the neural network must be larger than the previous one by a logarithmic factor. Our main contribution is to break this strong expansivity assumption, showing that constant expansivity suffices for efficient recovery algorithms, and that it is also information-theoretically necessary. To overcome the theoretical bottleneck in existing approaches we prove a novel uniform concentration theorem for random functions that might not be Lipschitz but satisfy a relaxed notion which we call "pseudo-Lipschitzness." Using this theorem we show that a matrix concentration inequality known as the Weight Distribution Condition (WDC), which was previously only known to hold for Gaussian matrices with logarithmic aspect ratio, in fact holds for constant aspect ratios too.
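The recovery problem behind these guarantees amounts to gradient descent on the measurement error over the latent space. The following is a minimal illustration, assuming a tiny one-layer ReLU generator with random Gaussian weights in place of a deep network; all dimensions, seeds, and step sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
k, n, m = 5, 40, 20                  # latent, signal, and measurement dimensions

# Tiny one-layer ReLU generator G(z) = relu(W z) with random Gaussian weights
# (a stand-in for a deep generative model), and a Gaussian measurement matrix A.
W = rng.normal(size=(n, k)) / np.sqrt(k)
A = rng.normal(size=(m, n)) / np.sqrt(m)

def G(z):
    return np.maximum(W @ z, 0.0)

z_true = rng.normal(size=k)
y = A @ G(z_true)                    # noiseless measurements of a signal in range(G)

# Gradient descent on f(z) = 0.5 * ||A G(z) - y||^2, masking inactive ReLU units.
z = rng.normal(size=k)
losses = []
for _ in range(2000):
    pre = W @ z
    res = A @ np.maximum(pre, 0.0) - y
    losses.append(0.5 * res @ res)
    grad = W.T @ ((A.T @ res) * (pre > 0))
    z -= 0.01 * grad
print(losses[0], losses[-1])         # objective typically drops by orders of magnitude
```

Despite the non-convexity, gradient descent from a random start typically drives the measurement error down sharply, which is the phenomenon the theoretical guarantees formalize.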
Award ID(s): 1741137
Publication Date:
NSF-PAR ID: 10228234
Journal Name: 34th Annual Conference on Neural Information Processing Systems (NeurIPS), NeurIPS 2020
Sponsoring Org: National Science Foundation
More Like this


The plug-and-play priors (PnP) and regularization by denoising (RED) methods have become widely used for solving inverse problems by leveraging pretrained deep denoisers as image priors. While the empirical imaging performance and the theoretical convergence properties of these algorithms have been widely investigated, their recovery properties have not previously been theoretically analyzed. We address this gap by showing how to establish theoretical recovery guarantees for PnP/RED by assuming that the solution of these methods lies near the fixed points of a deep neural network. We also present numerical results comparing the recovery performance of PnP/RED in compressive sensing against that of recent compressive sensing algorithms based on generative models. Our numerical results suggest that PnP with a pretrained artifact-removal network provides significantly better results compared to the existing state-of-the-art methods.
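A minimal sketch of the PnP iteration described above, with a simple moving-average smoother standing in for the pretrained deep denoiser; the signal, smoother, and step size are assumptions chosen only to keep the sketch self-contained:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 64, 32
A = rng.normal(size=(m, n)) / np.sqrt(m)

# Ground truth: a piecewise-constant signal, the kind of structure the denoiser favors.
x_true = np.repeat(rng.normal(size=8), 8)
y = A @ x_true + 0.01 * rng.normal(size=m)

def denoise(x):
    # Stand-in prior: a moving-average smoother. Real PnP plugs in a pretrained
    # deep denoiser here; the smoother is an assumption for illustration.
    return np.convolve(x, np.ones(5) / 5, mode="same")

# PnP proximal-gradient iteration: a gradient step on the data-fidelity term,
# then the denoiser applied in place of a proximal operator.
gamma = 0.2
x = np.zeros(n)
for _ in range(300):
    x = denoise(x - gamma * A.T @ (A @ x - y))
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(rel_err)                        # typically well below 1 (the zero-estimate error)
```

The denoiser acts as an implicit prior: the iteration settles near a fixed point that is both consistent with the measurements and favored by the denoiser, which is exactly the fixed-point structure the recovery analysis assumes.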

We provide a non-asymptotic analysis of the spiked Wishart and Wigner matrix models with a generative neural network prior. Spiked random matrices have the form of a rank-one signal plus noise and have been used as models for high-dimensional Principal Component Analysis (PCA), community detection, and synchronization over groups. Depending on the prior imposed on the spike, these models can display a statistical-computational gap between the information-theoretically optimal reconstruction error that can be achieved with unbounded computational resources and the suboptimal performance of currently known polynomial-time algorithms. These gaps are believed to be fundamental, as in the emblematic case of sparse PCA. In stark contrast to such cases, we show that there is no statistical-computational gap under a generative network prior, in which the spike lies in the range of a generative neural network. Specifically, we analyze a gradient descent method for minimizing a nonlinear least squares objective over the range of an expansive Gaussian neural network and show that it can recover, in polynomial time, an estimate of the underlying spike with a rate-optimal sample complexity and dependence on the noise level.
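To make the spiked Wigner model itself concrete (without any generative prior), the following sketch builds a rank-one-plus-noise matrix and recovers the spike by plain power iteration, i.e. vanilla PCA; the SNR, dimensions, and ±1 spike are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam = 400, 3.0                     # dimension and signal-to-noise level

# Spiked Wigner model: Y = lam * u u^T + W / sqrt(n), with u = x / sqrt(n) a unit
# spike direction and W symmetric Gaussian noise.
x = rng.choice([-1.0, 1.0], size=n)
W = rng.normal(size=(n, n)); W = (W + W.T) / np.sqrt(2)
Y = lam * np.outer(x, x) / n + W / np.sqrt(n)

# Power iteration extracts the top eigenvector, which correlates with the
# spike whenever lam exceeds the spectral (BBP) threshold.
v = rng.normal(size=n)
for _ in range(300):
    v = Y @ v
    v /= np.linalg.norm(v)
overlap = abs(v @ x) / np.sqrt(n)     # |<v, x/||x||>|, near 1 for large lam
print(overlap)
```

With a ±1 (or sparse) prior on the spike this plain spectral approach can be far from the information-theoretic optimum in hard regimes; the abstract's point is that constraining the spike to a generative network's range removes that gap for a gradient descent method.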

In second-order optimization, a potential bottleneck is computing the Hessian matrix of the optimized function at every iteration. Randomized sketching has emerged as a powerful technique for constructing estimates of the Hessian which can be used to perform approximate Newton steps. This involves multiplication by a random sketching matrix, which introduces a trade-off between the computational cost of sketching and the convergence rate of the optimization algorithm. A theoretically desirable but practically much too expensive choice is to use a dense Gaussian sketching matrix, which produces unbiased estimates of the exact Newton step and offers strong problem-independent convergence guarantees. We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching, without substantially affecting its convergence properties. This approach, called NewtonLESS, is based on a recently introduced sketching technique: LEverage Score Sparsified (LESS) embeddings. We prove that NewtonLESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings, not just up to constant factors but even down to lower-order terms, for a large class of optimization tasks. In particular, this leads to a new state-of-the-art convergence result for an iterative least squares solver. Finally, we extend LESS embeddings to …
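The sketched Newton step described above can be illustrated on least squares, where the exact Hessian A^T A is replaced by the much cheaper (SA)^T (SA). The sparse sketch below is a simplified stand-in for a LESS embedding (uniform rather than leverage-score sampling of the nonzeros); all dimensions and sparsity levels are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, m = 2000, 50, 800               # data rows, parameters, sketch size

A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Least squares f(x) = 0.5 * ||Ax - b||^2; the exact Hessian is A^T A.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)

def sparse_sketch(m, n, nnz_per_row):
    """Sparsified sketch (simplified stand-in for LESS): each row has only a
    few scaled +/-1 nonzeros instead of a dense Gaussian row."""
    S = np.zeros((m, n))
    for i in range(m):
        cols = rng.choice(n, size=nnz_per_row, replace=False)
        S[i, cols] = rng.choice([-1.0, 1.0], size=nnz_per_row) * np.sqrt(n / nnz_per_row)
    return S / np.sqrt(m)             # scaled so E[S^T S] = I

# Sketched Newton: solve with the cheap Hessian estimate (SA)^T (SA).
x = np.zeros(d)
for _ in range(15):
    SA = sparse_sketch(m, n, nnz_per_row=8) @ A
    g = A.T @ (A @ x - b)             # exact gradient
    x -= np.linalg.solve(SA.T @ SA, g)
print(np.linalg.norm(x - x_star))     # converges to the least squares solution
```

Each step multiplies by a sparse m-by-n matrix instead of forming A^T A from all n rows, and the iterates still converge to the exact least squares solution because the gradient remains exact.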

This paper focuses on the mutual information and minimum mean-squared error (MMSE) as a function of a matrix-valued signal-to-noise ratio (SNR) for a linear Gaussian channel with arbitrary input distribution. As shown by Lamarca, the mutual information is a concave function of a positive semidefinite matrix, which we call the matrix SNR. This implies that the mapping from the matrix SNR to the MMSE matrix is monotone decreasing. Building upon these functional properties, we begin to construct a unifying framework that provides a bridge between classical information-theoretic inequalities, such as the entropy power inequality, and interpolation techniques used in statistical physics and random matrix theory. This framework provides new insight into the structure of phase transitions in coding theory and compressed sensing. In particular, it is shown that the parallel combination of linear channels with freely independent matrices can be characterized succinctly via free convolution.
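The scalar special case makes the link between mutual information and MMSE concrete: for y = sqrt(s)·x + z with standard Gaussian x and z, I(s) = 0.5·log(1+s) and MMSE(s) = 1/(1+s), and the classical I-MMSE identity dI/ds = 0.5·MMSE(s) can be checked numerically. This is a scalar sketch of the matrix-SNR statement, not the paper's framework:

```python
import numpy as np

# Scalar Gaussian channel y = sqrt(s) * x + z with x, z ~ N(0, 1):
# mutual information I(s) = 0.5 * log(1 + s), MMSE(s) = 1 / (1 + s).
def I(s):
    return 0.5 * np.log1p(s)

def mmse(s):
    return 1.0 / (1.0 + s)

s = np.linspace(0.1, 10, 1000)
dI = np.gradient(I(s), s)                    # numerical derivative of I

print(np.max(np.abs(dI - 0.5 * mmse(s))))    # near 0: the I-MMSE identity
print(np.all(np.diff(mmse(s)) < 0))          # MMSE is decreasing in the SNR
```

The monotone decrease of MMSE in the scalar SNR is what the abstract generalizes: in the matrix-SNR setting, concavity of the mutual information implies the matrix-valued MMSE map is monotone decreasing in the semidefinite order.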