We propose and analyze a new estimator of the covariance matrix that admits strong theoretical guarantees under weak assumptions on the underlying distribution, such as existence of moments of only low order. While estimation of covariance matrices corresponding to sub-Gaussian distributions is well understood, much less is known in the case of heavy-tailed data. As K. Balasubramanian and M. Yuan write, "data from real-world experiments oftentimes tend to be corrupted with outliers and/or exhibit heavy tails. In such cases, it is not clear that those covariance matrix estimators ... remain optimal" and "what are the other possible strategies to deal with …
Robust compressed sensing using generative models
The goal of compressed sensing is to estimate a high-dimensional vector from an underdetermined system of noisy linear equations. In analogy with classical compressed sensing, here we assume a generative model as a prior; that is, we assume the vector is represented by a deep generative model G : R^k → R^n.
Classical recovery approaches such as empirical risk minimization (ERM) are guaranteed to succeed when the measurement matrix is sub-Gaussian. However, when the measurement matrix and measurements are heavy-tailed or have outliers, recovery may fail dramatically. In this paper we propose an algorithm inspired by the Median-of-Means (MOM). Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers. Theoretically, our results show that our novel MOM-based algorithm enjoys the same sample complexity guarantees as ERM under sub-Gaussian assumptions. Our experiments validate both aspects of our claims: other algorithms are indeed fragile and fail under heavy-tailed and/or corrupted data, while our approach exhibits the predicted robustness.
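The Median-of-Means principle behind this robustness can be illustrated in a few lines: instead of averaging squared residuals over all measurements, split them into batches, average within each batch, and take the median of the batch means, so that a few grossly corrupted measurements cannot dominate the objective. This is a minimal sketch of the general MOM idea, not the paper's algorithm; the function name and batch count below are illustrative choices.

```python
import numpy as np

def mom_loss(residuals, num_batches):
    """Median-of-Means of squared residuals: split into batches,
    average within each batch, then take the median across batches."""
    batches = np.array_split(residuals**2, num_batches)
    return np.median([b.mean() for b in batches])

# Toy comparison against the plain mean under heavy tails + outliers.
rng = np.random.default_rng(0)
r = rng.standard_t(df=2, size=1000)   # heavy-tailed residuals
r[:10] += 100.0                       # a few gross outliers
plain = np.mean(r**2)                 # dominated by the outliers
robust = mom_loss(r, num_batches=25)  # outliers confined to few batches
```

Because the corrupted measurements land in only a few batches, the median of the batch means stays close to the scale of the clean residuals while the plain mean is blown up.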
Award ID(s): 1763702
Publication Date:
NSF-PAR ID: 10285502
Journal Name: Advances in Neural Information Processing Systems
Volume: 33
ISSN: 1049-5258
Sponsoring Org: National Science Foundation
More Like this


Generative neural networks have been found empirically to provide very promising structural priors for compressed sensing, since they can be trained to span low-dimensional data manifolds in high-dimensional signal spaces. Despite the nonconvexity of the resulting optimization problem, it has also been shown theoretically that, for neural networks with random Gaussian weights, a signal in the range of the network can be efficiently and approximately recovered from a few noisy measurements. However, a major bottleneck of these theoretical guarantees is a network expansivity condition: each layer of the neural network must be larger than the previous by a logarithmic …
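The recovery problem described above can be sketched concretely: fix a small two-layer ReLU generator G with random Gaussian weights, observe y = A·G(z*), and run gradient descent on the nonconvex objective f(z) = ‖A·G(z) − y‖² over the latent code. All dimensions, step sizes, and the generator architecture below are toy assumptions for illustration, not the construction from any specific theorem.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes: latent k=5, hidden h=20, signal n=40, m=30 measurements.
k, h, n, m = 5, 20, 40, 30
W1 = rng.normal(size=(h, k)) / np.sqrt(k)   # random Gaussian weights
W2 = rng.normal(size=(n, h)) / np.sqrt(h)

def G(z):
    """Two-layer ReLU generator G : R^k -> R^n."""
    return W2 @ np.maximum(W1 @ z, 0.0)

z_true = rng.normal(size=k)
x_true = G(z_true)                           # signal in the range of G
A = rng.normal(size=(m, n)) / np.sqrt(m)     # Gaussian measurement matrix
y = A @ x_true                               # noiseless measurements

def f(z):
    r = A @ G(z) - y
    return r @ r

# Gradient descent on the nonconvex objective f(z) = ||A G(z) - y||^2.
z = rng.normal(size=k)
f_init = f(z)
for _ in range(5000):
    pre = W1 @ z
    r = A @ (W2 @ np.maximum(pre, 0.0)) - y
    # Backprop through the piecewise-linear generator:
    # grad f = W1^T diag(pre > 0) W2^T A^T (2 r)
    g = W1.T @ ((pre > 0) * (W2.T @ (A.T @ (2.0 * r))))
    z -= 0.001 * g
f_final = f(z)
```

Despite the nonconvexity, the objective decreases substantially from its initial value in this toy run; the theoretical results referenced above explain when such descent provably lands near the true signal.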

Abstract: We consider the problem of estimating the covariance structure of a random vector $Y \in \mathbb{R}^d$ from a sample $Y_1, \ldots, Y_n$. We are interested in the situation when $d$ is large compared to $n$ but the covariance matrix $\Sigma$ of interest has (exactly or approximately) low rank. We assume that the given sample is (a) $\epsilon$-adversarially corrupted, meaning that an $\epsilon$ fraction of the observations could have been replaced by arbitrary vectors, or that (b) the sample is i.i.d. but the underlying distribution is heavy-tailed, meaning that the norm of $Y$ possesses only 4 finite moments. We propose an estimator that …
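One simple member of the robust-covariance family this abstract belongs to is the entry-wise median-of-means covariance: split the sample into blocks, form each block's sample covariance, and take the entry-wise median. This is a hedged baseline sketch of the general idea, not the estimator proposed in the abstract (which additionally exploits the low-rank structure of $\Sigma$); the function name and block count are illustrative.

```python
import numpy as np

def mom_covariance(Y, num_blocks):
    """Entry-wise median-of-means covariance: a simple robust baseline.
    Split the n x d sample into blocks of rows, compute each block's
    sample covariance, and take the entry-wise median across blocks."""
    blocks = np.array_split(Y, num_blocks)
    covs = [np.cov(b, rowvar=False, bias=True) for b in blocks]
    return np.median(covs, axis=0)

rng = np.random.default_rng(2)
d, n = 5, 500
Y = rng.normal(size=(n, d))          # clean data: covariance = identity
Y[:5] *= 1000.0                      # a handful of adversarial corruptions
S_plain = np.cov(Y, rowvar=False, bias=True)  # wrecked by corruptions
S_mom = mom_covariance(Y, num_blocks=25)      # corruptions hit few blocks
```

The corrupted rows inflate a few block covariances, but the entry-wise median across blocks remains close to the identity, while the plain sample covariance is off by orders of magnitude.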

Large-scale panel data is ubiquitous in many modern data science applications. Conventional panel data analysis methods fail to address the new challenges arising from the innovation of the data collection platforms on which applications operate, such as individual impacts of covariates, endogeneity, embedded low-dimensional structure, and heavy-tailed errors. In response to these challenges, this paper studies large-scale panel data with an interactive effects model. This model takes into account the individual impacts of covariates on each spatial node and removes the exogeneity condition by allowing latent factors to affect both covariates and errors. Besides, we waive the sub-Gaussian assumption and allow the …

The gradient noise (GN) in the stochastic gradient descent (SGD) algorithm is often considered to be Gaussian in the large-data regime, by assuming that the classical central limit theorem (CLT) kicks in. This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion. We argue that the Gaussianity assumption might fail to hold in deep learning settings and hence render the Brownian-motion-based analyses inappropriate. Inspired by non-Gaussian natural phenomena, we consider the GN in a more general context and invoke the generalized CLT …
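The distinction between the classical and generalized CLT can be seen in a toy simulation: averages of light-tailed terms become nearly Gaussian, while averages of infinite-variance terms (here Pareto draws with tail index 1.5, standing in for heavy-tailed per-sample gradients) converge to a heavy-tailed alpha-stable limit instead. This is an illustrative sketch, not the paper's experimental setup; the tail diagnostic below is a deliberately crude one.

```python
import numpy as np

rng = np.random.default_rng(3)

def batch_means(draw, batch, reps):
    """Mini-batch 'gradient noise' proxy: averages of i.i.d. terms."""
    return np.array([draw(batch).mean() for _ in range(reps)])

# Light-tailed terms: classical CLT -> near-Gaussian batch means.
light = batch_means(lambda b: rng.normal(size=b), 256, 2000)
# Infinite-variance terms (tail index 1.5 < 2): generalized CLT ->
# alpha-stable, heavy-tailed batch means; Gaussianity never kicks in.
heavy = batch_means(lambda b: rng.pareto(1.5, size=b), 256, 2000)

# Crude tail diagnostic: largest deviation in units of the sample std.
light_ratio = np.abs(light - light.mean()).max() / light.std()
heavy_ratio = np.abs(heavy - heavy.mean()).max() / heavy.std()
```

For the Gaussian case the largest of 2000 standardized deviations stays around 3–4, while the heavy-tailed batch means produce far larger extreme deviations, which is exactly why a Brownian-motion SDE model of the noise can be inappropriate.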