Abstract: We consider the problem of estimating the covariance structure of a random vector $Y\in \mathbb R^d$ from a sample $Y_1,\ldots,Y_n$. We are interested in the situation when d is large compared to n but the covariance matrix $\Sigma$ of interest has (exactly or approximately) low rank. We assume that the given sample is (a) $\epsilon$adversarially corrupted, meaning that $\epsilon$ fraction of the observations could have been replaced by arbitrary vectors, or that (b) the sample is i.i.d. but the underlying distribution is heavytailed, meaning that the norm of Y possesses only 4 finite moments. We propose an estimator that is adaptive to the potential lowrank structure of the covariance matrix as well as to the proportion of contaminated data, and admits tight deviation guarantees despite rather weak assumptions on the underlying distribution. Finally, we discuss the algorithms that allow to approximate the proposed estimator in a numerically efficient way.
Robust Modifications of Ustatistics and Applications to Covariance Estimation Problems
Let Y be a ddimensional random vector with unknown mean μ and covariance matrix Σ. This paper is motivated by the problem of designing an estimator of Σ that admits exponential deviation bounds in the operator norm under minimal assumptions on the underlying distribution, such as existence of only 4th moments of the coordinates of Y. To address this problem, we propose robust modifications of the operatorvalued Ustatistics, obtain nonasymptotic guarantees for their performance, and demonstrate the implications of these results to the covariance estimation problem under various structural assumptions.
 Award ID(s):
 1712956
 Publication Date:
 NSFPAR ID:
 10096967
 Journal Name:
 Preprint
 ISSN:
 16511794
 Sponsoring Org:
 National Science Foundation
More Like this


We develop online graph learning algorithms from streaming network data. Our goal is to track the (possibly) timevarying network topology, and affect memory and computational savings by processing the data onthefly as they are acquired. The setup entails observations modeled as stationary graph signals generated by local diffusion dynamics on the unknown network. Moreover, we may have a priori information on the presence or absence of a few edges as in the link prediction problem. The stationarity assumption implies that the observations’ covariance matrix and the socalled graph shift operator (GSO—a matrix encoding the graph topology) commute under mild requirements. This motivates formulating the topology inference task as an inverse problem, whereby one searches for a sparse GSO that is structurally admissible and approximately commutes with the observations’ empirical covariance matrix. For streaming data, said covariance can be updated recursively, and we show online proximal gradient iterations can be brought to bear to efficiently track the timevarying solution of the inverse problem with quantifiable guarantees. Specifically, we derive conditions under which the GSO recovery cost is strongly convex and use this property to prove that the online algorithm converges to within a neighborhood of the optimal timevarying batch solution. Numericalmore »

We leverage proximal gradient iterations to develop an online graph learning algorithm from streaming network data. Our goal is to track the (possibly) timevarying network topology, and effect memory and computational savings by processing the data onthefly as they are acquired. The setup entails observations modeled as stationary graph signals generated by local diffusion dynamics on the unknown network. Moreover, we may have a priori information on the presence or absence of a few edges as in the link prediction problem. The stationarity assumption implies that the observations' covariance matrix and the socalled graph shift operator (GSO  a matrix encoding the graph topology) commute under mild requirements. This motivates formulating the topology inference task as an inverse problem, whereby one searches for a (e.g., sparse) GSO that is structurally admissible and approximately commutes with the observations' empirical covariance matrix. For streaming data said covariance can be updated recursively, and we show online proximal gradient iterations can be brought to bear to efficiently track the timevarying solution of the inverse problem with quantifiable guarantees. Specifically, we derive conditions under which the GSO recovery cost is strongly convex and use this property to prove that the online algorithm converges to withinmore »

Distributionally robust optimization (DRO) is a powerful tool for decision making under uncertainty. It is particularly appealing because of its ability to leverage existing data. However, many practical problems call for decision making with some auxiliary information, and DRO in the context of conditional distributions is not straightforward. We propose a conditional kernel distributionally robust optimiza tion (CKDRO) method that enables robust decision making under conditional distributions through kernel DRO and the conditional mean operator in the reproducing kernel Hilbert space (RKHS). In particular, we consider problems where there is a correlation between the unknown variable y and an auxiliary observable variable x. Given past data of the two variables and a queried auxiliary variable, CKDRO represents the conditional distribution P(yx) as the conditional mean operator in the RKHS space and quantifies the ambiguity set in the RKHS as well, which depends on the size of the dataset as well as the query point. To justify the use of RKHS, we demonstrate that the ambiguity set defined in RKHS can be viewed as a ball under a metric that is similar to the Wasserstein metric. The DRO is then dualized and solved via a finite dimensional convex program. The proposedmore »

We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy the parameters of a multivariate normal distribution from truncated samples. Truncated samples from a dvariate normal N(μ,Σ) means a samples is only revealed if it falls in some subset S⊆Rd; otherwise the samples are hidden and their count in proportion to the revealed samples is also hidden. We show that the mean μ and covariance matrix Σ can be estimated with arbitrary accuracy in polynomialtime, as long as we have oracle access to S, and S has nontrivial measure under the unknown dvariate normal distribution. Additionally we show that without oracle access to S, any nontrivial estimation is impossible.