Title: Robust and efficient mean estimation: an approach based on the properties of self-normalized sums
Let X be a random variable with unknown mean and finite variance. We present a new estimator of the mean of X that is robust with respect to the possible presence of outliers in the sample, provides tight sub-Gaussian deviation guarantees without any additional assumptions on the shape or tails of the distribution, and moreover is asymptotically efficient. This is the first estimator that provably combines all these qualities in one package. Our construction is inspired by the robustness properties of self-normalized sums. Theoretical findings are supplemented by numerical simulations highlighting the strong performance of the proposed estimator in comparison with previously known techniques.
Award ID(s):
1712956 1908905
PAR ID:
10204545
Author(s) / Creator(s):
;
Publisher / Repository:
Institute of Mathematical Statistics and Bernoulli Society
Date Published:
Journal Name:
Electronic Journal of Statistics
Volume:
15
Issue:
2
ISSN:
1935-7524
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the statistical connection between the quantized representation of a high dimensional signal X using a random spherical code and the observation of X under an additive white Gaussian noise (AWGN). We show that given X, the conditional Wasserstein distance between its bitrate-R quantized version and its observation under AWGN of signal-to-noise ratio 2^{2R} - 1 is sub-linear in the problem dimension. We then utilize this fact to connect the mean squared error (MSE) attained by an estimator based on an AWGN-corrupted version of X to the MSE attained by the same estimator when fed with its bitrate-R quantized version.
  2. We present a sample- and time-efficient differentially private algorithm for ordinary least squares, with error that depends linearly on the dimension and is independent of the condition number of X⊤X, where X is the design matrix. All prior private algorithms for this task require either d^{3/2} examples, error growing polynomially with the condition number, or exponential time. Our near-optimal accuracy guarantee holds for any dataset with bounded statistical leverage and bounded residuals. Technically, we build on the approach of Brown et al. (2023) for private mean estimation, adding scaled noise to a carefully designed stable nonprivate estimator of the empirical regression vector.
  3. In this paper, we propose differentially private algorithms for robust (multivariate) mean estimation and inference under heavy-tailed distributions, with a focus on Gaussian differential privacy. First, we provide a comprehensive analysis of the Huber mean estimator with increasing dimensions, including a non-asymptotic deviation bound, a Bahadur representation, and (uniform) Gaussian approximations. Second, we privatize the Huber mean estimator via noisy gradient descent, which is proven to achieve near-optimal statistical guarantees. The key is to characterize quantitatively the trade-off between statistical accuracy, degree of robustness, and privacy level, governed by a carefully chosen robustification parameter. Finally, we construct private confidence intervals for the proposed estimator by incorporating a private and robust covariance estimator. Our findings are demonstrated by simulation studies.
  4. This paper provides a general derivative identity for the conditional mean estimator of an arbitrary vector signal in Gaussian noise with an arbitrary covariance matrix. This new identity is used to recover and generalize many known identities in the literature and derive some new identities. For example, a new identity is discovered, which shows that an arbitrary higher-order conditional moment is completely determined by the first conditional moment. Several applications of the identities are shown. For instance, by using one of the identities, a simple proof of the uniqueness of the conditional mean estimator as a function of the distribution of the signal is shown. Moreover, one of the identities is used to extend the notion of empirical Bayes to higher-order conditional moments. Specifically, based on a random sample of noisy observations, a consistent estimator for a conditional expectation of any order is derived.
  5. This article studies estimation of a stationary autocovariance structure in the presence of an unknown number of mean shifts. Here, a Yule–Walker moment estimator for the autoregressive parameters in a dependent time series contaminated by mean shift changepoints is proposed and studied. The estimator is based on first-order differences of the series and is proven consistent and asymptotically normal when the number of changepoints m and the series length N satisfy m/N → 0 as N → ∞.