Beyond Catoni: Sharper Rates for Heavy-Tailed and Robust Mean Estimation

Gupta, Shivam; Hopkins, Samuel B; Price, Eric

Citation Details

We study the fundamental problem of estimating the mean of a d-dimensional distribution with covariance Σ≼σ2Id given n samples. When d=1, \cite{catoni} showed an estimator with error (1+o(1))⋅σ2log1δn−−−−−√, with probability 1−δ, matching the Gaussian error rate. For d>1, a natural estimator outputs the center of the minimum enclosing ball of one-dimensional confidence intervals to achieve a 1−δ confidence radius of 2dd+1−−−√⋅σ(dn−−√+2log1δn−−−−−√), incurring a 2dd+1−−−√-factor loss over the Gaussian rate. When the dn−−√ term dominates by a log1δ−−−−√ factor, \cite{lee2022optimal-highdim} showed an improved estimator matching the Gaussian rate. This raises a natural question: Is the 2dd+1−−−√ loss \emph{necessary} when the 2log1δn−−−−−√ term dominates? We show that the answer is \emph{no} -- we construct an estimator that improves over the above naive estimator by a constant factor. We also consider robust estimation, where an adversary is allowed to corrupt an ϵ-fraction of samples arbitrarily: in this case, we show that the above strategy of combining one-dimensional estimates and incurring the 2dd+1−−−√-factor \emph{is} optimal in the infinite-sample limit. more »

Award ID(s):: 2238080

PAR ID:: 10568949

Author(s) / Creator(s):: Gupta, Shivam; Hopkins, Samuel B; Price, Eric

Publisher / Repository:: Conference on Learning Theory. The 37th Annual Conference on Learning Theory (COLT 2024)

Date Published:: 2024-06-30

Format(s):: Medium: X

Location:: Edmonton, Canada

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
The DOI is not currently available.

More Like this