Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

We consider the question of Gaussian mean testing, a fundamental task in highdimensional distribution testing and signal processing, subject to adversarial corruptions of the samples. We focus on the relative power of different adversaries, and show that, in contrast to the common wisdom in robust statistics, there exists a strict separation between adaptive adversaries (strong contamination) and oblivious ones (weak contamination) for this task. Specifically, we resolve both the informationtheoretic and computational landscapes for robust mean testing. In the exponentialtime setting, we establish the tight sample complexity of testing N(0,I) against N(αv,I), where ∥v∥2=1, with an εfraction of adversarial corruptions, to be Θ~(max(d−−√α2,dε3α4,min(d2/3ε2/3α8/3,dεα2))), while the complexity against adaptive adversaries is Θ~(max(d−−√α2,dε2α4)), which is strictly worse for a large range of vanishing ε,α. To the best of our knowledge, ours is the first separation in sample complexity between the strong and weak contamination models. In the polynomialtime setting, we close a gap in the literature by providing a polynomialtime algorithm against adaptive adversaries achieving the above sample complexity Θ~(max(d−−√/α2,dε2/α4)), and a lowdegree lower bound (which complements an existing reduction from planted clique) suggesting that all efficient algorithms require this many samples, even in the obliviousadversary setting.more » « lessFree, publiclyaccessible full text available November 9, 2024

We consider the question of Gaussian mean testing, a fundamental task in highdimensional distribution testing and signal processing, subject to adversarial corruptions of the samples. We focus on the relative power of different adversaries, and show that, in contrast to the common wisdom in robust statistics, there exists a strict separation between adaptive adversaries (strong contamination) and oblivious ones (weak contamination) for this task. Specifically, we resolve both the informationtheoretic and computational landscapes for robust mean testing. In the exponentialtime setting, we establish the tight sample complexity of testing N(0,I) against N(αv,I), where ∥v∥2=1, with an εfraction of adversarial corruptions, to be Θ~(max(d√α2,dε3α4,min(d2/3ε2/3α8/3,dεα2))) while the complexity against adaptive adversaries is Θ~(max(d√α2,dε2α4)) which is strictly worse for a large range of vanishing ε,α. To the best of our knowledge, ours is the first separation in sample complexity between the strong and weak contamination models. In the polynomialtime setting, we close a gap in the literature by providing a polynomialtime algorithm against adaptive adversaries achieving the above sample complexity Θ~(max(d−−√/α2,dε2/α4)), and a lowdegree lower bound (which complements an existing reduction from planted clique) suggesting that all efficient algorithms require this many samples, even in the obliviousadversary setting.more » « lessFree, publiclyaccessible full text available November 1, 2024

We present a fast, differentially private algorithm for highdimensional covarianceaware mean estimation with nearly optimal sample complexity. Only exponentialtime estimators were previously known to achieve this guarantee. Given n samples from a (sub)Gaussian distribution with unknown mean μ and covariance Σ, our (ε,δ)differentially private estimator produces μ~ such that ∥μ−μ~∥Σ≤α as long as n≳dα2+dlog1/δ√αε+dlog1/δε. The Mahalanobis error metric ∥μ−μ^∥Σ measures the distance between μ^ and μ relative to Σ; it characterizes the error of the sample mean. Our algorithm runs in time O~(ndω−1+nd/ε), where ω<2.38 is the matrix multiplication exponent. We adapt an exponentialtime approach of Brown, Gaboardi, Smith, Ullman, and Zakynthinou (2021), giving efficient variants of stable mean and covariance estimation subroutines that also improve the sample complexity to the nearly optimal bound above. Our stable covariance estimator can be turned to private covariance estimation for unrestricted subgaussian distributions. With n≳d3/2 samples, our estimate is accurate in spectral norm. This is the first such algorithm using n=o(d2) samples, answering an open question posed by Alabi et al. (2022). With n≳d2 samples, our estimate is accurate in Frobenius norm. This leads to a fast, nearly optimal algorithm for private learning of unrestricted Gaussian distributions in TV distance. Duchi, Haque, and Kuditipudi (2023) obtained similar results independently and concurrently.more » « lessFree, publiclyaccessible full text available July 15, 2024

We present a fast, differentially private algorithm for highdimensional covarianceaware mean estimation with nearly optimal sample complexity. Only exponentialtime estimators were previously known to achieve this guarantee. Given n samples from a (sub)Gaussian distribution with unknown mean μ and covariance Σ, our (ϵ,δ)differentially private estimator produces μ~ such that ∥μ−μ~∥Σ≤α as long as n≳dα2+dlog1/δ√αϵ+dlog1/δϵ. The Mahalanobis error metric ∥μ−μ^∥Σ measures the distance between μ^ and μ relative to Σ; it characterizes the error of the sample mean. Our algorithm runs in time O~(ndω−1+nd/\eps), where ω<2.38 is the matrix multiplication exponent.We adapt an exponentialtime approach of Brown, Gaboardi, Smith, Ullman, and Zakynthinou (2021), giving efficient variants of stable mean and covariance estimation subroutines that also improve the sample complexity to the nearly optimal bound above.Our stable covariance estimator can be turned to private covariance estimation for unrestricted subgaussian distributions. With n≳d3/2 samples, our estimate is accurate in spectral norm. This is the first such algorithm using n=o(d2) samples, answering an open question posed by Alabi et al. (2022). With n≳d2 samples, our estimate is accurate in Frobenius norm. This leads to a fast, nearly optimal algorithm for private learning of unrestricted Gaussian distributions in TV distance.Duchi, Haque, and Kuditipudi (2023) obtained similar results independently and concurrently.more » « lessFree, publiclyaccessible full text available July 12, 2024

We study the relationship between adversarial robustness and differential privacy in highdimensional algorithmic statistics. We give the first blackbox reduction from privacy to robustness which can produce private estimators with optimal tradeoffs among sample complexity, accuracy, and privacy for a wide range of fundamental highdimensional parameter estimation problems, including mean and covariance estimation. We show that this reduction can be implemented in polynomial time in some important special cases. In particular, using nearlyoptimal polynomialtime robust estimators for the mean and covariance of highdimensional Gaussians which are based on the SumofSquares method, we design the first polynomialtime private estimators for these problems with nearlyoptimal samplesaccuracyprivacy tradeoffs. Our algorithms are also robust to a constant fraction of adversariallycorrupted samples.more » « lessFree, publiclyaccessible full text available June 23, 2024

We study the relationship between adversarial robustness and differential privacy in highdimensional algorithmic statistics. We give the first blackbox reduction from privacy to robustness which can produce private estimators with optimal tradeoffs among sample complexity, accuracy, and privacy for a wide range of fundamental highdimensional parameter estimation problems, including mean and covariance estimation. We show that this reduction can be implemented in polynomial time in some important special cases. In particular, using nearlyoptimal polynomialtime robust estimators for the mean and covariance of highdimensional Gaussians which are based on the SumofSquares method, we design the first polynomialtime private estimators for these problems with nearlyoptimal samplesaccuracyprivacy tradeoffs. Our algorithms are also robust to a nearly optimal fraction of adversariallycorrupted samples.more » « less