Practical and pervasive needs for robustness and privacy have inspired the design of online adversarial and differentially private learning algorithms. The primary quantity that characterizes learnability in these settings is the Littlestone dimension of the hypothesis class [Alon et al., 2019, Ben-David et al., 2009]. This characterization is often interpreted as an impossibility result, because classes such as linear thresholds and neural networks have infinite Littlestone dimension. In this paper, we apply the framework of smoothed analysis [Spielman and Teng, 2004], in which adversarially chosen inputs are perturbed slightly by nature. We show that fundamentally stronger regret and error guarantees are possible with smoothed adversaries than with worst-case adversaries. In particular, we obtain regret and privacy-error bounds that depend only on the VC dimension and the bracketing number of a hypothesis class, and on the magnitudes of the perturbations.
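The smoothed-adversary setup can be illustrated with a toy simulation. In the sketch below, nature perturbs each adversarially chosen point with uniform noise, and a simple halving-style learner predicts one-dimensional thresholds; both the perturbation model and the learner are illustrative stand-ins, not the paper's algorithms.

```python
import random

def smoothed_stream(adversary_points, sigma, rng):
    """Nature perturbs each adversarially chosen point by
    uniform noise on [-sigma, sigma] (the smoothing step)."""
    return [x + rng.uniform(-sigma, sigma) for x in adversary_points]

def run_threshold_learner(stream, true_threshold):
    """Halving-style online learner for thresholds h_t(x) = 1[x >= t]
    on [0, 1]: predict with the midpoint of the version space."""
    lo, hi = 0.0, 1.0          # current version space for the threshold t
    mistakes = 0
    for x in stream:
        guess = 1 if x >= (lo + hi) / 2 else 0
        label = 1 if x >= true_threshold else 0
        if guess != label:
            mistakes += 1
        # shrink the version space consistently with the observed label
        if label == 1:
            hi = min(hi, x)    # label 1 means t <= x
        else:
            lo = max(lo, x)    # label 0 means t > x
    return mistakes, (lo, hi)

rng = random.Random(0)
adversary = [0.5] * 200        # worst-case-style points at the boundary
stream = smoothed_stream(adversary, sigma=0.1, rng=rng)
mistakes, version_space = run_threshold_learner(stream, true_threshold=0.5)
print(mistakes, version_space)
```

Against an unperturbed worst-case stream, an adversary can force a mistake on every round for this class; smoothing spreads the points out, so the version space shrinks and the mistake count stays small.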
An Equivalence Between Private Classification and Online Prediction
We prove that every concept class with finite Littlestone dimension can be learned by an (approximate) differentially private algorithm. This answers an open question of Alon et al. (STOC 2019), who proved the converse statement (the question was also asked by Neel et al. (FOCS 2019)). Together, these two results yield an equivalence between online learnability and private PAC learnability.
We introduce a new notion of algorithmic stability called "global stability," which is essential to our proof and may be of independent interest. We also discuss an application of our results to boosting the privacy and accuracy parameters of differentially private learners.
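The flavor of the global-stability idea can be conveyed with a toy sketch: if some hypothesis is output with noticeable probability over the random choice of the sample, then privately selecting a frequently occurring output across independent sub-samples yields a differentially private learner. The toy "learner" below just rounds a sample mean to a coarse grid, and the selection step is a standard report-noisy-max with Laplace noise; none of this is the paper's construction.

```python
import random
from collections import Counter

def toy_learner(sample):
    """Round the sample mean to a coarse grid, so nearby samples often
    produce the *same* hypothesis (a caricature of global stability)."""
    mean = sum(sample) / len(sample)
    return round(mean * 4) / 4          # grid of step 0.25

def private_mode(outputs, epsilon, rng):
    """Report-noisy-max over output counts. The difference of two
    Exp(epsilon/2) variables is Laplace noise with scale 2/epsilon."""
    counts = Counter(outputs)
    noisy = {h: c + rng.expovariate(epsilon / 2) - rng.expovariate(epsilon / 2)
             for h, c in counts.items()}
    return max(noisy, key=noisy.get)

rng = random.Random(1)
data = [rng.gauss(0.5, 0.05) for _ in range(10_000)]
# run the learner on many disjoint sub-samples of the data
outputs = [toy_learner(data[i:i + 100]) for i in range(0, 10_000, 100)]
print(private_mode(outputs, epsilon=1.0, rng=rng))
```

Because the grid rounding makes almost every sub-sample produce the same hypothesis, the noisy selection returns that hypothesis with overwhelming probability while remaining differentially private with respect to individual data points.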
Award ID(s): 1947889
NSF-PAR ID: 10205755
Journal Name: 61st Annual IEEE Symposium on Foundations of Computer Science
Sponsoring Org: National Science Foundation


We present a fast, differentially private algorithm for high-dimensional covariance-aware mean estimation with nearly optimal sample complexity. Only exponential-time estimators were previously known to achieve this guarantee. Given n samples from a (sub-)Gaussian distribution with unknown mean μ and covariance Σ, our (ε, δ)-differentially private estimator produces μ̃ such that ∥μ − μ̃∥_Σ ≤ α as long as n ≳ d/α² + d√(log(1/δ))/(αε) + d·log(1/δ)/ε. The Mahalanobis error metric ∥μ − μ̂∥_Σ measures the distance between μ̂ and μ relative to Σ; it characterizes the error of the sample mean. Our algorithm runs in time Õ(nd^(ω−1) + nd/ε), where ω < 2.38 is the matrix multiplication exponent. We adapt an exponential-time approach of Brown, Gaboardi, Smith, Ullman, and Zakynthinou (2021), giving efficient variants of stable mean and covariance estimation subroutines that also improve the sample complexity to the nearly optimal bound above. Our stable covariance estimator can be turned into a private covariance estimator for unrestricted sub-Gaussian distributions. With n ≳ d^(3/2) samples, our estimate is accurate in spectral norm. This is the first such algorithm using n = o(d²) samples, answering an open question posed by Alabi et al. (2022). With n ≳ d² samples, our estimate is accurate in Frobenius norm. This leads to a fast, nearly optimal algorithm for privately learning unrestricted Gaussian distributions in TV distance. Duchi, Haque, and Kuditipudi (2023) obtained similar results independently and concurrently.
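For contrast with the covariance-aware guarantee above, the baseline approach is instructive: clip each point to a fixed radius and release the mean via the Gaussian mechanism. The sketch below is only this baseline, measured in the Euclidean norm rather than the Mahalanobis norm the paper targets, and all parameter choices are illustrative.

```python
import math
import random

def clip(x, radius):
    """Project x onto the Euclidean ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return x
    return [v * radius / norm for v in x]

def private_mean(data, radius, eps, delta, rng):
    """(eps, delta)-DP mean via clipping + the Gaussian mechanism."""
    n = len(data)
    clipped = [clip(x, radius) for x in data]
    mean = [sum(col) / n for col in zip(*clipped)]
    # l2-sensitivity of the clipped mean is 2 * radius / n;
    # sigma uses the classical Gaussian-mechanism calibration
    sigma = (2 * radius / n) * math.sqrt(2 * math.log(1.25 / delta)) / eps
    return [m + rng.gauss(0, sigma) for m in mean]

rng = random.Random(0)
data = [[rng.gauss(1.0, 0.1) for _ in range(3)] for _ in range(5000)]
est = private_mean(data, radius=5.0, eps=1.0, delta=1e-6, rng=rng)
print(est)
```

The weakness this baseline exposes is exactly what motivates covariance-aware estimation: the fixed clipping radius ignores the geometry of Σ, so its error can be far from the sample mean's error when the covariance is ill-conditioned.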


Perhaps the single most important use case for differential privacy is privately answering numerical queries, which is usually achieved by adding noise to the answer vector. The central question, therefore, is which noise distribution optimizes the privacy-accuracy trade-off, especially when the dimension of the answer vector is high. Accordingly, an extensive literature has been dedicated to this question, and the upper and lower bounds have been matched up to constant factors (Bun et al., 2018; Steinke & Ullman, 2017). In this paper, we take a novel approach to this optimality question. We first demonstrate an intriguing central limit theorem phenomenon in the high-dimensional regime: we prove that a mechanism is approximately Gaussian differentially private (Dong et al., 2021) if the added noise satisfies certain conditions. In particular, densities proportional to exp(−∥x∥_p^α), where ∥x∥_p is the standard ℓ_p norm, satisfy these conditions. Taking this perspective, we use the Cramér-Rao inequality to show an "uncertainty principle"-style result: the product of the privacy parameter and the ℓ_2 loss of the mechanism is lower bounded by the dimension. Furthermore, the Gaussian mechanism achieves the constant-sharp optimal privacy-accuracy trade-off among all such noises. Our findings are corroborated by numerical experiments.
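The dimension dependence in the lower bound can be made concrete with the classical Gaussian mechanism. For a d-dimensional query of ℓ_2-sensitivity 1, sigma is commonly calibrated as √(2 ln(1.25/δ))/ε, giving expected ℓ_2 loss d·σ², which grows linearly in d in line with the "uncertainty principle" above. The sketch below is this standard calibration only, not the paper's Gaussian-DP analysis.

```python
import math

def gaussian_sigma(eps, delta, sensitivity=1.0):
    """Classical Gaussian-mechanism noise scale for (eps, delta)-DP."""
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / eps

def expected_l2_loss(d, eps, delta):
    """Expected squared l2 error of answering a d-dimensional query:
    each coordinate contributes sigma**2 of variance."""
    return d * gaussian_sigma(eps, delta) ** 2

for d in (10, 100, 1000):
    print(d, expected_l2_loss(d, eps=1.0, delta=1e-6))
```

Doubling the dimension doubles the expected loss at fixed (ε, δ), which is the linear-in-d behavior that the lower bound shows is unavoidable for this family of noise distributions.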