 Award ID(s):
 1951384
 NSFPAR ID:
 10491119
 Publisher / Repository:
 NeurIPS
 Date Published:
 Journal Name:
 Thirtyseventh Annual Conference on Neural Information Processing Systems (NeurIPS)
 Format(s):
 Medium: X
 Location:
 New Orleans, LA
 Sponsoring Org:
 National Science Foundation
More Like this

We study the problem of locally private mean estimation of highdimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur suboptimal error or have high communication and/or runtime complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a 1 + o(1)factor. Our framework is deceptively simple: each randomizer projects its input to a random lowdimensional subspace, normalizes the result, and then runs an optimal algorithm such as PrivUnitG in the lowerdimensional space. In addition, we show that, by appropriately correlating the random projection matrices across devices, we can achieve fast server runtime. We mathematically analyze the error of the algorithm in terms of properties of the random projections, and study two instantiations. Lastly, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically obtain nearly the same utility as optimal ones while having significantly lower communication and computational cost.more » « less

We present a fast, differentially private algorithm for highdimensional covarianceaware mean estimation with nearly optimal sample complexity. Only exponentialtime estimators were previously known to achieve this guarantee. Given n samples from a (sub)Gaussian distribution with unknown mean μ and covariance Σ, our (ε,δ)differentially private estimator produces μ~ such that ∥μ−μ~∥Σ≤α as long as n≳dα2+dlog1/δ√αε+dlog1/δε. The Mahalanobis error metric ∥μ−μ^∥Σ measures the distance between μ^ and μ relative to Σ; it characterizes the error of the sample mean. Our algorithm runs in time O~(ndω−1+nd/ε), where ω<2.38 is the matrix multiplication exponent. We adapt an exponentialtime approach of Brown, Gaboardi, Smith, Ullman, and Zakynthinou (2021), giving efficient variants of stable mean and covariance estimation subroutines that also improve the sample complexity to the nearly optimal bound above. Our stable covariance estimator can be turned to private covariance estimation for unrestricted subgaussian distributions. With n≳d3/2 samples, our estimate is accurate in spectral norm. This is the first such algorithm using n=o(d2) samples, answering an open question posed by Alabi et al. (2022). With n≳d2 samples, our estimate is accurate in Frobenius norm. This leads to a fast, nearly optimal algorithm for private learning of unrestricted Gaussian distributions in TV distance. Duchi, Haque, and Kuditipudi (2023) obtained similar results independently and concurrently.more » « less

We present a fast, differentially private algorithm for highdimensional covarianceaware mean estimation with nearly optimal sample complexity. Only exponentialtime estimators were previously known to achieve this guarantee. Given n samples from a (sub)Gaussian distribution with unknown mean μ and covariance Σ, our (ϵ,δ)differentially private estimator produces μ~ such that ∥μ−μ~∥Σ≤α as long as n≳dα2+dlog1/δ√αϵ+dlog1/δϵ. The Mahalanobis error metric ∥μ−μ^∥Σ measures the distance between μ^ and μ relative to Σ; it characterizes the error of the sample mean. Our algorithm runs in time O~(ndω−1+nd/\eps), where ω<2.38 is the matrix multiplication exponent.We adapt an exponentialtime approach of Brown, Gaboardi, Smith, Ullman, and Zakynthinou (2021), giving efficient variants of stable mean and covariance estimation subroutines that also improve the sample complexity to the nearly optimal bound above.Our stable covariance estimator can be turned to private covariance estimation for unrestricted subgaussian distributions. With n≳d3/2 samples, our estimate is accurate in spectral norm. This is the first such algorithm using n=o(d2) samples, answering an open question posed by Alabi et al. (2022). With n≳d2 samples, our estimate is accurate in Frobenius norm. This leads to a fast, nearly optimal algorithm for private learning of unrestricted Gaussian distributions in TV distance.Duchi, Haque, and Kuditipudi (2023) obtained similar results independently and concurrently.more » « less

We study the fundamental problem of highdimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial time algorithms for this problem with dimensionindependent error guarantees for several families of structured distributions. In this work, we give the first nearlylinear time algorithms for highdimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and subgaussian tails, and (ii) unknown bounded covariance. Given N samples on R^d, an \epsfraction of which may be arbitrarily corrupted, our algorithms run in time eO(Nd)/poly(\eps) and approximate the true mean within the informationtheoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times \Omega(Nd^2), for \eps= O(1) Our algorithms rely on a natural family of SDPs parameterized by our current guess ν for the unknown mean μ. We give a winwin analysis establishing the following: either a nearoptimal solution to the primal SDP yields a good candidate for μ — independent of our current guess ν — or a nearoptimal solution to the dual SDP yields a new guess ν0 whose distance from μ is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearlylinear time. Our approach is quite general, and we believe it can also be applied to obtain nearlylinear time algorithms for other highdimensional robust learning problems.more » « less

In this paper, we study a sampling problem where a source takes samples from a Wiener process and transmits them through a wireless channel to a remote estimator. Due to channel fading, interference, and potential collisions, the packet transmissions are unreliable and could take random time durations. Our objective is to devise an optimal causal sampling policy that minimizes the longterm average mean square estimation error. This optimal sampling problem is a recursive optimal stopping problem, which is generally quite difficult to solve. However, we prove that the optimal sampling strategy is, in fact, a simple threshold policy where a new sample is taken whenever the instantaneous estimation error exceeds a threshold. This threshold remains a constant value that does not vary over time. By exploring the structure properties of the recursive optimal stopping problem, a lowcomplexity iterative algorithm is developed to compute the optimal threshold. This work generalizes previous research by incorporating both transmission errors and random transmission times into remote estimation. Numerical simulations are provided to compare our optimal policy with the zerowait and ageoptimal policies.