We study the problem of locally private mean estimation of highdimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur suboptimal error or have high communication and/or runtime complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a 1+o(1)factor. Our framework is deceptively simple: each randomizer projects its input to a random lowdimensional subspace, normalizes the result, and then runs an optimal algorithm such as PrivUnitG in the lowerdimensional space. In addition, we show that, by appropriately correlating the random projection matrices across devices, we can achieve fast server runtime. We mathematically analyze the error of the algorithm in terms of properties of the random projections, and study two instantiations. Lastly, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically obtain nearly the same utility as optimal ones while having significantly lower communication and computational cost.
more »
« less
Private Frequency Estimation via Projective Geometry
In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For universe size of k and with n users, our epsLDP algorithm has communication cost ceil(log_2 k) and computation cost O(n + k\exp(eps) log k) for the server to approximately reconstruct the frequency histogram, while achieve optimal privacyutility tradeoff. In many practical settings this is a significant improvement over the O (n+k^2) computation cost that is achieved by the recent PIRAPPOR algorithm (Feldman and Talwar; 2021). Our empirical evaluation shows a speedup of over 50x over PIRAPPOR while using approximately 75x less memory. In addition, the running time of our algorithm is comparable to that of HadamardResponse (Acharya, Sun, and Zhang; 2019) and RecursiveHadamardResponse (Chen, Kairouz, and Ozgur; 2020) which have significantly worse reconstruction error. The error of our algorithm essentially matches that of the communication and timeinefficient but utilityoptimal SubsetSelection (SS) algorithm (Ye and Barg; 2017). Our new algorithm is based on using Projective Planes over a finite field to define a small collection of sets that are close to being pairwise independent and a dynamic programming algorithm for approximate histogram reconstruction for the server.
more »
« less
 Award ID(s):
 1909314
 NSFPAR ID:
 10385077
 Date Published:
 Journal Name:
 Proceedings of Machine Learning Research
 Volume:
 162
 ISSN:
 26403498
 Page Range / eLocation ID:
 64186433
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


We study the problem of locally private mean estimation of highdimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur suboptimal error or have high communication and/or runtime complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a 1 + o(1)factor. Our framework is deceptively simple: each randomizer projects its input to a random lowdimensional subspace, normalizes the result, and then runs an optimal algorithm such as PrivUnitG in the lowerdimensional space. In addition, we show that, by appropriately correlating the random projection matrices across devices, we can achieve fast server runtime. We mathematically analyze the error of the algorithm in terms of properties of the random projections, and study two instantiations. Lastly, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically obtain nearly the same utility as optimal ones while having significantly lower communication and computational cost.more » « less

This paper studies Byzantine reliable broadcast (BRB) under asynchronous networks, and improves the stateoftheart protocols from the following aspects. Nearoptimal communication cost: We propose two new BRB protocols for n nodes and input message M that has communication cost O(nM +n^2 log n), which is nearoptimal due to the lower bound of Ω(nM +n^2). The first BRB protocol assumes threshold signature but is easy to understand, while the second BRB protocol is errorfree but less intuitive. Improved computation: We propose a new construction that improves the computation cost of the stateoftheart BRB by avoiding the expensive online error correction on the input message, while achieving the same communication cost. Balanced communication: We propose a technique named balanced multicast that can balance the communication cost for BRB protocols where the broadcaster needs to multicast the message M while other nodes only needs to multicast coded fragments of size O(M/n + log n). The balanced multicast technique can be applied to many existing BRB protocols as well as all our new constructions in this paper, and can make every node incur about the same communication cost. Finally, we present a lower bound to show the near optimality of our protocol in terms of communication cost at each node.more » « less

We provide faster algorithms for approximating the optimal transport distance, e.g. earth mover's distance, between two discrete probability distributions on n elements. We present two algorithms that compute couplings between marginal distributions with an expected transportation cost that is within an additive ϵ of optimal in time O(n^2/eps); one algorithm is straightforward to parallelize and implementable in depth O(1/eps). Further, we show that additional improvements on our results must be coupled with breakthroughs in algorithmic graph theory.more » « less

We study the problem of estimating the covariance matrix of a highdimensional distribution when a small constant fraction of the samples can be arbitrarily corrupted. Recent work gave the first polynomial time algorithms for this problem with nearoptimal error guarantees for several natural structured distributions. Our main contribution is to develop faster algorithms for this problem whose running time nearly matches that of computing the empirical covariance. Given N = Ω(d^2/\eps^2) samples from a ddimensional Gaussian distribution, an \epsfraction of which may be arbitrarily corrupted, our algorithm runs in time O(d^{3.26}/ poly(\eps)) and approximates the unknown covariance matrix to optimal error up to a logarithmic factor. Previous robust algorithms with comparable error guarantees all have runtimes Ω(d^{2ω}) when \eps = Ω(1), where ω is the exponent of matrix multiplication. We also provide evidence that improving the running time of our algorithm may require new algorithmic techniques.more » « less