Title: On the Rényi Differential Privacy of the Shuffle Model
The central question studied in this paper is Rényi Differential Privacy (RDP) guarantees for general discrete local randomizers in the shuffle privacy model. In the shuffle model, each of the n clients randomizes its response using a local differentially private (LDP) mechanism, and the untrusted server only receives a random permutation (shuffle) of the client responses without association to each client. The principal results in this paper are the first direct RDP bounds for general discrete local randomizers in the shuffle privacy model, and we develop new analysis techniques for deriving our results which could be of independent interest. In applications, such an RDP guarantee is most useful when we use it for composing several private interactions. We numerically demonstrate that, for important regimes, with composition our bound yields an improvement in privacy guarantee by a factor of 8× over the state-of-the-art approximate Differential Privacy (DP) guarantee (with standard composition) for shuffle models. Moreover, combined with Poisson subsampling, our result leads to at least a 10× improvement over subsampled approximate DP with standard composition.
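For intuition about why composing in the RDP domain helps, the hedged Python sketch below composes a placeholder per-round RDP curve over T rounds (RDP guarantees add at each fixed order α) and converts the result once with the standard RDP-to-DP formula ε = ε_RDP(α) + log(1/δ)/(α−1), comparing against basic composition of a per-round (ε₀, δ₀) guarantee. The curve eps_rdp, σ, T, and ε₀ are illustrative assumptions, not the paper's actual shuffle bound or reported numbers.

```python
import numpy as np

def eps_rdp(alpha, sigma=10.0):
    # Placeholder per-round RDP curve with a Gaussian-mechanism shape:
    # eps(alpha) = alpha / (2 * sigma^2). This is NOT the paper's shuffle bound.
    return alpha / (2.0 * sigma ** 2)

def eps_via_rdp(T, delta, alphas=np.arange(2, 257)):
    # T-fold composition adds RDP epsilons at every fixed order alpha; the
    # standard conversion then gives eps = eps_rdp + log(1/delta) / (alpha - 1).
    return min(T * eps_rdp(a) + np.log(1.0 / delta) / (a - 1.0) for a in alphas)

def eps_via_basic_composition(T, eps0):
    # Basic composition of a per-round (eps0, delta0)-DP guarantee: eps grows as T * eps0.
    return T * eps0

T, delta = 100, 1e-6
print(f"RDP composition, then convert : eps ~ {eps_via_rdp(T, delta):.2f}")
print(f"basic (eps, delta) composition: eps ~ {eps_via_basic_composition(T, eps0=0.5):.2f}")
```

With these placeholder parameters the RDP route gives a single-digit ε while basic composition gives T·ε₀, which mirrors (but does not reproduce) the kind of gap the abstract reports.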
Award ID(s):
1740047 2007714
NSF-PAR ID:
10313175
Journal Name:
ACM Symposium on Computer and Communication Security (CCS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Differential obliviousness (DO) is a privacy notion which guarantees that the access patterns of a program satisfy differential privacy. Differential obliviousness was studied in a sequence of recent works as a relaxation of full obliviousness. Earlier works showed that DO not only allows us to circumvent the logarithmic-overhead barrier of fully oblivious algorithms, but in many cases also allows us to achieve a polynomial speedup over full obliviousness, since it avoids the “padding to the worst-case” behavior of fully oblivious algorithms. Despite the promises of differential obliviousness (DO), a significant barrier that hinders its broad application is the lack of composability. In particular, when we apply one DO algorithm to the output of another DO algorithm, the composed algorithm may no longer be DO (with reasonable parameters). Specifically, the outputs of the first DO algorithm on two neighboring inputs may no longer be neighboring, and thus we cannot directly benefit from the DO guarantee of the second algorithm. In this work, we are the first to explore a theory of composition for differentially oblivious algorithms. We propose a refinement of the DO notion called (ε, δ)-neighbor-preserving-DO, or (ε, δ)-NPDO for short, and we prove that our new notion indeed provides nice compositional guarantees. In this way, the algorithm designer can easily track the privacy loss when composing multiple DO algorithms. We give several example applications to showcase the power and expressiveness of our new NPDO notion. One of these examples is a result of independent interest: we use the compositional framework to prove an optimal privacy amplification theorem for the differentially oblivious shuffle model. In other words, we show that for a class of distributed differentially private mechanisms in the shuffle model, one can replace the perfectly secure shuffler with a DO shuffler and nonetheless enjoy almost the same privacy amplification enabled by a shuffler.
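As a point of reference for what such privacy-loss tracking looks like, the sketch below implements plain basic composition of (ε, δ) guarantees: this is ordinary DP bookkeeping, not the (ε, δ)-NPDO composition theorem, and the class and parameter names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PrivacyAccountant:
    # Under standard basic composition, the (eps, delta) parameters of
    # sequentially composed mechanisms simply add. This is generic DP
    # accounting, not the NPDO composition theorem itself.
    eps: float = 0.0
    delta: float = 0.0

    def compose(self, eps_i: float, delta_i: float) -> None:
        # Record one more (eps_i, delta_i) algorithm applied to the data.
        self.eps += eps_i
        self.delta += delta_i

acct = PrivacyAccountant()
for eps_i, delta_i in [(0.5, 1e-6), (0.3, 1e-6), (0.2, 0.0)]:  # illustrative stages
    acct.compose(eps_i, delta_i)
print(f"composed guarantee: ({acct.eps:.2f}, {acct.delta:.1e})")
```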
  2. Differential privacy (DP) is a widely used notion for reasoning about privacy when publishing aggregate data. In this paper, we observe that certain DP mechanisms are amenable to a posteriori privacy analysis that exploits the fact that some outputs leak less information about the input database than others. To exploit this phenomenon, we introduce output differential privacy (ODP) and a new composition experiment, and leverage these new constructs to obtain significant privacy budget savings and improved privacy–utility tradeoffs under composition. All of this comes at no cost in terms of privacy; we do not weaken the privacy guarantee. To demonstrate the applicability of our a posteriori privacy analysis techniques, we analyze two well-known mechanisms: the Sparse Vector Technique and the Propose-Test-Release framework. We then show how our techniques can be used to save privacy budget in more general contexts: when a differentially private iterative mechanism terminates before its maximal number of iterations is reached, and when the output of a DP mechanism provides unsatisfactory utility. Examples of the former include iterative optimization algorithms, whereas examples of the latter include training a machine learning model with a large generalization error. Our techniques can be applied beyond the current paper to refine the analysis of existing DP mechanisms or guide the design of future mechanisms. 
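Since the Sparse Vector Technique is one of the mechanisms analyzed, here is a minimal sketch of its standard AboveThreshold form for sensitivity-1 queries, just to fix ideas about the object being re-analyzed; the a posteriori ODP analysis itself is not reproduced, and the example data and queries are hypothetical.

```python
import numpy as np

def above_threshold(queries, database, threshold, eps, rng=np.random.default_rng(0)):
    # Standard AboveThreshold for sensitivity-1 queries: perturb the threshold
    # once with Lap(2/eps), each answer with Lap(4/eps), and halt after the
    # first query reported as above threshold. The whole run is eps-DP.
    noisy_threshold = threshold + rng.laplace(scale=2.0 / eps)
    answers = []
    for q in queries:
        if q(database) + rng.laplace(scale=4.0 / eps) >= noisy_threshold:
            answers.append("above")
            break
        answers.append("below")
    return answers

# Example usage with counting queries (each has sensitivity 1).
data = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])
count_queries = [lambda d, v=v: float(np.sum(d == v)) for v in range(10)]
print(above_threshold(count_queries, data, threshold=2.5, eps=1.0))
```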
  3. Providing privacy protection has been one of the primary motivations of Federated Learning (FL). Recently, there has been a line of work on incorporating the formal privacy notion of differential privacy with FL. To guarantee client-level differential privacy in FL algorithms, the clients’ transmitted model updates have to be clipped before adding privacy noise. Such a clipping operation is substantially different from its counterpart of gradient clipping in centralized differentially private SGD and has not been well understood. In this paper, we first empirically demonstrate that clipped FedAvg can perform surprisingly well even with substantial data heterogeneity when training neural networks, which is partly because the clients’ updates become similar for several popular deep architectures. Based on this key observation, we provide a convergence analysis of a differentially private (DP) FedAvg algorithm and highlight the relationship between clipping bias and the distribution of the clients’ updates. To the best of our knowledge, this is the first work that rigorously investigates theoretical and empirical issues regarding the clipping operation in FL algorithms.
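To make the clipping operation discussed above concrete, the sketch below shows one round of a DP-FedAvg-style update in which each client's entire model delta (not individual gradients) is clipped in L2 norm before noisy aggregation. The toy local_update, clip_C, and noise_multiplier are illustrative placeholders, not this paper's algorithm or settings.

```python
import numpy as np

rng = np.random.default_rng(0)
num_clients, dim = 8, 5
clip_C = 1.0              # per-client L2 clipping threshold for the model update
noise_multiplier = 1.0    # server-side Gaussian noise scale, relative to clip_C

global_model = np.zeros(dim)

def local_update(client_id, model):
    # Placeholder for local training: returns this client's model delta.
    return rng.normal(scale=0.5, size=model.shape)

clipped = []
for k in range(num_clients):
    delta = local_update(k, global_model)
    delta *= min(1.0, clip_C / (np.linalg.norm(delta) + 1e-12))   # clip the whole update
    clipped.append(delta)

# Average the clipped updates and add Gaussian noise calibrated to clip_C.
noisy_avg = np.mean(clipped, axis=0) + rng.normal(
    scale=noise_multiplier * clip_C / num_clients, size=dim)
global_model = global_model + noisy_avg
print(global_model)
```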
  4. Federated learning (FL) enables distributed agents to collaboratively learn a centralized model without sharing their raw data with each other. However, data locality does not provide sufficient privacy protection, and it is desirable to facilitate FL with a rigorous differential privacy (DP) guarantee. Existing DP mechanisms would introduce random noise with magnitude proportional to the model size, which can be quite large in deep neural networks. In this paper, we propose a new FL framework with sparsification-amplified privacy. Our approach integrates random sparsification with gradient perturbation on each agent to amplify the privacy guarantee. Since sparsification would increase the number of communication rounds required to achieve a certain target accuracy, which is unfavorable for the DP guarantee, we further introduce acceleration techniques to help reduce the privacy cost. We rigorously analyze the convergence of our approach and utilize Rényi DP to tightly account for the end-to-end DP guarantee. Extensive experiments on benchmark datasets validate that our approach outperforms previous differentially private FL approaches in both privacy guarantee and communication efficiency.
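As a rough illustration of the two ingredients named above, the sketch below applies random sparsification to a gradient and then perturbs only the kept coordinates with Gaussian noise; keep_prob, clip, and sigma are illustrative assumptions, and the interplay with acceleration and the Rényi DP accounting is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparsify_and_perturb(grad, keep_prob=0.1, clip=1.0, sigma=1.0):
    # Randomly keep a fraction of coordinates, clip the surviving vector to
    # bound its sensitivity, then add Gaussian noise to the kept coordinates.
    mask = rng.random(grad.shape) < keep_prob
    sparse = grad * mask
    sparse *= min(1.0, clip / (np.linalg.norm(sparse) + 1e-12))
    return sparse + rng.normal(scale=sigma * clip, size=grad.shape) * mask

grad = rng.normal(size=1000)
private_grad = sparsify_and_perturb(grad)
print(f"coordinates transmitted: {np.count_nonzero(private_grad)} / {grad.size}")
```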

     
  5. Machine learning models are increasingly used in high-stakes decision-making systems. In such applications, a major concern is that these models sometimes discriminate against certain demographic groups, such as individuals of a certain race, gender, or age. Another major concern in these applications is the violation of user privacy. While fair learning algorithms have been developed to mitigate discrimination issues, these algorithms can still leak sensitive information, such as individuals’ health or financial records. Utilizing the notion of differential privacy (DP), prior works aimed at developing learning algorithms that are both private and fair. However, existing algorithms for DP fair learning are either not guaranteed to converge or require a full batch of data in each iteration of the algorithm to converge. In this paper, we provide the first stochastic differentially private algorithm for fair learning that is guaranteed to converge. Here, the term “stochastic” refers to the fact that our proposed algorithm converges even when minibatches of data are used at each iteration (i.e., stochastic optimization). Our framework is flexible enough to permit different fairness notions, including demographic parity and equalized odds. In addition, our algorithm can be applied to non-binary classification tasks with multiple (non-binary) sensitive attributes. As a byproduct of our convergence analysis, we provide the first utility guarantee for a DP algorithm for solving nonconvex-strongly-concave min-max problems. Our numerical experiments show that the proposed algorithm consistently offers significant performance gains over the state-of-the-art baselines and can be applied to larger-scale problems with non-binary target/sensitive attributes.