Characterizing the privacy degradation over compositions, i.e., privacy accounting, is a fundamental topic in differential privacy (DP) with many applications to differentially private machine learning and federated learning. We propose a unification of recent advances (Renyi DP, privacy profiles, 𝑓-DP and the PLD formalism) via the characteristic function (𝜙-function) of a certain dominating privacy loss random variable. We show that our approach allows natural adaptive composition like Renyi DP, provides exactly tight privacy accounting like PLD, and can be (often losslessly) converted to privacy profile and 𝑓-DP, thus providing (𝜖,𝛿)-DP guarantees and interpretable tradeoff functions. Algorithmically, we propose an analytical Fourier accountant that represents the complex logarithm of 𝜙-functions symbolically and uses Gaussian quadrature for numerical computation. On several popular DP mechanisms and their subsampled counterparts, we demonstrate the flexibility and tightness of our approach in theory and experiments.
more »
« less
Privacy Profiles for Private Selection
Private selection mechanisms (e.g., Report Noisy Max, Sparse Vector) are fundamental primitives of differentially private (DP) data analysis with wide applications to private query release, voting, and hyperparameter tuning. Recent work (Liu and Talwar, 2019; Papernot and Steinke, 2022) has made significant progress in both generalizing private selection mechanisms and tightening their privacy analysis using modern numerical privacy accounting tools, e.g., Rényi DP. But Rényi DP is known to be lossy when (ϵ,δ)-DP is ultimately needed, and there is a trend to close the gap by directly handling privacy profiles, i.e., δ as a function of ϵ or its equivalent dual form known as f-DPs. In this paper, we work out an easy-to-use recipe that bounds the privacy profiles of ReportNoisyMax and PrivateTuning using the privacy profiles of the base algorithms they corral. Numerically, our approach improves over the RDP-based accounting in all regimes of interest and leads to substantial benefits in end-to-end private learning experiments. Our analysis also suggests new distributions, e.g., binomial distribution for randomizing the number of rounds that leads to more substantial improvements in certain regimes.
more »
« less
- Award ID(s):
- 2048091
- PAR ID:
- 10565970
- Publisher / Repository:
- ICML 2024
- Date Published:
- Journal Name:
- Proceedings of Machine Learning Research
- ISSN:
- 2640-3498
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
f-DP has recently been proposed as a generalization of differential privacy allowing a lossless analysis of composition, post-processing, and privacy amplification via subsampling. In the setting of f-DP, we propose the concept of a canonical noise distribution (CND), the first mechanism designed for an arbitrary f-DP guarantee. The notion of CND captures whether an additive privacy mechanism perfectly matches the privacy guarantee of a given f. We prove that a CND always exists, and give a construction that produces a CND for any f. We show that private hypothesis tests are intimately related to CNDs, allowing for the release of private p-values at no additional privacy cost as well as the construction of uniformly most powerful (UMP) tests for binary data, within the general f-DP framework. We apply our techniques to the problem of difference of proportions testing, and construct a UMP unbiased (UMPU) "semi-private" test which upper bounds the performance of any f-DP test. Using this as a benchmark we propose a private test, based on the inversion of characteristic functions, which allows for optimal inference for the two population parameters and is nearly as powerful as the semi-private UMPU. When specialized to the case of (ϵ,0)-DP, we show empirically that our proposed test is more powerful than any (ϵ/sqrt(2))-DP test and has more accurate type I errors than the classic normal approximation test.more » « less
-
Evans, Robin J.; Shpitser, Ilya (Ed.)Most existing approaches of differentially private (DP) machine learning focus on private training. Despite its many advantages, private training lacks the flexibility in adapting to incremental changes to the training dataset such as deletion requests from exercising GDPR’s right to be forgotten. We revisit a long-forgotten alternative, known as private prediction, and propose a new algorithm named Individual Kernelized Nearest Neighbor (Ind-KNN). Ind-KNN is easily updatable over dataset changes and it allows precise control of the Rényi DP at an individual user level — a user’s privacy loss is measured by the exact amount of her contribution to predictions; and a user is removed if her prescribed privacy budget runs out. Our results show that Ind-KNN consistently improves the accuracy over existing private prediction methods for a wide range of epsilon on four vision and language tasks. We also illustrate several cases under which Ind-KNN is preferable over private training with NoisySGD.more » « less
-
The central question studied in this paper is Rényi Differential Privacy (RDP) guarantees for general discrete local randomizers in the shuffle privacy model. In the shuffle model, each of the 𝑛 clients randomizes its response using a local differentially private (LDP) mechanism and the untrusted server only receives a random permutation (shuffle) of the client responses without association to each client. The principal result in this paper is the first direct RDP bounds for general discrete local randomization in the shuffle pri- vacy model, and we develop new analysis techniques for deriving our results which could be of independent interest. In applications, such an RDP guarantee is most useful when we use it for composing several private interactions. We numerically demonstrate that, for important regimes, with composition our bound yields an improve- ment in privacy guarantee by a factor of 8× over the state-of-the-art approximate Differential Privacy (DP) guarantee (with standard composition) for shuffle models. Moreover, combining with Pois- son subsampling, our result leads to at least 10× improvement over subsampled approximate DP with standard composition.more » « less
-
Differential privacy is a de facto standard for statistical computations over databases that contain private data. Its main and rather surprising strength is to guarantee individual privacy and yet allow for accurate statistical results. Thanks to its mathematical definition, differential privacy is also a natural target for formal analysis. A broad line of work develops and uses logical methods for proving privacy. A more recent and complementary line of work uses statistical methods for finding privacy violations. Although both lines of work are practically successful, they elide the fundamental question of decidability. This paper studies the decidability of differential privacy. We first establish that checking differential privacy is undecidable even if one restricts to programs having a single Boolean input and a single Boolean output. Then, we define a non-trivial class of programs and provide a decision procedure for checking the differential privacy of a program in this class. Our procedure takes as input a program P parametrized by a privacy budget ϵ and either establishes the differential privacy for all possible values of ϵ or generates a counter-example. In addition, our procedure works for both to ϵ-differential privacy and (ϵ, δ)-differential privacy. Technically, the decision procedure is based on a novel and judicious encoding of the semantics of programs in our class into a decidable fragment of the first-order theory of the reals with exponentiation. We implement our procedure and use it for (dis)proving privacy bounds for many well-known examples, including randomized response, histogram, report noisy max and sparse vector.more » « less
An official website of the United States government

