Most cloud service providers offer limited data privacy guarantees, discouraging clients from using them to manage their sensitive data. Cloud providers may use servers with Trusted Execution Environments (TEEs) to protect outsourced data while supporting remote querying. However, TEEs may leak access patterns and permit communication-volume attacks, enabling an honest-but-curious cloud provider to learn sensitive information. Oblivious algorithms can be used to completely hide data access patterns, but their high overhead can render them impractical. To mitigate this overhead, the notion of Differential Obliviousness (DO) has recently been proposed. DO applies differential privacy (DP) to access patterns while hiding the communication volume of intermediate and final results; it does so by trading some level of privacy for efficiency. We present Doquet: Differentially Oblivious Range and Join Queries with Private Data Structures, a framework for DO outsourced database systems. Doquet is the first approach that supports private data structures, indices, selection, foreign-key join, many-to-many join, and their composition select-join in a realistic TEE setting, even when accesses to private memory can be eavesdropped on by the adversary. We prove that the algorithms in Doquet satisfy differential obliviousness. Furthermore, we implemented Doquet and tested it on a machine with a second-generation Intel SGX (TEE); the results show that Doquet achieves up to an order of magnitude speedup over other fully oblivious and differentially oblivious approaches.
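To make the volume-hiding idea concrete, here is a minimal sketch of DP output padding of the kind DO systems employ; the function name, the shifted truncated-Laplace mechanism, and the parameters are illustrative assumptions, not Doquet's actual implementation.

```python
import math
import random

def dp_padded_size(true_size: int, epsilon: float, delta: float,
                   sensitivity: int = 1) -> int:
    """Return a noisy output volume that hides the true result size.

    A sketch, not Doquet's code: shifted, truncated Laplace noise keeps
    the padded size >= true_size, so results are padded with dummy
    tuples but never dropped; the shift keeps Pr[noise < 0] <= delta.
    """
    scale = sensitivity / epsilon
    shift = scale * math.log(1.0 / (2.0 * delta))
    # Laplace(0, scale) sampled as the difference of two exponentials.
    noise = shift + scale * (random.expovariate(1.0) - random.expovariate(1.0))
    return true_size + max(0, math.ceil(noise))

# Example: pad a join's intermediate result before it leaves the enclave.
padded_rows = dp_padded_size(true_size=1000, epsilon=0.5, delta=1e-6)
```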
A Theory of Composition for Differential Obliviousness
Differential obliviousness (DO) is a privacy notion which guarantees that the access patterns of a program satisfy differential privacy. It was studied in a sequence of recent works as a relaxation of full obliviousness. Earlier works showed that DO not only allows us to circumvent the logarithmic-overhead barrier of fully oblivious algorithms; in many cases it also achieves polynomial speedup over full obliviousness, since it avoids the "padding to the worst case" behavior of fully oblivious algorithms. Despite the promise of differential obliviousness (DO), a significant barrier that hinders its broad application is the lack of composability. In particular, when we apply one DO algorithm to the output of another DO algorithm, the composed algorithm may no longer be DO (with reasonable parameters). Specifically, the outputs of the first DO algorithm on two neighboring inputs may no longer be neighboring, so we cannot directly benefit from the DO guarantee of the second algorithm. In this work, we are the first to explore a theory of composition for differentially oblivious algorithms. We propose a refinement of the DO notion called (ε,δ)-neighbor-preserving DO, or (ε,δ)-NPDO for short, and we prove that our new notion indeed provides nice compositional guarantees. In this way, the algorithm designer can easily track the privacy loss when composing multiple DO algorithms. We give several example applications to showcase the power and expressiveness of the new NPDO notion. One of these examples is a result of independent interest: we use the compositional framework to prove an optimal privacy-amplification theorem for the differentially oblivious shuffle model. In other words, we show that for a class of distributed differentially private mechanisms in the shuffle model, one can replace the perfectly secure shuffler with a DO shuffler and nonetheless enjoy almost the same privacy amplification enabled by a shuffler.
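For reference, the DO guarantee being composed is the standard one from the DO literature; the notation below is ours, not necessarily the paper's.

```latex
% A randomized algorithm M is (\epsilon,\delta)-differentially oblivious
% if, for all neighboring inputs I \sim I' and every set S of access
% patterns,
\Pr\bigl[\mathsf{Accesses}(M, I) \in S\bigr]
  \;\le\; e^{\epsilon} \cdot \Pr\bigl[\mathsf{Accesses}(M, I') \in S\bigr] + \delta .
% Naive sequential composition breaks down because M(I) and M(I') need
% not be neighboring, so the second algorithm's DO guarantee cannot be
% invoked; NPDO refines DO so that outputs remain "neighbor-preserving".
```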
- Award ID(s):
- 1704788
- PAR ID:
- 10464147
- Date Published:
- Journal Name:
- Eurocrypt 2023
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Iterative algorithms, like gradient descent, are common tools for solving a variety of problems, such as model fitting. For this reason, there is interest in creating differentially private versions of them. However, their conversion to differentially private algorithms is often naive. For instance, a fixed number of iterations is chosen, the privacy budget is split evenly among them, and at each iteration, parameters are updated with a noisy gradient. In this paper, we show that gradient-based algorithms can be improved by a more careful allocation of privacy budget per iteration. Intuitively, at the beginning of the optimization, gradients are expected to be large, so they do not need to be measured as accurately. However, as the parameters approach their optimal values, the gradients decrease and hence need to be measured more accurately. We add a basic line-search capability that helps the algorithm decide when more accurate gradient measurements are necessary. Our gradient descent algorithm works with the recently introduced zCDP version of differential privacy. It outperforms prior algorithms for model fitting and is competitive with the state-of-the-art for $(ε,δ)$-differential privacy, a strictly weaker definition than zCDP.
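A minimal sketch of the budget-allocation idea under zCDP, where per-iteration budgets simply sum; the geometric schedule and all names here are illustrative assumptions, and the paper's line-search component is omitted.

```python
import numpy as np

def noisy_gradient_descent(per_example_grads, theta0, rho_total, num_iters,
                           lr=0.1, clip=1.0, seed=0):
    """Gradient descent under rho-zCDP with an increasing budget schedule.

    A sketch, not the paper's algorithm: per-iteration budgets rho_i grow
    geometrically, so early (large) gradients are measured coarsely and
    later (small) gradients more accurately; under zCDP the rho_i sum.
    """
    rng = np.random.default_rng(seed)
    weights = 1.5 ** np.arange(num_iters)           # geometric schedule
    rhos = rho_total * weights / weights.sum()      # sum(rhos) == rho_total
    theta = np.asarray(theta0, dtype=float)
    for rho_i in rhos:
        G = per_example_grads(theta)                # shape (n, d)
        norms = np.linalg.norm(G, axis=1, keepdims=True)
        G = G / np.maximum(1.0, norms / clip)       # per-example clipping
        n = G.shape[0]
        sens = 2.0 * clip / n                       # replace-one sensitivity of the mean
        sigma = sens / np.sqrt(2.0 * rho_i)         # Gaussian mech.: rho = sens^2/(2 sigma^2)
        theta = theta - lr * (G.mean(axis=0) + rng.normal(0.0, sigma, size=theta.shape))
    return theta

# Example: private least-squares fit on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5])
grads = lambda t: 2 * (X @ t - y)[:, None] * X      # per-example gradients
theta = noisy_gradient_descent(grads, np.zeros(3), rho_total=1.0, num_iters=40)
```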
-
Posterior sampling, i.e., using the exponential mechanism to sample from the posterior distribution, provides ε-pure differential privacy (DP) guarantees and does not suffer from the potentially unbounded privacy breach introduced by (ε,δ)-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing δ-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAmple Perturbation (abbr. ASAP) algorithm, which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity (W∞) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., δ=0). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in W∞ distance. We show that by combining our new techniques with a localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.
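For context, the Wasserstein-infinity distance used to calibrate the perturbation is the standard one; the notation below is ours.

```latex
% Wasserstein-infinity distance between distributions P and Q:
W_\infty(P, Q) \;=\; \inf_{\gamma \in \Gamma(P, Q)}
  \operatorname*{ess\,sup}_{(x, y) \sim \gamma} \; \lVert x - y \rVert ,
% where \Gamma(P, Q) is the set of couplings of P and Q. Intuitively,
% if the MCMC output is within W_infinity distance w of a pure-DP
% reference distribution, perturbing the sample with noise of magnitude
% proportional to w removes the residual \delta term.
```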
-
In the past decade, differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analysing important primitives like privacy amplification by subsampling. Inspired by the hypothesis-testing formulation of privacy, this paper proposes a new relaxation of differential privacy, which we term 'f-differential privacy' (f-DP). This notion of privacy has a number of appealing properties and, in particular, avoids difficulties associated with divergence-based relaxations. First, f-DP faithfully preserves the hypothesis-testing interpretation of differential privacy, thereby making the privacy guarantees easily interpretable. In addition, f-DP allows for lossless reasoning about composition in an algebraic fashion. Moreover, we provide a powerful technique to import existing results proven for the original differential privacy definition to f-DP and, as an application of this technique, obtain a simple and easy-to-interpret theorem of privacy amplification by subsampling for f-DP. In addition to the above findings, we introduce a canonical single-parameter family of privacy notions within the f-DP class that is referred to as 'Gaussian differential privacy' (GDP), defined based on hypothesis testing of two shifted Gaussian distributions. GDP is the focal privacy definition among the family of f-DP guarantees due to a central limit theorem for differential privacy that we prove. More precisely, the privacy guarantees of any hypothesis-testing-based definition of privacy (including the original differential privacy definition) converge to GDP in the limit under composition. We also prove a Berry-Esseen style version of the central limit theorem, which gives a computationally inexpensive tool for tractably analysing the exact composition of private algorithms. Taken together, this collection of attractive properties renders f-DP a mathematically coherent, analytically tractable and versatile framework for private data analysis. Finally, we demonstrate the use of the tools we develop by giving an improved analysis of the privacy guarantees of noisy stochastic gradient descent.
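Concretely, μ-GDP is defined via the trade-off (type I vs. type II error) function for testing N(0,1) against N(μ,1), which has a closed form:

```latex
% Trade-off function defining \mu-Gaussian differential privacy:
G_\mu(\alpha) \;=\; \Phi\bigl(\Phi^{-1}(1 - \alpha) - \mu\bigr),
  \qquad \alpha \in [0, 1],
% where \Phi is the standard normal CDF. A mechanism is \mu-GDP if its
% trade-off function on every pair of neighboring datasets is lower
% bounded by G_\mu; the central limit theorem described above shows that
% compositions of f-DP guarantees converge to exactly this family.
```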
-
Differential privacy and sublinear algorithms are both rapidly emerging algorithmic themes in times of big data analysis. Although recent works have shown the existence of differentially private sublinear algorithms for many problems, including graph parameter estimation and clustering, little is known regarding hardness results for these algorithms. In this paper, we initiate the study of lower bounds for problems that aim for both differentially private and sublinear-time algorithms. Our main result is the incompatibility of the two desiderata in the general case. In particular, we prove that a simple problem based on one-way marginals admits both a differentially private algorithm and a sublinear-time algorithm, but does not admit a "strictly" sublinear-time algorithm that is also differentially private.
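For concreteness, the one-way marginals of a binary dataset are its column means; the notation below is ours.

```latex
% One-way marginals of x \in \{0,1\}^{n \times d}:
q_j(x) \;=\; \frac{1}{n} \sum_{i=1}^{n} x_{i,j}, \qquad j = 1, \dots, d .
% Marginals admit a DP algorithm (e.g., Laplace noise) and, separately,
% a sublinear-time estimator (sample rows); the paper's lower bound is
% that a related problem admits no strictly sublinear-time algorithm
% that is simultaneously differentially private.
```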