Title: Deciding Differential Privacy of Online Algorithms with Multiple Variables
We consider the problem of checking the differential privacy of online randomized algorithms that process a stream of inputs and produce an output for each input. This paper generalizes an automaton model called DiP automata [10] to describe such algorithms by allowing multiple real-valued storage variables. A DiP automaton is a parametric automaton whose behavior depends on the privacy budget ε. An automaton A is said to be differentially private if, for some D, the automaton is Dε-differentially private for all values of ε > 0. We identify a precise characterization of the class of all differentially private DiP automata. We show that the problem of determining whether a given DiP automaton belongs to this class is PSPACE-complete. Our PSPACE algorithm also computes a value for D when the given automaton is differentially private. The algorithm has been implemented, and experiments demonstrating its effectiveness are presented.
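For context, the guarantee above instantiates the standard definition of differential privacy with budget Dε. A minimal statement of that definition, in notation of our choosing (M for the randomized mechanism, σ and σ' for adjacent input streams differing in a single input):

    \[
      \Pr[M(\sigma) \in S] \;\le\; e^{D\epsilon} \cdot \Pr[M(\sigma') \in S]
      \quad \text{for all output sets } S \text{ and all adjacent } \sigma, \sigma'.
    \]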
Award ID(s): 1900924
PAR ID: 10559313
Publisher / Repository: ACM CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
ISBN: 9798400700507
Page Range / eLocation ID: 1761 to 1775
Location: Copenhagen, Denmark
Sponsoring Org: National Science Foundation
More Like this
  1. Given a data set of size n in d'-dimensional Euclidean space, the k-means problem asks for a set of k points (called centers) such that the sum, over the data points, of the squared l_2 distance to the nearest center is minimized. Previous work on this problem in the local differential privacy setting shows how to achieve multiplicative approximation factors arbitrarily close to optimal, but suffers from high additive error. The additive error has also been seen to be an issue in implementations of differentially private k-means clustering algorithms in both the central and local settings. In this work, we introduce a new locally private k-means clustering algorithm that achieves near-optimal additive error while retaining constant multiplicative approximation factors and round complexity. Concretely, given any c > sqrt(2), our algorithm achieves O(k^(1 + O(1/(2c^2 - 1))) * sqrt(d' n) * log d' * polylog n) additive error with an O(c^2) multiplicative approximation factor.
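    For reference, the k-means objective described above, written out (this is the standard formulation; the symbols X for the data set and C for the center set are ours):

      \[
        \mathrm{cost}(C) \;=\; \sum_{x \in X} \min_{c \in C} \lVert x - c \rVert_2^2,
        \qquad X \subseteq \mathbb{R}^{d'},\ |C| = k.
      \]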
  2.
    Differential privacy is a formal, mathematical definition of data privacy that has gained traction in academia, industry, and government. The task of correctly constructing differentially private algorithms is non-trivial, and mistakes have been made even in foundational algorithms. Currently, there is no automated support for converting an existing, non-private program into a differentially private version. In this paper, we propose a technique for automatically learning an accurate and differentially private version of a given non-private program. We show how to solve this difficult program synthesis problem via a combination of techniques: carefully picking representative example inputs, reducing the problem to continuous optimization, and mapping the results back to symbolic expressions. We demonstrate that our approach is able to learn foundational algorithms from the differential privacy literature and significantly outperforms natural program synthesis baselines.
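    To make the task concrete, here is a minimal Python sketch of the kind of non-private/private pair such a synthesizer relates: an exact counting query and its Laplace-mechanism version. This is a standard textbook construction used purely for illustration; the function names and the sensitivity-1 assumption are ours, and this is not output of the paper's system.

      import numpy as np

      def count_nonprivate(data, predicate):
          # Non-private program: exact count of records satisfying the predicate.
          return sum(1 for record in data if predicate(record))

      def count_private(data, predicate, epsilon):
          # Private version: a counting query has sensitivity 1, so adding
          # Laplace noise with scale 1/epsilon yields epsilon-differential
          # privacy (the standard Laplace mechanism).
          true_count = count_nonprivate(data, predicate)
          return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)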
  3. Meka, Raghu (Ed.)
    Differential privacy and sublinear algorithms are both rapidly emerging algorithmic themes in the era of big data analysis. Although recent works have shown the existence of differentially private sublinear algorithms for many problems, including graph parameter estimation and clustering, little is known about hardness results for such algorithms. In this paper, we initiate the study of lower bounds for problems that seek algorithms that are both differentially private and sublinear-time. Our main result is that the two desiderata are incompatible in general. In particular, we prove that a simple problem based on one-way marginals admits both a differentially private algorithm and a sublinear-time algorithm, but does not admit an algorithm that is both "strictly" sublinear-time and differentially private.
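    For context, the one-way marginals of a data set x_1, ..., x_n in {0,1}^d (the statistic underlying the problem mentioned above) are simply the coordinate-wise means; the notation below is ours:

      \[
        \bar{x} \;=\; \frac{1}{n} \sum_{i=1}^{n} x_i \;\in\; [0,1]^d.
      \]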
  4.
    Differential privacy has become a de facto standard for releasing data in a privacy-preserving way. Creating a differentially private algorithm is a process that often starts with a noise-free (non-private) algorithm. The designer then decides where to add noise and how much of it to add. This is a non-trivial process: if not done carefully, the algorithm might either violate differential privacy or have low utility. In this paper, we present DPGen, a program synthesizer that takes in non-private code (without any noise) and automatically synthesizes its differentially private version (with carefully calibrated noise). Under the hood, DPGen uses novel algorithms to automatically generate a sketch program with candidate locations for noise, and then optimizes the privacy proof and noise scales simultaneously on the sketch program. Moreover, DPGen can synthesize sophisticated mechanisms that adaptively process queries until a specified privacy budget is exhausted. When evaluated on standard benchmarks, DPGen generates differentially private mechanisms that optimize simple utility functions within 120 seconds. It is also powerful enough to synthesize adaptive privacy mechanisms.
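    A minimal Python sketch of the adaptive pattern mentioned above: answering queries until a specified privacy budget is exhausted. It uses the Laplace mechanism with basic sequential composition and is our illustration of the pattern, not code produced by DPGen; all names are hypothetical.

      import numpy as np

      def answer_adaptively(queries, data, total_budget, per_query_epsilon):
          # Answer sensitivity-1 queries with the Laplace mechanism until the
          # privacy budget runs out. By basic sequential composition, the total
          # privacy cost is per_query_epsilon times the number of answers given.
          answers = []
          spent = 0.0
          for q in queries:
              if spent + per_query_epsilon > total_budget:
                  break  # budget exhausted; stop answering further queries
              noisy_answer = q(data) + np.random.laplace(scale=1.0 / per_query_epsilon)
              answers.append(noisy_answer)
              spent += per_query_epsilon
          return answers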
  5. We provide improved differentially private algorithms for identity testing of high-dimensional distributions. Specifically, for d-dimensional Gaussian distributions with known covariance Σ, we can test whether the distribution comes from N(μ*, Σ) for some fixed μ* or from some N(μ, Σ) with total variation distance at least α from N(μ*, Σ), with (ε, 0)-differential privacy, using only O~(d^(1/2)/α^2 + d^(1/3)/(α^(4/3) ε^(2/3)) + 1/(α ε)) samples if the algorithm is allowed to be computationally inefficient, and only O~(d^(1/2)/α^2 + d^(1/4)/(α ε)) samples for a computationally efficient algorithm. We also provide a matching lower bound showing that our computationally inefficient algorithm has optimal sample complexity. We also extend our algorithms to various related problems, including mean testing of Gaussians with bounded but unknown covariance, uniformity testing of product distributions over {−1, 1}^d, and tolerant testing. Our results improve over the previous best work of Canonne et al. [CanonneKMUZ20] for both computationally efficient and inefficient algorithms, and even our computationally efficient algorithm matches the optimal non-private sample complexity of O(sqrt(d)/α^2) in many standard parameter settings. In addition, our results show that, surprisingly, private identity testing of d-dimensional Gaussians can be done with fewer samples than private identity testing of discrete distributions over a domain of size d [AcharyaSZ18], which refutes a conjectured lower bound of [CanonneKMUZ20].
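    For context, the testing task above in symbols (this is the standard formulation of identity testing; the notation is ours): given samples from an unknown distribution P, distinguish

      \[
        H_0 : P = \mathcal{N}(\mu^*, \Sigma)
        \qquad \text{vs.} \qquad
        H_1 : d_{\mathrm{TV}}\big(P, \mathcal{N}(\mu^*, \Sigma)\big) \ge \alpha,
      \]

    subject to (ε, 0)-differential privacy.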