- Award ID(s):
- 1702760
- NSF-PAR ID:
- 10093960
- Date Published:
- Journal Name:
- Programming Language Design and Implementation (PLDI)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Differential privacy is a de facto standard for statistical computations over databases that contain private data. Its main and rather surprising strength is to guarantee individual privacy and yet allow for accurate statistical results. Thanks to its mathematical definition, differential privacy is also a natural target for formal analysis. A broad line of work develops and uses logical methods for proving privacy. A more recent and complementary line of work uses statistical methods for finding privacy violations. Although both lines of work are practically successful, they elide the fundamental question of decidability. This paper studies the decidability of differential privacy. We first establish that checking differential privacy is undecidable even if one restricts to programs having a single Boolean input and a single Boolean output. Then, we define a non-trivial class of programs and provide a decision procedure for checking the differential privacy of a program in this class. Our procedure takes as input a program P parametrized by a privacy budget ϵ and either establishes the differential privacy for all possible values of ϵ or generates a counter-example. In addition, our procedure works for both to ϵ-differential privacy and (ϵ, δ)-differential privacy. Technically, the decision procedure is based on a novel and judicious encoding of the semantics of programs in our class into a decidable fragment of the first-order theory of the reals with exponentiation. We implement our procedure and use it for (dis)proving privacy bounds for many well-known examples, including randomized response, histogram, report noisy max and sparse vector.more » « less
-
Automated deductive program synthesis promises to generate executable programs from concise specifications, along with proofs of correctness that can be independently verified using third-party tools. However, an attempt to exercise this promise using existing proof-certification frameworks reveals significant discrepancies in how proof derivations are structured for two different purposes: program synthesis and program verification. These discrepancies make it difficult to use certified verifiers to validate synthesis results, forcing one to write an ad-hoc translation procedure from synthesis proofs to correctness proofs for each verification backend. In this work, we address this challenge in the context of the synthesis and verification of heap-manipulating programs. We present a technique for principled translation of deductive synthesis derivations (a.k.a. source proofs) into deductive target proofs about the synthesised programs in the logics of interactive program verifiers. We showcase our technique by implementing three different certifiers for programs generated via SuSLik, a Separation Logic-based tool for automated synthesis of programs with pointers, in foundational verification frameworks embedded in Coq: Hoare Type Theory (HTT), Iris, and Verified Software Toolchain (VST), producing concise and efficient machine-checkable proofs for characteristic synthesis benchmarks.more » « less
-
Intermediate verification languages like Why3 and Boogie have made it much easier to build program verifiers, transforming the process into a logic compilation problem rather than a proof automation one. Why3 in particular implements a rich logic for program specification with polymorphism, algebraic data types, recursive functions and predicates, and inductive predicates; it translates this logic to over a dozen solvers and proof assistants. Accordingly, it serves as a backend for many tools, including Frama-C, EasyCrypt, and GNATProve for Ada SPARK. But how can we be sure that these tools are correct? The alternate foundational approach, taken by tools like VST and CakeML, provides strong guarantees by implementing the entire toolchain in a proof assistant, but these tools are harder to build and cannot directly take advantage of SMT solver automation. As a first step toward enabling automated tools with similar foundational guarantees, we give a formal semantics in Coq for the logic fragment of Why3. We show that our semantics are useful by giving a correct-by-construction natural deduction proof system for this logic, using this proof system to verify parts of Why3's standard library, and proving sound two of Why3's transformations used to convert terms and formulas into the simpler logics supported by the backend solvers.
-
null (Ed.)Noisy Max and Sparse Vector are selection algorithms for differential privacy and serve as building blocks for more complex algorithms. In this paper we show that both algorithms can release additional information for free (i.e., at no additional privacy cost). Noisy Max is used to return the approximate maximizer among a set of queries. We show that it can also release for free the noisy gap between the approximate maximizer and runner-up. This free information can improve the accuracy of certain subsequent counting queries by up to 50%. Sparse Vector is used to return a set of queries that are approximately larger than a fixed threshold. We show that it can adaptively control its privacy budget (use less budget for queries that are likely to be much larger than the threshold) in order to increase the amount of queries it can process. These results follow from a careful privacy analysis.more » « less
-
null (Ed.)Differential privacy has become a de facto standard for releasing data in a privacy-preserving way. Creating a differentially private algorithm is a process that often starts with a noise-free (nonprivate) algorithm. The designer then decides where to add noise, and how much of it to add. This can be a non-trivial process – if not done carefully, the algorithm might either violate differential privacy or have low utility. In this paper, we present DPGen, a program synthesizer that takes in non-private code (without any noise) and automatically synthesizes its differentially private version (with carefully calibrated noise). Under the hood, DPGen uses novel algorithms to automatically generate a sketch program with candidate locations for noise, and then optimize privacy proof and noise scales simultaneously on the sketch program. Moreover, DPGen can synthesize sophisticated mechanisms that adaptively process queries until a specified privacy budget is exhausted. When evaluated on standard benchmarks, DPGen is able to generate differentially private mechanisms that optimize simple utility functions within 120 seconds. It is also powerful enough to synthesize adaptive privacy mechanisms.more » « less