skip to main content


This content will become publicly available on May 1, 2024

Title: Approximating Functions with Approximate Privacy for Applications in Signal Estimation and Learning

Large corporations, government entities and institutions such as hospitals and census bureaus routinely collect our personal and sensitive information for providing services. A key technological challenge is designing algorithms for these services that provide useful results, while simultaneously maintaining the privacy of the individuals whose data are being shared. Differential privacy (DP) is a cryptographically motivated and mathematically rigorous approach for addressing this challenge. Under DP, a randomized algorithm provides privacy guarantees by approximating the desired functionality, leading to a privacy–utility trade-off. Strong (pure DP) privacy guarantees are often costly in terms of utility. Motivated by the need for a more efficient mechanism with better privacy–utility trade-off, we propose Gaussian FM, an improvement to the functional mechanism (FM) that offers higher utility at the expense of a weakened (approximate) DP guarantee. We analytically show that the proposed Gaussian FM algorithm can offer orders of magnitude smaller noise compared to the existing FM algorithms. We further extend our Gaussian FM algorithm to decentralized-data settings by incorporating the CAPE protocol and propose capeFM. Our method can offer the same level of utility as its centralized counterparts for a range of parameter choices. We empirically show that our proposed algorithms outperform existing state-of-the-art approaches on synthetic and real datasets.

 
more » « less
Award ID(s):
2148104
NSF-PAR ID:
10474206
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Entropy
Volume:
25
Issue:
5
ISSN:
1099-4300
Page Range / eLocation ID:
825
Subject(s) / Keyword(s):
["federated learning","differential privacy","objective perturbation"]
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Marc'Aurelio Ranzato, Alina Beygelzimer (Ed.)
    Implementations of the exponential mechanism in differential privacy often require sampling from intractable distributions. When approximate procedures like Markov chain Monte Carlo (MCMC) are used, the end result incurs costs to both privacy and accuracy. Existing work has examined these effects asymptotically, but implementable finite sample results are needed in practice so that users can specify privacy budgets in advance and implement samplers with exact privacy guarantees. In this paper, we use tools from ergodic theory and perfect simulation to design exact finite runtime sampling algorithms for the exponential mechanism by introducing an intermediate modified target distribution using artificial atoms. We propose an additional modification of this sampling algorithm that maintains its ǫ-DP guarantee and has improved runtime at the cost of some utility. We then compare these methods in scenarios where we can explicitly calculate a δ cost (as in (ǫ, δ)-DP) incurred when using standard MCMC techniques. Much as there is a well known trade-off between privacy and utility, we demonstrate that there is also a trade-off between privacy guarantees and runtime. 
    more » « less
  2. We provide an end-to-end Renyi DP based-framework for differentially private top-𝑘 selection. Unlike previous approaches, which require a data-independent choice on 𝑘, we propose to privately release a data-dependent choice of 𝑘 such that the gap between 𝑘-th and the (𝑘+1)st “quality” is large. This is achieved by an extension of the Report-Noisy-Max algorithm with a more concentrated Gaussian noise. Not only does this eliminates one hyperparameter, the adaptive choice of 𝑘 also certifies the stability of the top-𝑘 indices in the unordered set so we can release them using a combination of the propose-test-release (PTR) framework and the Distance-to-Stability mechanism. We show that our construction improves the privacy-utility trade-offs compared to the previous top-𝑘 selection algorithms theoretically and empirically. Additionally, we apply our algorithm to “Private Aggregation of Teacher Ensembles (PATE)” in multi-label classification tasks with a large number of labels and show that it leads to significant performance gains. 
    more » « less
  3. In this paper, we aim to develop a scalable algorithm to preserve differential privacy (DP) in adversarial learning for deep neural networks (DNNs), with certified robustness to adversarial examples. By leveraging the sequential composition theory in DP, we randomize both input and latent spaces to strengthen our certified robustness bounds. To address the trade-off among model utility, privacy loss, and robustness, we design an original adversarial objective function, based on the post-processing property in DP, to tighten the sensitivity of our model. A new stochastic batch training is proposed to apply our mechanism on large DNNs and datasets, by bypassing the vanilla iterative batch-by-batch training in DP DNNs. An end-to-end theoretical analysis and evaluations show that our mechanism notably improves the robustness and scalability of DP DNNs. 
    more » « less
  4. null (Ed.)
    In this paper, we aim to develop a scalable algorithm to preserve differential privacy (DP) in adversarial learning for deep neural networks (DNNs), with certified robustness to adversarial examples. By leveraging the sequential composition theory in DP, we randomize both input and latent spaces to strengthen our certified robustness bounds. To address the trade-off among model utility, privacy loss, and robustness, we design an original adversarial objective function, based on the post-processing property in DP, to tighten the sensitivity of our model. A new stochastic batch training is proposed to apply our mechanism on large DNNs and datasets, by bypassing the vanilla iterative batch-by-batch training in DP DNNs. An end-to-end theoretical analysis and evaluations show that our mechanism notably improves the robustness and scalability of DP DNNs. 
    more » « less
  5. techniques to protect user data privacy. A common way for utilizing private data under DP is to take an input dataset and synthesize a new dataset that preserves features of the input dataset while satisfying DP. A trade-off always exists between the strength of privacy protection and the utility of the final output: stronger privacy protection requires larger randomness, so the outputs usually have a larger variance and can be far from optimal. In this paper, we summarize our proposed metric for the NIST “A Better Meter Stick for Differential Privacy” competition [26], MarGinal Difference (MGD), for measuring the utility of a synthesized dataset. Our metric is based on earth mover distance. We introduce new features in our metric so that it is not affected by some small random noise that is unavoidable in the DP context but focuses more on the significant difference. We show that our metric can reflect the range query error better compared with other existing metrics. We introduce an efficient computation method based on the min-cost flow to alleviate the high computation cost of the earth mover’s distance. 
    more » « less