Private Sampling: A Noiseless Approach for Generating Differentially Private Synthetic Data
- PAR ID: 10356612
- Date Published:
- Journal Name: SIAM Journal on Mathematics of Data Science
- Volume: 4
- Issue: 3
- ISSN: 2577-0187
- Page Range / eLocation ID: 1082 to 1115
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Differential Privacy (DP) is a mathematical definition that enshrines a formal guarantee that the output of a query does not depend greatly on any individual in the dataset. DP does not formalize a notion of "background information" and does not provide a guarantee about how much an output can be identifying to someone who has background information about an individual. In this paper, we argue that privately fine-tuning a pre-trained machine learning model on a private dataset using differential privacy does not always yield meaningful notions of privacy. Simply offering differential privacy guarantees in terms of (ε, δ) is insufficient to ensure human notions of privacy when the original training data is correlated with the fine-tuning dataset. We emphasize that, alongside differential privacy assurances, it is essential to report measures of dataset similarity and model attackability (for which model size can be a proxy). This work in progress is primarily a position piece, arguing for how DP should be used in practice and what future research needs to be conducted to better answer those questions.
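The guarantee described above — that a query's output does not depend greatly on any one individual — is typically achieved by adding calibrated noise. As a minimal illustration (not the mechanism used in either paper above), the classic Laplace mechanism releases a counting query under ε-differential privacy; the function name and parameters here are illustrative, not from the source:

```python
import numpy as np

def laplace_count(data, predicate, epsilon, rng=None):
    """Release a counting query under epsilon-differential privacy.

    A count has sensitivity 1 (adding or removing one individual changes
    it by at most 1), so Laplace noise with scale 1/epsilon suffices.
    """
    rng = rng or np.random.default_rng()
    true_count = sum(1 for x in data if predicate(x))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise
```

Smaller ε means a stronger privacy guarantee but a noisier answer; the position piece's point is that this (ε, δ) accounting alone may not match human expectations of privacy when pre-training and fine-tuning data are correlated.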