When is Differentially Private Finetuning Private?

Rinberg, Roy; Pawelczyk, Martin

Citation Details

Differential Privacy (DP) is a mathematical definition that enshrines a formal guarantee that the output of a query does not depend greatly on any individual in the dataset. DP does not formalize a notion of "background information" and does not provide a guarantee about how much an output can be identifying to someone who has background information about an individual. In this paper, we argue that privately fine-tuning a pre-trained machine learning model on a private dataset using differential privacy does not always yield meaningful notions of privacy. Simply offering differential privacy guarantees in terms of (ε, δ) is insufficient to ensure human notions privacy, when the original training data is correlated with the fine-tuning dataset. We emphasize that, alongside differential privacy assurances, it is essential to report measures of dataset similarity and model attackability (for which model-size can be a proxy). This is a work in progress; this work is primarily a position piece, arguing for how DP should be used in practice, and what future research needs to be conducted in order to better answer those questions. more »

Award ID(s):: 2218803

PAR ID:: 10642778

Author(s) / Creator(s):: Rinberg, Roy; Pawelczyk, Martin

Publisher / Repository:: Statistical Foundations of LLMs and Foundation Models (NeurIPS 2024 Workshop)

Date Published:: 2024-10-11

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
The DOI is not currently available.

More Like this