Does Differential Privacy Impact Bias in Pretrained Language Models?

Islam, MK; Wang, A; Wang, T; Ji, Y; Fox, J; Zhao, J

Citation Details

Differential privacy (DP) is applied when fine-tuning pre-trained language models (LMs) to limit leakage of training examples. While most DP research has focused on improving a model’s privacy-utility tradeoff, some studies find that DP can be unfair to or biased against underrepresented groups. In this work, we extensively analyze the impact of DP on bias in LMs. We find that differentially private training can increase model bias against protected groups with respect to AUC-based bias metrics. DP makes it more difficult for the model to differentiate between positive and negative examples from the protected groups and from other groups in the rest of the population. Our results also show that the impact of DP on bias is affected by both the privacy protection level and the underlying distribution of the dataset.
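The abstract does not say which AUC-based bias metrics are used. As an illustration only, the sketch below assumes three metrics that are common in toxicity-classification work: subgroup AUC, BPSN AUC (background positive, subgroup negative), and BNSP AUC (background negative, subgroup positive). The function names, the toy data, and the choice of metrics are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. The loop is quadratic, which is fine
    for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _select(values: Sequence, keep: List[bool]) -> list:
    """Keep the entries of `values` whose mask entry is True."""
    return [v for v, k in zip(values, keep) if k]


def subgroup_auc(labels, scores, in_group) -> float:
    """AUC restricted to examples that mention the protected group."""
    keep = [bool(g) for g in in_group]
    return auc(_select(labels, keep), _select(scores, keep))


def bpsn_auc(labels, scores, in_group) -> float:
    """AUC on background positives plus subgroup negatives.

    A low value suggests the model scores negative in-group examples
    too high relative to positive examples from everyone else.
    """
    keep = [(not g and y == 1) or (g and y == 0) for y, g in zip(labels, in_group)]
    return auc(_select(labels, keep), _select(scores, keep))


def bnsp_auc(labels, scores, in_group) -> float:
    """AUC on background negatives plus subgroup positives.

    A low value suggests the model scores positive in-group examples
    too low relative to negative examples from everyone else.
    """
    keep = [(not g and y == 0) or (g and y == 1) for y, g in zip(labels, in_group)]
    return auc(_select(labels, keep), _select(scores, keep))


if __name__ == "__main__":
    # Toy data (hypothetical): first four examples mention the protected group.
    labels = [1, 0, 1, 0, 1, 0, 1, 0]
    in_group = [True, True, True, True, False, False, False, False]
    scores = [0.9, 0.6, 0.4, 0.2, 0.95, 0.1, 0.8, 0.3]
    print("overall :", auc(labels, scores))
    print("subgroup:", subgroup_auc(labels, scores, in_group))
    print("BPSN    :", bpsn_auc(labels, scores, in_group))
    print("BNSP    :", bnsp_auc(labels, scores, in_group))
```

Under these assumptions, one way to look for a DP effect is to compute the same metrics for a model fine-tuned without DP and for one fine-tuned with DP, then compare each metric per group. A drop in any of the three for a protected group would match the kind of increased bias the abstract describes.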

Award ID(s): 2151597

PAR ID: 10608315

Author(s) / Creator(s): Islam, MK; Wang, A; Wang, T; Ji, Y; Fox, J; Zhao, J

Corporate Creator(s): IEEE

Editor(s): Wang, H; Xiao, X

Publisher / Repository: IEEE Data Engineering Bulletin Vol. 48 No. 2, June 2024. ISSN 1053-1238

Date Published: 2024-06-20

Journal Name: IEEE Data Engineering Bulletin (Special Issue on Privacy-preserving Data Management) Vol. 48 No. 2, June 2024.

Edition / Version: 1

Volume: 48

Issue: 2

ISSN: 1053-1238

Page Range / eLocation ID: 125-137

Subject(s) / Keyword(s): Differential Privacy, Language Models

Format(s): Medium: X Size: 291KB Other: pdf

Size(s): 291KB

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
The DOI is not currently available.