Distributed learning allows a group of independent data owners to collaboratively learn a model over their data sets without exposing their private data. We present a distributed learning approach that combines differential privacy with secure multi-party computation. We explore two popular methods of differential privacy, output perturbation and gradient perturbation, and advance the state-of-the-art for both methods in the distributed learning setting. In our output perturbation method, the parties combine local models within a secure computation and then add the required differential privacy noise before revealing the model. In our gradient perturbation method, the data owners collaboratively train a global model via an iterative learning algorithm. At each iteration, the parties aggregate their local gradients within a secure computation, adding sufficient noise to ensure privacy before the gradient updates are revealed. For both methods, we show that the noise can be reduced in the multi-party setting by adding the noise inside the secure computation after aggregation, asymptotically improving upon the best previous results. Experiments on real-world data sets demonstrate that our methods provide substantial utility gains for typical privacy requirements.
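To make the gradient perturbation method concrete, here is a minimal Python sketch of one aggregation round, with an ordinary function standing in for the secure computation (all names and parameters are illustrative, not taken from the paper): each party's gradient is clipped to bound its influence, the clipped gradients are summed, and a single dose of Gaussian noise is added to the aggregate before it is revealed.

```python
import numpy as np

def secure_aggregate_with_noise(local_gradients, clip_norm, sigma, rng):
    """Stand-in for the MPC step: sum clipped local gradients and add
    one dose of Gaussian noise before the aggregate is revealed. In a
    real deployment this would run inside a secure computation, so no
    party sees another party's gradient or the pre-noise sum."""
    total = np.zeros_like(local_gradients[0])
    for g in local_gradients:
        norm = np.linalg.norm(g)
        if norm > clip_norm:               # clip to bound each party's influence
            g = g * (clip_norm / norm)
        total += g
    noise = rng.normal(0.0, sigma * clip_norm, size=total.shape)
    return (total + noise) / len(local_gradients)

# One iteration with five parties, each holding a local gradient.
rng = np.random.default_rng(0)
grads = [rng.normal(size=10) for _ in range(5)]
update = secure_aggregate_with_noise(grads, clip_norm=1.0, sigma=1.0, rng=rng)
```

Because a single noise draw covers the whole aggregate, its magnitude need not grow with the number of parties, which is the intuition behind the improvement over having each party add noise locally.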
Combatting The Challenges of Local Privacy for Distributional Semantics with Compression
Traditional methods for adding locally private noise to bag-of-words features overwhelm the true signal in the text data, destroying the sparsity and non-negativity that distributional semantic models often rely on. We argue that limited-precision local privacy, which guarantees privacy only between documents within a user-specified maximum distance, is a more appropriate framework for bag-of-words features. To reduce the number of features to which we must add random noise, we also compress word features before adding noise and decompress them before model inference. We test randomized methods of aggregation as well as methods informed by distributional properties of words. Applying LDA and LSA to synthetic and real data, we show that these approaches produce distributional models closer to those learned from the original data.
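A minimal sketch of the compress-noise-decompress pipeline, assuming a Johnson-Lindenstrauss-style random projection as the compression step (one plausible instance of the randomized aggregation tested in the paper; the Laplace scale below is schematic, not a calibrated limited-precision guarantee):

```python
import numpy as np

def privatize_bow(x, k, epsilon, max_dist, rng):
    """Compress a bag-of-words vector x to k dimensions, add noise to
    the k compressed features instead of all d original ones, then
    decompress for model inference. Noise is scaled to a user-chosen
    maximum distance, echoing limited-precision local privacy, which
    only guarantees indistinguishability within that distance."""
    d = x.shape[0]
    P = rng.choice([-1.0, 1.0], size=(k, d)) / np.sqrt(k)  # random projection
    z = P @ x                                      # compress: d -> k features
    z += rng.laplace(0.0, max_dist / epsilon, size=k)  # schematic noise scale
    x_hat = P.T @ z                                # approximate decompression
    return np.maximum(x_hat, 0.0)                  # restore non-negativity for LDA/LSA

rng = np.random.default_rng(0)
doc = rng.poisson(0.1, size=5000).astype(float)    # sparse word counts
private_doc = privatize_bow(doc, k=256, epsilon=1.0, max_dist=5.0, rng=rng)
```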
- Award ID(s): 1652536
- PAR ID: 10162559
- Date Published:
- Journal Name: PriML workshop at NeurIPS
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Distributional drift detection is important in medical applications because it helps ensure the accuracy and reliability of machine learning models by identifying changes in the underlying data distribution that could affect their predictions. However, current methods have limitations; for example, the inclusion of abnormal datasets can lead to unfair comparisons. This paper presents an accurate and sensitive approach to detecting distributional drift in CT-scan medical images by leveraging data-sketching and fine-tuning techniques. We developed a robust baseline library model for real-time anomaly detection, allowing for efficient comparison of incoming images and identification of anomalies. Additionally, we fine-tuned a pre-trained Vision Transformer model to extract relevant features, using mammography as a case study, significantly enhancing model accuracy to 99.11%. Combining data sketches with fine-tuning, our feature-extraction evaluation showed that cosine similarity scores between similar datasets improved substantially, from around 50% to 99.1%. Finally, the sensitivity evaluation shows that our solution is highly sensitive to even 1% salt-and-pepper and speckle noise, while insensitive to lighting noise (i.e., lighting conditions have no impact on detected drift). The proposed methods offer a scalable and reliable solution for maintaining the accuracy of diagnostic models in dynamic clinical environments.
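The comparison step itself is simple to sketch; here a plain list of reference feature vectors stands in for the paper's data-sketch library, and the feature extractor and threshold are illustrative assumptions:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def detect_drift(incoming_feats, baseline_feats, threshold=0.95):
    """Flag an incoming image as drifted when its best cosine
    similarity against the baseline library falls below the
    threshold. Features would come from the fine-tuned ViT."""
    return [max(cosine(f, b) for b in baseline_feats) < threshold
            for f in incoming_feats]

rng = np.random.default_rng(0)
baseline = [rng.normal(size=768) for _ in range(100)]  # reference library
incoming = [rng.normal(size=768) for _ in range(10)]   # new scans' features
print(detect_drift(incoming, baseline))
```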
Abstract
Background: Natural language processing (NLP) tasks in the health domain often deal with limited amounts of labeled data due to high annotation costs and naturally rare observations. To compensate for the lack of training data, health NLP researchers often have to leverage knowledge and resources external to the task at hand. Recently, pretrained large-scale language models such as Bidirectional Encoder Representations from Transformers (BERT) have proven to be a powerful way of learning rich linguistic knowledge from massive unlabeled text and transferring that knowledge to downstream tasks. However, previous downstream tasks often used training data at a scale that is unlikely to be obtainable in the health domain. In this work, we aim to study whether BERT can still benefit downstream tasks when training data are relatively small in the context of health NLP.
Method: We conducted a learning curve analysis to study the behavior of BERT and baseline models as training data size increases. We observed the classification performance of these models on two disease diagnosis data sets, where some diseases are naturally rare and have very limited observations (fewer than 2 out of 10,000). The baselines included commonly used text classification models such as sparse and dense bag-of-words models, long short-term memory networks, and their variants that leveraged external knowledge. To obtain learning curves, we incremented the number of training examples per disease from small to large and measured classification performance in macro-averaged F1 score.
Results: On the task of classifying all diseases, the learning curves of BERT were consistently above all baselines, significantly outperforming them across the spectrum of training data sizes. But under extreme situations where only one or two training documents per disease were available, BERT was outperformed by linear classifiers with carefully engineered bag-of-words features.
Conclusion: As long as the number of training documents is not extremely small, fine-tuning a pretrained BERT model is a highly effective approach to health NLP tasks like disease classification. However, in extreme cases where each class has only one or two training documents and no more will be available, simple linear models using bag-of-words features should be considered.
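A sketch of the learning-curve protocol using one bag-of-words linear baseline (the vectorizer, classifier, and subsample sizes are illustrative choices, not the paper's exact setup):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

def learning_curve(train_texts, train_labels, test_texts, test_labels,
                   sizes, rng):
    """For each n in sizes, subsample n training documents per class,
    fit a bag-of-words linear model, and record macro-averaged F1 on
    a fixed test set."""
    labels = np.asarray(train_labels)
    scores = []
    for n in sizes:
        idx = []
        for c in np.unique(labels):
            c_idx = np.flatnonzero(labels == c)
            idx.extend(rng.choice(c_idx, size=min(n, len(c_idx)),
                                  replace=False))
        model = make_pipeline(TfidfVectorizer(),
                              LogisticRegression(max_iter=1000))
        model.fit([train_texts[i] for i in idx], labels[idx])
        scores.append(f1_score(test_labels, model.predict(test_texts),
                               average="macro"))
    return scores  # e.g., sizes=[1, 2, 5, 10, 50] traces the curve
```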
Abstract
Machine learning (ML) models are used for in-situ monitoring in additive manufacturing (AM) for defect detection. However, sensitive information stored in ML models, such as part designs, is at risk of data leakage due to unauthorized access. To address this, differential privacy (DP) introduces noise into ML, outperforming cryptography, which is slow, and data anonymization, which does not guarantee privacy. While DP enhances privacy, it reduces the precision of defect detection. This paper proposes combining DP with Hyperdimensional Computing (HDC), a brain-inspired model that memorizes training sample information in a large hyperspace, to optimize real-time monitoring in AM while protecting privacy. Adding DP noise to the HDC model protects sensitive information without compromising defect detection accuracy. Our studies demonstrate the effectiveness of this approach in monitoring anomalies, such as overhangs, using high-speed melt pool data analysis. With a privacy budget set at 1, our model achieved an F-score of 94.30%, surpassing traditional models like ResNet50, DenseNet201, EfficientNet B2, and AlexNet, whose performance reached at most 66%. Thus, the intersection of DP and HDC promises accurate defect detection and protection of sensitive information in AM. The proposed method can also be extended to other AM processes, such as fused filament fabrication.
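A schematic Python sketch of the idea: class prototypes are bundled in hyperspace and a dose of Laplace noise is added to each prototype before the model is released (the encoder, the sensitivity bookkeeping, and the noise placement are assumptions for illustration; the paper's exact mechanism may differ):

```python
import numpy as np

D = 10_000  # dimensionality of the hyperspace

def encode(x, projection):
    """Map a raw feature vector into hyperspace via a fixed random
    projection and binarize, a common HDC encoding choice."""
    return np.sign(projection @ x)

def train_private_hdc(samples, labels, epsilon, rng):
    """Bundle encoded samples into per-class prototype hypervectors,
    then add Laplace noise to each prototype before release."""
    projection = rng.normal(size=(D, samples.shape[1]))
    prototypes = {}
    for c in np.unique(labels):
        bundle = np.sum([encode(x, projection)
                         for x in samples[labels == c]], axis=0)
        # replacing one sample moves each coordinate by at most 2;
        # the per-coordinate calibration below is schematic
        prototypes[c] = bundle + rng.laplace(0.0, 2.0 / epsilon, size=D)
    return projection, prototypes

def classify(x, projection, prototypes):
    h = encode(x, projection)
    return max(prototypes, key=lambda c: float(h @ prototypes[c]))
```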
Label differential privacy is a relaxation of differential privacy for machine learning scenarios where the labels are the only sensitive information that needs to be protected in the training data. For example, imagine a survey of participants in a university class about their vaccination status: some attributes of the students are publicly available, but their vaccination status is sensitive and must remain private. If we want to train a model that predicts whether a student has been vaccinated using only their public information, we can use label-DP. Recent works on label-DP use different ways of adding noise to the labels in order to obtain label-DP models. In this work, we present novel techniques for training models with label-DP guarantees by leveraging unsupervised and semi-supervised learning, enabling us to inject less noise while obtaining the same privacy, and therefore achieving a better utility-privacy trade-off. We first introduce a framework that starts with an unsupervised classifier f0 and a dataset D with noisy label set Y, reduces the noise in Y using f0, and then trains a new model f on the less noisy dataset. Our noise reduction strategy uses the model f0 to remove noisy labels that are incorrect with high probability. We then use semi-supervised learning to train a model on the remaining labels. We instantiate this framework with multiple ways of obtaining the noisy labels and the base classifier. As an alternative way to reduce the noise, we explore the effect of unsupervised learning: we add noise only to a majority-voting step that associates the learned clusters with cluster labels (as opposed to adding noise to individual labels); the reduced sensitivity enables us to add less noise. Our experiments show that these techniques can significantly outperform prior works on label-DP.
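A sketch of the framework's front end under illustrative assumptions: k-ary randomized response produces the noisy label set Y, and predictions plus hypothetical confidence scores from the base classifier f0 filter out labels that are likely wrong before the semi-supervised stage:

```python
import numpy as np

def randomized_response(y, k, epsilon, rng):
    """k-ary randomized response: keep the true label with probability
    e^eps / (e^eps + k - 1), otherwise report a uniformly random other
    label; the released labels satisfy epsilon-label-DP."""
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    y_noisy = y.copy()
    for i in np.flatnonzero(rng.random(len(y)) >= p_keep):
        y_noisy[i] = rng.choice([c for c in range(k) if c != y[i]])
    return y_noisy

def keep_mask(y_noisy, f0_preds, f0_conf, tau=0.9):
    """Noise reduction as DP post-processing: keep a noisy label when
    it agrees with f0 or when f0 is unconfident; dropped points stay
    unlabeled for semi-supervised training. tau is illustrative."""
    return (y_noisy == f0_preds) | (f0_conf < tau)

rng = np.random.default_rng(0)
y = rng.integers(0, 10, size=1000)  # true labels over 10 classes
y_noisy = randomized_response(y, k=10, epsilon=2.0, rng=rng)
```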