Differentially private low-dimensional synthetic data from high-dimensional datasets

He, Yiyun; Strohmer, Thomas; Vershynin, Roman; Zhu, Yizhe

doi:10.1093/imaiai/iaae034

Citation Details

Differentially private synthetic data provide a powerful mechanism to enable data analysis while protecting sensitive information about individuals. However, when the data lie in a high-dimensional space, the accuracy of the synthetic data suffers from the curse of dimensionality. In this paper, we propose a differentially private algorithm to generate low-dimensional synthetic data efficiently from a high-dimensional dataset with a utility guarantee with respect to the Wasserstein distance. A key step of our algorithm is a private principal component analysis (PCA) procedure with a near-optimal accuracy bound that circumvents the curse of dimensionality. Unlike the standard perturbation analysis, our analysis of private PCA works without assuming the spectral gap for the covariance matrix.
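
As a rough illustration of what a private PCA step can look like, the sketch below perturbs an empirical second-moment matrix with symmetric Laplace noise and keeps the leading eigenvectors. It is a generic textbook-style construction, not the authors' algorithm, and it carries no claim about the accuracy bound stated above. The function name private_top_subspace, the clipping of rows to the unit ball, the use of the uncentered second-moment matrix, and the replace-one-record notion of neighboring datasets (with n treated as public) are all assumptions made for this example.

```python
import numpy as np


def private_top_subspace(X, d_prime, epsilon, rng):
    """Return a (d, d_prime) orthonormal basis from a noisy second-moment matrix.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    n, d = X.shape

    # Assumption: clip every row to Euclidean norm at most 1 so that a single
    # record has bounded influence on the second-moment matrix.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.maximum(norms, 1.0)

    # Uncentered empirical second-moment matrix (assumption: no centering).
    M = X.T @ X / n

    # Replacing one clipped row changes the upper triangle of M by at most
    # 2 * d / n in entrywise L1 norm, so Laplace noise with this scale on the
    # upper triangle gives epsilon-differential privacy for the noisy matrix.
    scale = 2.0 * d / (n * epsilon)
    noise = rng.laplace(0.0, scale, size=(d, d))

    # Mirror the upper triangle so the perturbation is symmetric.
    A = np.triu(noise) + np.triu(noise, 1).T
    M_noisy = M + A

    # eigh returns eigenvalues in ascending order; reverse to take the top ones.
    # Everything after the noise addition is post-processing.
    _, eigvecs = np.linalg.eigh(M_noisy)
    return eigvecs[:, ::-1][:, :d_prime]


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
V = private_top_subspace(X, d_prime=2, epsilon=1.0, rng=rng)
```

Only the returned basis V is private in this sketch. Projecting the original records onto V would again touch the sensitive data, so a complete pipeline would still need a separate private step to produce synthetic points in the low-dimensional subspace.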

Award ID(s): 1954233

PAR ID: 10616997

Author(s) / Creator(s): He, Yiyun; Strohmer, Thomas; Vershynin, Roman; Zhu, Yizhe

Publisher / Repository: Oxford University Press

Date Published: 2025-01-15

Journal Name: Information and Inference: A Journal of the IMA

Volume: 14

Issue: 1

ISSN: 2049-8772

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1093/imaiai/iaae034
