Differentially private estimation of U statistics

Kamalika Chaudhuri, Po-Ling Loh


In this paper, we consider the problem of private estimation of U statistics. U statistics are widely used estimators that arise naturally in a broad class of problems, from nonparametric signed rank tests to subgraph counts in random networks. A U statistic is simply the average of an appropriate kernel applied to all subsets of a given size (known as the degree) drawn from a sample of size n. However, despite the recent outpouring of interest in private mean estimation, private algorithms for more general U statistics have received little attention. We propose a framework in which, for a broad class of U statistics, one can use existing tools from private mean estimation to obtain confidence intervals where the private error does not overwhelm the irreducible error resulting from the variance of the U statistic. However, in specific cases that arise when the U statistic is degenerate or has vanishing moments, the private error may be of a larger order than the non-private error. To remedy this, we propose a new thresholding-based approach that uses Hájek projections to re-weight different subsets. As we show, this leads to more accurate inference in certain settings.
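
To make the definition in the abstract concrete, here is a minimal, non-private sketch of a U statistic in Python. It is an illustration only and is not taken from the paper: the function names `u_statistic` and `variance_kernel` and the toy data are assumptions chosen for this example. The kernel h(x, y) = (x - y)^2 / 2 has degree 2, and averaging it over all pairs recovers the usual unbiased sample variance.

```python
from itertools import combinations
from math import comb


def u_statistic(sample, kernel, degree):
    """Average of `kernel` over all size-`degree` subsets of `sample`.

    Hypothetical helper for illustration; it performs no privatization.
    """
    n = len(sample)
    if degree > n:
        raise ValueError("degree must not exceed the sample size")
    # Sum the kernel over every unordered subset of the given size.
    total = sum(kernel(*subset) for subset in combinations(sample, degree))
    # Divide by the number of subsets, n choose degree.
    return total / comb(n, degree)


def variance_kernel(x, y):
    """Degree-2 symmetric kernel whose U statistic is the sample variance."""
    return 0.5 * (x - y) ** 2


# Toy data, assumed purely for illustration.
data = [1.0, 2.0, 4.0, 7.0]
estimate = u_statistic(data, variance_kernel, degree=2)
print(estimate)
```

The brute-force enumeration costs on the order of n choose degree kernel evaluations, so this sketch is only practical for small samples or low degrees. A private estimator of the kind the abstract describes would add further steps that are not shown here.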

Award ID(s): 2019844

PAR ID: 10503105

Author(s) / Creator(s): Kamalika Chaudhuri, Po-Ling Loh

Editor(s): Under Review for COLT 2024

Publisher / Repository: Proceedings of Machine Learning Research

Date Published: 2024-01-01

Journal Name: Proceedings of Machine Learning Research

Volume: 196

ISSN: 2640-3498

Format(s): Medium: X

Sponsoring Org: National Science Foundation
