NSF PAR Search | NSF Public Access Repository

Metric Differential Privacy at the User-Level via the Earth-Mover's Distance

https://doi.org/10.1145/3658644.3690363

Imola, Jacob; Roy Chowdhury, Amrita; Chaudhuri, Kamalika (December 2024, ACM)

Full Text Available
Metric Differential Privacy at the User-Level Via the Earth Mover's Distance

Imola, Jacob; Roy-Chowdhury, Amrita; Chaudhuri, Kamalika (October 2024, Proceedings of the ACM Conference on Computer and Communications Security)

Full Text Available
FairProof: Confidential and Certifiable Fairness for Neural Networks

Yadav, Chhavi; RoyChowdhury, Amrita; Boneh, Dan; Chaudhuri, Kamalika (July 2024, Proceedings of Machine Learning Research)

Machine learning models are increasingly used in societal applications, yet legal and privacy concerns demand that they very often be kept confidential. Consequently, there is a growing distrust about the fairness properties of these models in the minds of consumers, who are often at the receiving end of model predictions. To this end, we propose FairProof, a system that uses Zero-Knowledge Proofs (a cryptographic primitive) to publicly verify the fairness of a model while maintaining confidentiality. We also propose a fairness certification algorithm for fully-connected neural networks that is well suited to ZKPs and is used in this system. We implement FairProof in Gnark and demonstrate empirically that our system is practically feasible.
Full Text Available
FairProof: Confidential and Certifiable Fairness for Neural Networks

Yadav, Chhavi; Roy-Chowdhury, Amrita; Boneh, Dan; Chaudhuri, Kamalika (July 2024, Proceedings of Machine Learning Research)

Machine learning models are increasingly used in societal applications, yet legal and privacy concerns demand that they very often be kept confidential. Consequently, there is a growing distrust about the fairness properties of these models in the minds of consumers, who are often at the receiving end of model predictions. To this end, we propose FairProof, a system that uses Zero-Knowledge Proofs (a cryptographic primitive) to publicly verify the fairness of a model while maintaining confidentiality. We also propose a fairness certification algorithm for fully-connected neural networks that is well suited to ZKPs and is used in this system. We implement FairProof in Gnark and demonstrate empirically that our system is practically feasible.
Full Text Available
FairProof: Confidential and Certifiable Fairness for Neural Networks

Yadav, Chhavi; Chowdhury, Amrita Roy; Boneh, Dan; Chaudhuri, Kamalika (July 2024, Proceedings of the 41st International Conference on Machine Learning, PMLR 235:55682-55705, 2024.)

Full Text Available
On Differentially Private U Statistics

Chaudhuri, Kamalika; Loh, Po-Ling; Pandey, Shourya; Sarkar, Purnamrita (July 2024, https://doi.org/10.48550/arXiv.2407.04945)

We consider the problem of privately estimating a parameter 𝔼[h(X1,…,Xk)], where X1, X2, …, Xk are i.i.d. data from some distribution and h is a permutation-invariant function. Without privacy constraints, standard estimators are U-statistics, which commonly arise in a wide range of problems, including nonparametric signed rank tests, symmetry testing, uniformity testing, and subgraph counts in random networks, and can be shown to be minimum variance unbiased estimators under mild conditions. Despite the recent outpouring of interest in private mean estimation, privatizing U-statistics has received little attention. While existing private mean estimation algorithms can be applied to obtain confidence intervals, we show that they can lead to suboptimal private error, e.g., constant-factor inflation in the leading term, or even Θ(1/n) rather than O(1/n²) in degenerate settings. To remedy this, we propose a new thresholding-based approach using local Hájek projections to reweight different subsets of the data. This leads to nearly optimal private error for non-degenerate U-statistics and a strong indication of near-optimality for degenerate U-statistics.
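For orientation, the non-private baseline that this abstract starts from can be sketched in a few lines of Python. This is only an illustration of a plain U-statistic (the average of a symmetric kernel h over all size-k subsets of the sample). It is not the private thresholding algorithm the authors propose. The names `u_statistic` and `variance_kernel` are hypothetical, chosen for this sketch.

```python
from itertools import combinations
from math import comb


def u_statistic(data, kernel, k):
    """Average a permutation-invariant kernel over all size-k subsets of data.

    Non-private baseline only; enumerates all C(n, k) subsets, so it is
    meant for small illustrative samples.
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(data)")
    total = sum(kernel(*subset) for subset in combinations(data, k))
    return total / comb(n, k)


def variance_kernel(x, y):
    """Order-2 kernel whose U-statistic is the unbiased sample variance."""
    return (x - y) ** 2 / 2


sample = [1.0, 2.0, 3.0, 4.0]
estimate = u_statistic(sample, variance_kernel, 2)
print(estimate)
```

With k = 1 and the identity kernel the same function reduces to the sample mean, which is the case that existing private mean estimation algorithms address directly.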
Full Text Available
Differentially Private Multi-Site Treatment Effect Estimation

https://doi.org/10.1109/SaTML59370.2024.00030

Koga, Tatsuki; Chaudhuri, Kamalika; Page, David (April 2024, IEEE)

Full Text Available
Data-Copying in Generative Models: A Formal Framework

Bhattacharjee, Robi; Dasgupta, Sanjoy; Chaudhuri, Kamalika (July 2023, Proceedings of Machine Learning Research)

There has been some recent interest in detecting and addressing memorization of training data by deep neural networks. A formal framework for memorization in generative models, called “data-copying”, was proposed by Meehan et al. (2020). We build upon their work to show that their framework may fail to detect certain kinds of blatant memorization. Motivated by this and the theory of non-parametric methods, we provide an alternative definition of data-copying that applies more locally. We provide a method to detect data-copying, and provably show that it works with high probability when enough data is available. We also provide lower bounds that characterize the sample requirement for reliable detection.
Full Text Available
Data Redaction from Pre-trained GANs

https://doi.org/10.1109/SaTML54575.2023.00048

Kong, Zhifeng; Chaudhuri, Kamalika (February 2023, IEEE Secure and Trustworthy Machine Learning)

Large pre-trained generative models are known to occasionally output undesirable samples, which undermines their trustworthiness. The common way to mitigate this is to re-train them from scratch using different data or different regularization, which uses a lot of computational resources and does not always fully address the problem. In this work, we take a different, more compute-friendly approach and investigate how to post-edit a model after training so that it “redacts”, or refrains from outputting, certain kinds of samples. We show that redaction is a fundamentally different task from data deletion, and data deletion may not always lead to redaction. We then consider Generative Adversarial Networks (GANs), and provide three different algorithms for data redaction that differ in how the samples to be redacted are described. Extensive evaluations on real-world image datasets show that our algorithms outperform data deletion baselines, and are capable of redacting data while retaining high generation quality at a fraction of the cost of full re-training.
Full Text Available
Robust Empirical Risk Minimization with Tolerance

Bhattacharjee, Robi; Hopkins, Max; Kumar, Akash; Yu, Hantao; Chaudhuri, Kamalika (February 2023, Algorithmic Learning Theory)

Developing simple, sample-efficient learning algorithms for robust classification is a pressing issue in today's tech-dominated world, and current theoretical techniques requiring exponential sample complexity and complicated improper learning rules fall far from answering the need. In this work we study the fundamental paradigm of (robust) empirical risk minimization (RERM), a simple process in which the learner outputs any hypothesis minimizing its training error. RERM famously fails to robustly learn VC classes (Montasser et al., 2019a), a bound we show extends even to 'nice' settings such as (bounded) halfspaces. As such, we study a recent relaxation of the robust model called tolerant robust learning (Ashtiani et al., 2022) where the output classifier is compared to the best achievable error over slightly larger perturbation sets. We show that under geometric niceness conditions, a natural tolerant variant of RERM is indeed sufficient for γ-tolerant robust learning VC classes over ℝ^d, and requires only Õ(VC(H) · d · log(D/(γδ)) / ϵ²) samples for robustness regions of (maximum) diameter D.
Full Text Available
