A non-parametric test to detect data-copying in generative models

Meehan, C.; Chaudhuri, K.; Dasgupta, S.

Citation Details

Detecting overfitting in generative models is an important challenge in machine learning. In this work, we formalize a form of overfitting that we call data-copying, where the generative model memorizes and outputs training samples or small variations thereof. We provide a three-sample non-parametric test for detecting data-copying that uses the training set, a separate sample from the target distribution, and a generated sample from the model, and study the performance of our test on several canonical models and datasets.
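The abstract names the three samples but does not spell out the test statistic. As an illustration only, the sketch below shows one way a three-sample check of this kind could be set up: compare how close generated points lie to the training set against how close held-out points from the target distribution lie to it, using a rank-based (Mann-Whitney style) statistic. The function names `nn_dist` and `copying_z_score`, the Euclidean metric, and the toy Gaussian data are all assumptions made for this example and are not taken from the record.

```python
import math
import random


def nn_dist(point, train):
    """Euclidean distance from `point` to its nearest neighbor in `train`."""
    return min(math.dist(point, t) for t in train)


def copying_z_score(train, held_out, generated):
    """Normalized Mann-Whitney U statistic on nearest-training-neighbor distances.

    Strongly negative values mean generated points sit systematically closer
    to the training set than fresh held-out points do, which is the pattern
    one would expect from a model that copies training data.
    """
    d_gen = [nn_dist(g, train) for g in generated]
    d_out = [nn_dist(h, train) for h in held_out]
    m, n = len(d_gen), len(d_out)
    # U counts (generated, held-out) pairs where the generated distance is
    # larger; ties contribute one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in d_gen for b in d_out)
    mean_u = m * n / 2
    std_u = math.sqrt(m * n * (m + n + 1) / 12)
    return (u - mean_u) / std_u


def gaussian_points(rng, count):
    """Toy 2-D standard Gaussian sample (hypothetical stand-in for real data)."""
    return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(count)]


rng = random.Random(0)
train = gaussian_points(rng, 200)
held_out = gaussian_points(rng, 100)

# A "copying" generator: training points plus tiny noise.
copied = [(x + rng.gauss(0, 1e-3), y + rng.gauss(0, 1e-3))
          for x, y in rng.sample(train, 100)]
# A well-behaved generator: fresh draws from the target distribution.
fresh = gaussian_points(rng, 100)

z_copied = copying_z_score(train, held_out, copied)
z_fresh = copying_z_score(train, held_out, fresh)
```

In this toy setup, `z_copied` should be strongly negative while `z_fresh` should stay near zero. A real evaluation would need a distance that is meaningful for the data at hand (for images, typically distances in some learned embedding rather than raw pixels), and the choice of metric here is purely for illustration.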

Award ID(s): 1813160

PAR ID: 10168813

Author(s) / Creator(s): Meehan, C.; Chaudhuri, K.; Dasgupta, S.

Date Published: 2020-07-01

Journal Name: International Conference on Artificial Intelligence and Statistics

Format(s): Medium: X

Sponsoring Org: National Science Foundation

The DOI is not currently available.
