Formalizing and Estimating Distribution Inference Risks
Distribution inference, sometimes called property inference, infers statistical properties about a training set from access to a model trained on that data. Distribution inference attacks can pose serious risks when models are trained on private data, but are difficult to distinguish from the intrinsic purpose of statistical machine learning—namely, to produce models that capture statistical properties about a distribution. Motivated by Yeom et al.'s membership inference framework, we propose a formal definition of distribution inference attacks that is general enough to describe a broad class of attacks distinguishing between possible training distributions. We show how our definition captures previous ratio-based property inference attacks as well as new kinds of attack, including revealing the average node degree or clustering coefficient of a training graph. To understand distribution inference risks, we introduce a metric that quantifies observed leakage by relating it to the leakage that would occur if samples from the training distribution were provided directly to the adversary. We report on a series of experiments across a range of different distributions using both novel black-box attacks and improved versions of the state-of-the-art white-box attacks. Our results show that inexpensive attacks are often as effective as expensive meta-classifier attacks, and that there are surprising asymmetries in the effectiveness of attacks.
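The distinguishing game underlying the formal definition can be illustrated with a toy simulation. Everything here is a simplification for intuition, not the paper's construction: the two distributions differ only in the rate of a sensitive binary attribute, the "model" is collapsed to a single summary statistic (the training-set mean), and the adversary is a simple threshold rule.

```python
# Toy sketch of a distribution inference game (illustrative only).
# A challenger secretly picks one of two training distributions; the
# adversary observes a "model" trained on samples from it and guesses
# which distribution was used. Closer distributions are harder to tell apart.
import random
import statistics

random.seed(0)

def train_model(p):
    """Stand-in 'model': the mean of 200 training records, where each
    record has the sensitive attribute with probability p."""
    data = [1 if random.random() < p else 0 for _ in range(200)]
    return statistics.mean(data)

def adversary(model_output, p0, p1):
    """Threshold attack: guess whichever distribution's rate is closer."""
    return 0 if abs(model_output - p0) < abs(model_output - p1) else 1

def advantage(p0, p1, trials=1000):
    """Fraction of games the adversary wins over repeated trials."""
    correct = 0
    for _ in range(trials):
        b = random.randrange(2)  # challenger's secret choice
        out = train_model(p0 if b == 0 else p1)
        correct += adversary(out, p0, p1) == b
    return correct / trials

far = advantage(0.2, 0.8)    # very different training distributions
near = advantage(0.45, 0.55) # similar training distributions
```

As expected, the adversary wins almost always when the candidate distributions are far apart and less reliably when they are close, which is the intuition the paper's leakage metric makes precise.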
Award ID(s): 2343611
NSF-PAR ID: 10472135
Publisher / Repository: arXiv:2109.06024
Date Published:
Subject(s) / Keyword(s): Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Format(s): Medium: X
Location: Accepted at PETS 2022; arXiv:2109.06024
Sponsoring Org: National Science Foundation
More Like this


Property inference attacks reveal statistical properties about a training set but are difficult to distinguish from the primary purpose of statistical machine learning, which is to produce models that capture statistical properties about a distribution. Motivated by Yeom et al.'s membership inference framework, we propose a formal and generic definition of property inference attacks. The proposed notion describes attacks that can distinguish between possible training distributions, extending beyond previous property inference attacks that infer the ratio of a particular type of data in the training data set. In this paper, we show how our definition captures previous property inference attacks as well as a new attack that reveals the average degree of nodes of a training graph, and report on experiments giving insight into the potential risks of property inference attacks.


A distribution inference attack aims to infer statistical properties of data used to train machine learning models. These attacks are sometimes surprisingly potent, but the factors that impact distribution inference risk are not well understood, and demonstrated attacks often rely on strong and unrealistic assumptions, such as full knowledge of training environments, even in supposedly black-box threat scenarios. To improve understanding of distribution inference risks, we develop a new black-box attack that even outperforms the best known white-box attack in most settings. Using this new attack, we evaluate distribution inference risk while relaxing a variety of assumptions about the adversary's knowledge under black-box access, like known model architectures and label-only access. Finally, we evaluate the effectiveness of previously proposed defenses and introduce new defenses. We find that although noise-based defenses appear to be ineffective, a simple resampling defense can be highly effective.
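The abstract names a resampling defense without describing its mechanism. One plausible form, sketched here purely as an assumption rather than the paper's exact defense, resamples the training set so that a sensitive binary attribute appears at a fixed, public ratio, hiding the true ratio from any distribution inference attack:

```python
# Hypothetical resampling-style defense (our illustration, not the paper's
# exact mechanism): draw a training set with replacement so the expected
# fraction of records with the sensitive property equals a fixed public
# target, regardless of the original dataset's skew.
import random

def resample_to_ratio(records, has_property, target_ratio, size, rng=None):
    """Return `size` records sampled with replacement, with the sensitive
    property appearing at rate `target_ratio` in expectation."""
    rng = rng or random.Random(0)
    pos = [r for r in records if has_property(r)]
    neg = [r for r in records if not has_property(r)]
    out = []
    for _ in range(size):
        if rng.random() < target_ratio and pos:
            out.append(rng.choice(pos))
        else:
            out.append(rng.choice(neg))
    return out

# Example: a dataset skewed 90/10 is resampled toward a 50/50 public ratio.
data = [1] * 90 + [0] * 10
balanced = resample_to_ratio(data, lambda r: r == 1, 0.5, 1000)
ratio = sum(balanced) / len(balanced)
```

A model trained on the resampled set would then reflect the public target ratio rather than the private one, which matches the intuition for why such a defense could blunt ratio-style distribution inference.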
