- Award ID(s):
- 1717950
- PAR ID:
- 10302950
- Date Published:
- Journal Name:
- Proceedings on Privacy Enhancing Technologies
- Volume:
- 2021
- Issue:
- 2
- ISSN:
- 2299-0984
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract—A distribution inference attack aims to infer statistical properties of data used to train machine learning models. These attacks are sometimes surprisingly potent, but the factors that impact distribution inference risk are not well understood and demonstrated attacks often rely on strong and unrealistic assumptions such as full knowledge of training environments even in supposedly black-box threat scenarios. To improve understanding of distribution inference risks, we develop a new black-box attack that even outperforms the best known white-box attack in most settings. Using this new attack, we evaluate distribution inference risk while relaxing a variety of assumptions about the adversary’s knowledge under black-box access, like known model architectures and label-only access. Finally, we evaluate the effectiveness of previously proposed defenses and introduce new defenses. We find that although noise-based defenses appear to be ineffective, a simple re-sampling defense can be highly effective. Imore » « less
-
—A distribution inference attack aims to infer statistical properties of data used to train machine learning models. These attacks are sometimes surprisingly potent, but the factors that impact distribution inference risk are not well understood and demonstrated attacks often rely on strong and unrealistic assumptions such as full knowledge of training environments even in supposedly black-box threat scenarios. To improve understanding of distribution inference risks, we develop a new black-box attack that even outperforms the best known white-box attack in most settings. Using this new attack, we evaluate distribution inference risk while relaxing a variety of assumptions about the adversary’s knowledge under black-box access, like known model architectures and label-only access. Finally, we evaluate the effectiveness of previously proposed defenses and introduce new defenses. We find that although noise-based defenses appear to be ineffective, a simple re-sampling defense can be highly effective.more » « less
-
Recent advances in machine learning enable wider applications of prediction models in cyber-physical systems. Smart grids are increasingly using distributed sensor settings for distributed sensor fusion and information processing. Load forecasting systems use these sensors to predict future loads to incorporate into dynamic pricing of power and grid maintenance. However, these inference predictors are highly complex and thus vulnerable to adversarial attacks. Moreover, the adversarial attacks are synthetic norm-bounded modifications to a limited number of sensors that can greatly affect the accuracy of the overall predictor. It can be much cheaper and effective to incorporate elements of security and resilience at the earliest stages of design. In this paper, we demonstrate how to analyze the security and resilience of learning-based prediction models in power distribution networks by utilizing a domain-specific deep-learning and testing framework. This framework is developed using DeepForge and enables rapid design and analysis of attack scenarios against distributed smart meters in a power distribution network. It runs the attack simulations in the cloud backend. In addition to the predictor model, we have integrated an anomaly detector to detect adversarial attacks targeting the predictor. We formulate the stealthy adversarial attacks as an optimization problem to maximize prediction loss while minimizing the required perturbations. Under the worst-case setting, where the attacker has full knowledge of both the predictor and the detector, an iterative attack method has been developed to solve for the adversarial perturbation. We demonstrate the framework capabilities using a GridLAB-D based power distribution network model and show how stealthy adversarial attacks can affect smart grid prediction systems even with a partial control of network.more » « less
-
We consider a hierarchical inference system with multiple clients connected to a server via a shared communication resource. When necessary, clients with low-accuracy machine learning models can offload classification tasks to a server for processing on a high-accuracy model. We propose a distributed online offloading algorithm which maximizes the accuracy subject to a shared resource utilization constraint thus indirectly realizing accuracy-delay tradeoffs possible given an underlying network scheduler. The proposed algorithm, named Lyapunov-EXP4, introduces a loss structure based on Lyapunov-drift minimization techniques to the bandits with expert advice framework. We prove that the algorithm converges to a near-optimal threshold policy on the confidence of the clients’ local inference without prior knowledge of the system’s statistics and efficiently solves a constrained bandit problem with sublinear regret. We further consider settings where clients may employ multiple thresholds, allowing more aggressive optimization of overall accuracy at a possible loss in fairness. Extensive simulation results on real and synthetic data demonstrate convergence of Lyapunov-EXP4, and show themore » « less
-
Graph contrastive learning (GCL) has emerged as a successful method for self-supervised graph learning. It involves generating augmented views of a graph by augmenting its edges and aims to learn node embeddings that are invariant to graph augmentation. Despite its effectiveness, the potential privacy risks associated with GCL models have not been thoroughly explored. In this paper, we delve into the privacy vulnerability of GCL models through the lens of link membership inference attacks (LMIA). Specifically, we focus on the federated setting where the adversary has white-box access to the node embeddings of all the augmented views generated by the target GCL model. Designing such white-box LMIAs against GCL models presents a significant and unique challenge due to potential variations in link memberships among node pairs in the target graph and its augmented views. This variability renders members indistinguishable from non-members when relying solely on the similarity of their node embeddings in the augmented views. To address this challenge, our in-depth analysis reveals that the key distinguishing factor lies in the similarity of node embeddings within augmented views where the node pairs share identical link memberships as those in the training graph. However, this poses a second challenge, as information about whether a node pair has identical link membership in both the training graph and augmented views is only available during the attack training phase. This demands the attack classifier to handle the additional “identical-membership information which is available only for training and not for testing. To overcome this challenge, we propose GCL-LEAK, the first link membership inference attack against GCL models. The key component of GCL-LEAK is a new attack classifier model designed under the “Learning Using Privileged Information (LUPI)” paradigm, where the privileged information of “same-membership” is encoded as part of the attack classifier's structure. Our extensive set of experiments on four representative GCL models showcases the effectiveness of GCL-LEAK. Additionally, we develop two defense mechanisms that introduce perturbation to the node embeddings. Our empirical evaluation demonstrates that both defense mechanisms significantly reduce attack accuracy while preserving the accuracy of GCL models.