
Title: How To Backdoor Federated Learning
Federated learning enables thousands of participants to construct a deep learning model without sharing their private training data with each other. For example, multiple smartphones can jointly train a next-word predictor for keyboards without revealing what individual users type. We demonstrate that any participant in federated learning can introduce hidden backdoor functionality into the joint global model, e.g., to ensure that an image classifier assigns an attacker-chosen label to images with certain features, or that a word predictor completes certain sentences with an attacker-chosen word. We design and evaluate a new model-poisoning methodology based on model replacement. An attacker selected in a single round of federated learning can cause the global model to immediately reach 100% accuracy on the backdoor task. We evaluate the attack under different assumptions for the standard federated-learning tasks and show that it greatly outperforms data poisoning. Our generic constrain-and-scale technique also evades anomaly detection-based defenses by incorporating the evasion into the attacker's loss function during training.
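To make the model-replacement idea concrete, here is a minimal, illustrative sketch (not the authors' code) of how a single malicious participant can scale its submitted update so that a plain FedAvg-style aggregation lands exactly on the backdoored weights. The toy FedAvg step, the scaling factor gamma = n/eta, and all variable names are assumptions for this example; the constrain-and-scale variant mentioned in the abstract additionally folds an anomaly-evasion term into the attacker's training loss, which is not shown here.

```python
# Minimal sketch of model replacement in federated averaging (illustrative only).
import numpy as np

def fedavg(global_w, client_ws, eta=1.0):
    """Toy FedAvg step: G_{t+1} = G_t + (eta / n) * sum_i (L_i - G_t)."""
    n = len(client_ws)
    avg_update = sum(w - global_w for w in client_ws) / n
    return global_w + eta * avg_update

def model_replacement_update(global_w, backdoored_w, n_clients, eta=1.0):
    """Attacker's submission: scale (X - G_t) by gamma ~ n / eta so that the
    averaged global model lands on the backdoored weights X."""
    gamma = n_clients / eta
    return global_w + gamma * (backdoored_w - global_w)

rng = np.random.default_rng(0)
G = rng.normal(size=10)                                        # current global model
benign = [G + 0.01 * rng.normal(size=10) for _ in range(9)]    # small honest updates
X = rng.normal(size=10)                                        # attacker's backdoored model
malicious = model_replacement_update(G, X, n_clients=10)

new_G = fedavg(G, benign + [malicious])
print(np.allclose(new_G, X, atol=0.05))   # True: global model is (almost) replaced by X
```

With gamma = n/eta the attacker's contribution cancels the averaging, so the global model jumps to X in a single round (up to the small benign updates), which is consistent with the abstract's claim that a single selected attacker can immediately reach high backdoor accuracy.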
Authors:
Bagdasaryan, Eugene; Veit, Andreas; Hua, Yiqing; Estrin, Deborah; Shmatikov, Vitaly
Award ID(s):
1700832
Publication Date:
NSF-PAR ID:
10249787
Journal Name:
ArXiv.org
ISSN:
2331-8422
Sponsoring Org:
National Science Foundation
More Like this
  1. A backdoor data poisoning attack is an adversarial attack wherein the attacker injects several watermarked, mislabeled training examples into a training set. The watermark does not impact the test-time performance of the model on typical data; however, the model reliably errs on watermarked examples. To gain a better foundational understanding of backdoor data poisoning attacks, we present a formal theoretical framework within which one can discuss backdoor data poisoning attacks for classification problems. We then use this to analyze important statistical and computational issues surrounding these attacks. On the statistical front, we identify a parameter we call the memorization capacity that captures the intrinsic vulnerability of a learning problem to a backdoor attack. This allows us to argue about the robustness of several natural learning problems to backdoor attacks. Our results favoring the attacker involve presenting explicit constructions of backdoor attacks, and our robustness results show that some natural problem settings cannot yield successful backdoor attacks. From a computational standpoint, we show that under certain assumptions, adversarial training can detect the presence of backdoors in a training set. We then show that under similar assumptions, two closely related problems we call backdoor filtering and robust generalization are nearly equivalent. This implies that it is both asymptotically necessary and sufficient to design algorithms that can identify watermarked examples in the training set in order to obtain a learning algorithm that both generalizes well to unseen data and is robust to backdoors.
  2. Federated learning (FL) is an emerging machine learning paradigm. With FL, distributed data owners aggregate their model updates to train a shared deep neural network collaboratively, while keeping the training data locally. However, FL has little control over the local data and the training process. Therefore, it is susceptible to poisoning attacks, in which malicious or compromised clients use malicious training data or local updates as the attack vector to poison the trained global model. Moreover, the performance of existing detection and defense mechanisms drops significantly in a scaled-up FL system with non-iid data distributions. In this paper, we propose a defense scheme named CONTRA to defend against poisoning attacks, e.g., label-flipping and backdoor attacks, in FL systems. CONTRA implements a cosine-similarity-based measure to determine the credibility of local model parameters in each round and a reputation scheme to dynamically promote or penalize individual clients based on their per-round and historical contributions to the global model. With extensive experiments, we show that CONTRA significantly reduces the attack success rate while achieving high accuracy with the global model. Compared with a state-of-the-art (SOTA) defense, CONTRA reduces the attack success rate by 70% and reduces the global model performance degradation by 50%.
  3. Recent years have seen the increasing attention and popularity of federated learning (FL), a distributed learning framework for privacy and data security. However, by its fundamental design, federated learning is inherently vulnerable to model poisoning attacks: a malicious client may submit the local updates to influence the weights of the global model. Therefore, detecting malicious clients against model poisoning attacks in federated learning is useful in safety-critical tasks. However, existing methods either fail to analyze potential malicious data or are computationally restrictive. To overcome these weaknesses, we propose a robust federated learning method where the central server learns a supervised anomaly detector using adversarial data generated from a variety of state-of-the-art poisoning attacks. The key idea of this powerful anomaly detector lies in a comprehensive understanding of the benign update through distinguishing it from the diverse malicious ones. The anomaly detector would then be leveraged in the process of federated learning to automate the removal of malicious updates (even from unforeseen attacks). Through extensive experiments, we demonstrate its effectiveness against backdoor attacks, where the attackers inject adversarial triggers such that the global model will make incorrect predictions on the poisoned samples. We have verified that our method can achieve 99.0% detection AUC scores while enjoying longevity as the model converges. Our method has also shown significant advantages over existing robust federated learning methods in all settings. Furthermore, our method can be easily generalized to incorporate newly developed poisoning attacks, thus accommodating ever-changing adversarial learning environments.
  4. Federated learning allows multiple users to collaboratively train a shared classification model while preserving data privacy. This approach, where model updates are aggregated by a central server, was shown to be vulnerable to poisoning backdoor attacks: a malicious user can alter the shared model to arbitrarily classify specific inputs from a given class. In this paper, we analyze the effects of backdoor attacks on federated meta-learning, where users train a model that can be adapted to different sets of output classes using only a few examples. While the ability to adapt could, in principle, make federated learning frameworks more robust to backdoor attacks (when new training examples are benign), we find that even 1-shot attacks can be very successful and persist after additional training. To address these vulnerabilities, we propose a defense mechanism inspired by matching networks, where the class of an input is predicted from the similarity of its features with a support set of labeled examples. By removing the decision logic from the model shared with the federation, success and persistence of backdoor attacks are greatly reduced.
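The defense in item 4 above predicts a label from the similarity between an input's features and a local support set of labeled examples, so the decision logic never enters the shared model. Below is a minimal, hypothetical sketch of that idea (not the paper's implementation): a real matching network weights the support labels by a softmax over similarities and trains the embedding end to end, whereas this nearest-support rule and all names are simplifying assumptions.

```python
# Minimal sketch of matching-networks-style prediction from a local support set.
import numpy as np

def predict_from_support(embed, x, support_x, support_y):
    """embed: feature extractor (the only part shared with the federation).
    support_x / support_y: the user's local, trusted labeled support examples.
    Returns the label of the support example whose embedding is most similar to x."""
    q = embed(x)
    S = np.stack([embed(s) for s in support_x])
    sims = S @ q / (np.linalg.norm(S, axis=1) * np.linalg.norm(q) + 1e-12)
    return support_y[int(np.argmax(sims))]

# Toy usage with an identity "embedding" and three labeled support points.
support_x = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
support_y = ["cat", "dog", "bird"]
print(predict_from_support(lambda v: v, np.array([0.9, 0.1]), support_x, support_y))  # -> "cat"
```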
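Items 2 and 3 above both screen client updates on the server side. As a rough illustration of the cosine-similarity credibility idea behind CONTRA (item 2), the sketch below down-weights clients whose updates are unusually well aligned with their most similar peers, as colluding backdoor clients tend to be; the top-k rule, the weighting formula, and the demo data are assumptions for illustration, not the authors' algorithm.

```python
# Hypothetical sketch of a cosine-similarity credibility score for client updates.
import numpy as np

def credibility_weights(updates, k=3):
    """updates: list of flattened client updates (np.ndarray). Returns per-client
    aggregation weights based on average cosine similarity to the k most similar
    peers (higher mutual alignment -> lower credibility)."""
    U = np.stack(updates)
    unit = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    sims = unit @ unit.T                       # pairwise cosine similarities
    np.fill_diagonal(sims, -np.inf)            # ignore self-similarity
    topk = np.sort(sims, axis=1)[:, -k:]       # k most similar peers per client
    align = topk.mean(axis=1)                  # average alignment score
    weights = np.clip(1.0 - align, 0.0, 1.0)   # more aligned -> less credible
    return weights / (weights.sum() + 1e-12)   # normalize for weighted averaging

# Toy demo: 8 independent honest updates plus 2 colluding backdoor updates that
# push in (almost) the same direction; the colluders get much smaller weights.
rng = np.random.default_rng(1)
honest = [rng.normal(size=50) for _ in range(8)]
backdoor_dir = rng.normal(size=50)
colluding = [backdoor_dir + 0.05 * rng.normal(size=50) for _ in range(2)]
print(credibility_weights(honest + colluding, k=1).round(3))
```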