Abstract Machine unlearning is a cutting‐edge technology that embodies the privacy legal principle of the right to be forgotten within the realm of machine learning (ML). It aims to remove specific data or knowledge from trained models without retraining from scratch and has gained significant attention in the field of artificial intelligence in recent years. However, the development of machine unlearning research is associated with inherent vulnerabilities and threats, posing significant challenges for researchers and practitioners. In this article, we provide the first comprehensive survey of security and privacy issues associated with machine unlearning by providing a systematic classification across different levels and criteria. Specifically, we begin by investigating unlearning‐based security attacks, where adversaries exploit vulnerabilities in the unlearning process to compromise the security of machine learning (ML) models. We then conduct a thorough examination of privacy risks associated with the adoption of machine unlearning. Additionally, we explore existing countermeasures and mitigation strategies designed to protect models from malicious unlearning‐based attacks targeting both security and privacy. Further, we provide a detailed comparison between machine unlearning‐based security and privacy attacks and traditional malicious attacks. Finally, we discuss promising future research directions for security and privacy issues posed by machine unlearning, offering insights into potential solutions and advancements in this evolving field.
more »
« less
Deletion inference, reconstruction, and compliance in machine (un)learning
Privacy attacks on machine learning models aim to identify the data that is used to train such models. Such attacks, traditionally, are studied on static models that are trained once and are accessible by the adversary. Motivated to meet new legal requirements, many machine learning methods are recently extended to support machine unlearning, i.e., updating models as if certain examples are removed from their training sets, and meet new legal requirements. However, privacy attacks could potentially become more devastating in this new setting, since an attacker could now access both the original model before deletion and the new model after the deletion. In fact, the very act of deletion might make the deleted record more vulnerable to privacy attacks. Inspired by cryptographic definitions and the differential privacy framework, we formally study privacy implications of machine unlearning. We formalize (various forms of) deletion inference and deletion reconstruction attacks, in which the adversary aims to either identify which record is deleted or to reconstruct (perhaps part of) the deleted records. We then present successful deletion inference and reconstruction attacks for a variety of machine learning models and tasks such as classification, regression, and language models. Finally, we show that our attacks would provably be precluded if the schemes satisfy (variants of) deletion compliance (Garg, Goldwasser, and Vasudevan, Eurocrypt’20).
more »
« less
- Award ID(s):
- 1936826
- PAR ID:
- 10382327
- Date Published:
- Journal Name:
- Proceedings on Privacy Enhancing Technologies
- Volume:
- 2022
- Issue:
- 3
- ISSN:
- 2299-0984
- Page Range / eLocation ID:
- 415 to 436
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract—Deploying machine learning models in production may allow adversaries to infer sensitive information about training data. There is a vast literature analyzing different types of inference risks, ranging from membership inference to reconstruction attacks. Inspired by the success of games (i.e. probabilistic experiments) to study security properties in cryptography, some authors describe privacy inference risks in machine learning using a similar game-based style. However, adversary capabilities and goals are often stated in subtly different ways from one presentation to the other, which makes it hard to relate and compose results. In this paper, we present a game-based framework to systematize the body of knowledge on privacy inference risks in machine learning. We use this framework to (1) provide a unifying structure for definitions of inference risks, (2) formally establish known relations among definitions, and (3) to uncover hitherto unknown relations that would have been difficult to spot otherwise. Index Terms—privacy, machine learning, differential privacy, membership inference, attribute inference, property inferencemore » « less
-
Given the availability of abundant data, deep learning models have been advanced and become ubiquitous in the past decade. In practice, due to many different reasons (e.g., privacy, usability, and fidelity), individuals also want the trained deep models to forget some specific data. Motivated by this, machine unlearning (also known as selective data forgetting) has been intensively studied, which aims at removing the influence that any particular training sample had on the trained model during the unlearning process. However, people usually employ machine unlearning methods as trusted basic tools and rarely have any doubt about their reliability. In fact, the increasingly critical role of machine unlearning makes deep learning models susceptible to the risk of being maliciously attacked. To well understand the performance of deep learning models in malicious environments, we believe that it is critical to study the robustness of deep learning models to malicious unlearning attacks, which happen during the unlearning process. To bridge this gap, in this paper, we first demonstrate that malicious unlearning attacks pose immense threats to the security of deep learning systems. Specifically, we present a broad class of malicious unlearning attacks wherein maliciously crafted unlearning requests trigger deep learning models to misbehave on target samples in a highly controllable and predictable manner. In addition, to improve the robustness of deep learning models, we also present a general defense mechanism, which aims to identify and unlearn effective malicious unlearning requests based on their gradient influence on the unlearned models. Further, theoretical analyses are conducted to analyze the proposed methods. Extensive experiments on real-world datasets validate the vulnerabilities of deep learning models to malicious unlearning attacks and the effectiveness of the introduced defense mechanism.more » « less
-
null (Ed.)The prevalence of deep learning has drawn attention to the privacy protection of sensitive data. Various privacy threats have been presented, where an adversary can steal model owners' private data. Meanwhile, countermeasures have also been introduced to achieve privacy-preserving deep learning. However, most studies only focused on data privacy during training, and ignored privacy during inference. In this paper, we devise a new set of attacks to compromise the inference data privacy in collaborative deep learning systems. Specifically, when a deep neural network and the corresponding inference task are split and distributed to different participants, one malicious participant can accurately recover an arbitrary input fed into this system, even if he has no access to other participants' data or computations, or to prediction APIs to query this system. We evaluate our attacks under different settings, models and datasets, to show their effectiveness and generalization. We also study the characteristics of deep learning models that make them susceptible to such inference privacy threats. This provides insights and guidelines to develop more privacy-preserving collaborative systems and algorithms.more » « less
-
Over-sharing poorly-worded thoughts and personal information is prevalent on online social platforms. In many of these cases, users regret posting such content. To retrospectively rectify these errors in users' sharing decisions, most platforms offer (deletion) mechanisms to withdraw the content, and social media users often utilize them. Ironically and perhaps unfortunately, these deletions make users more susceptible to privacy violations by malicious actors who specifically hunt post deletions at large scale. The reason for such hunting is simple: deleting a post acts as a powerful signal that the post might be damaging to its owner. Today, multiple archival services are already scanning social media for these deleted posts. Moreover, as we demonstrate in this work, powerful machine learning models can detect damaging deletions at scale. Towards restraining such a global adversary against users' right to be forgotten, we introduce Deceptive Deletion, a decoy mechanism that minimizes the adversarial advantage. Our mechanism injects decoy deletions, hence creating a two-player minmax game between an adversary that seeks to classify damaging content among the deleted posts and a challenger that employs decoy deletions to masquerade real damaging deletions. We formalize the Deceptive Game between the two players, determine conditions under which either the adversary or the challenger provably wins the game, and discuss the scenarios in-between these two extremes. We apply the Deceptive Deletion mechanism to a real-world task on Twitter: hiding damaging tweet deletions. We show that a powerful global adversary can be beaten by a powerful challenger, raising the bar significantly and giving a glimmer of hope in the ability to be really forgotten on social platforms.more » « less
An official website of the United States government

