Title: BagFlip: A Certified Defense Against Data Poisoning
Abstract: Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model. In a trigger-less attack, the attacker can modify the training set but not the test inputs, while in a backdoor attack the attacker can also modify test inputs. Existing model-agnostic defense approaches either cannot handle backdoor attacks or do not provide effective certificates (i.e., a proof of a defense). We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks. We evaluate BagFlip on image classification and malware detection datasets. BagFlip is equal to or more effective than the state-of-the-art approaches for trigger-less attacks and more effective than the state-of-the-art approaches for backdoor attacks.
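The abstract does not spell out the mechanism, but the name suggests combining bagging with randomized feature flipping. Below is a minimal, hypothetical Python sketch of that general recipe only: train an ensemble on random sub-bags of a training set with binary features, inject flipping noise, and predict by majority vote. The `train_classifier` helper, the scikit-learn-style `predict` interface, and all hyperparameters are illustrative assumptions; the paper's actual algorithm and certificate computation are not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)

    def flip_features(X, p):
        """Flip each binary (0/1) feature independently with probability p."""
        mask = rng.random(X.shape) < p
        return np.where(mask, 1 - X, X)

    def train_bagged_ensemble(X, y, train_classifier, n_bags=50, bag_size=200, flip_p=0.1):
        """Train one classifier per randomly sampled, noise-flipped bag."""
        models = []
        for _ in range(n_bags):
            idx = rng.choice(len(X), size=bag_size, replace=True)   # sample a bag with replacement
            X_bag = flip_features(X[idx], flip_p)                    # inject flipping noise
            models.append(train_classifier(X_bag, y[idx]))
        return models

    def predict_majority(models, x, flip_p=0.1, n_classes=2):
        """Noise the test input per model and return the majority-vote label plus vote counts."""
        votes = np.zeros(n_classes, dtype=int)
        for m in models:
            x_noised = flip_features(x[None, :], flip_p)
            votes[int(m.predict(x_noised)[0])] += 1
        return int(votes.argmax()), votes   # the vote margin is what a certificate would reason about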
Award ID(s):
1918211
NSF-PAR ID:
10467904
Author(s) / Creator(s):
Publisher / Repository:
OpenReview.net
Date Published:
Format(s):
Medium: X
Location:
Neural Information Processing
Sponsoring Org:
National Science Foundation
More Like this
  1. With the success of deep learning algorithms in various domains, studying adversarial attacks that compromise deep models in real-world applications has become an important research topic. Backdoor attacks are a form of adversarial attack on deep networks in which the attacker provides poisoned data for the victim to train the model on, and then activates the attack by presenting a specific small trigger pattern at test time. Most state-of-the-art backdoor attacks either provide mislabeled poisoned data that can be identified by visual inspection, reveal the trigger in the poisoned data, or use noise to hide the trigger. We propose a novel form of backdoor attack in which the poisoned data look natural and carry correct labels and, more importantly, the attacker hides the trigger in the poisoned data and keeps it secret until test time. We perform an extensive study on various image classification settings and show that our attack can fool the model by pasting the trigger at random locations on unseen images, even though the model performs well on clean data. We also show that our proposed attack cannot be easily defended against using a state-of-the-art defense algorithm for backdoor attacks.
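A minimal NumPy sketch of the test-time step described in the abstract above (pasting a trigger at a random image location). The checkerboard patch, its size, and the image shape are placeholder assumptions, not the attack's actual secret trigger.

    import numpy as np

    rng = np.random.default_rng(0)

    def paste_trigger(image, trigger):
        """Return a copy of image (H, W, C) with trigger (h, w, C) pasted at a random location."""
        h, w = trigger.shape[:2]
        H, W = image.shape[:2]
        top = rng.integers(0, H - h + 1)
        left = rng.integers(0, W - w + 1)
        patched = image.copy()
        patched[top:top + h, left:left + w] = trigger
        return patched

    # Toy usage: a random 224x224 RGB "image" and a 30x30 checkerboard stand-in for the trigger.
    image = rng.random((224, 224, 3))
    checker = (np.indices((30, 30)).sum(axis=0) % 2).astype(float)
    trigger = np.repeat(checker[:, :, None], 3, axis=2)
    patched = paste_trigger(image, trigger)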
  2. The globalized semiconductor supply chain significantly increases the risk of exposing System-on-Chip (SoC) designs to malicious implants, popularly known as hardware Trojans. Traditional simulation-based validation is unsuitable for detecting carefully crafted hardware Trojans with extremely rare trigger conditions. While machine learning (ML) based Trojan detection approaches are promising due to their scalability and detection accuracy, ML-based methods are themselves vulnerable to Trojan attacks. In this paper, we propose a robust backdoor attack on ML-based Trojan detection algorithms to demonstrate this serious vulnerability. The proposed framework can design an AI Trojan and implant it inside the ML model so that it is triggered by specific inputs. Experimental results demonstrate that the proposed AI Trojans can bypass state-of-the-art defense algorithms. Moreover, our approach provides a fast and cost-effective solution that achieves a 100% attack success rate, significantly outperforming state-of-the-art approaches based on adversarial attacks.
  3. We investigate a new method for injecting backdoors into machine learning models, based on compromising the loss-value computation in the model-training code. We use it to demonstrate new classes of backdoors strictly more powerful than those in the prior literature: single-pixel and physical backdoors in ImageNet models, backdoors that switch the model to a covert, privacy-violating task, and backdoors that do not require inference-time input modifications. Our attack is blind: the attacker cannot modify the training data, observe the execution of their code, or access the resulting model. The attack code creates poisoned training inputs "on the fly" as the model is training, and uses multi-objective optimization to achieve high accuracy on both the main and backdoor tasks. We show how a blind attack can evade any known defense, and we propose new ones.
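A hedged PyTorch sketch of the general idea described above (not the authors' code): a compromised loss routine synthesizes triggered inputs from the current clean batch and blends the main-task and backdoor-task losses. The corner-patch trigger, the target label, and the fixed blending weight `lam` are assumptions for illustration; the paper balances the two objectives with multi-objective optimization rather than a fixed weight.

    import torch

    def add_trigger(x):
        """Stamp a small bright patch in the bottom-right corner (placeholder trigger)."""
        x = x.clone()
        x[..., -4:, -4:] = 1.0
        return x

    def blind_loss(model, x, y, backdoor_label=0, lam=0.5):
        """Blend the main-task loss on the clean batch with a backdoor loss on triggered copies."""
        ce = torch.nn.functional.cross_entropy
        main_loss = ce(model(x), y)                    # keep normal accuracy on clean data
        x_bd = add_trigger(x)                          # poisoned inputs synthesized "on the fly"
        y_bd = torch.full_like(y, backdoor_label)      # attacker-chosen target label
        backdoor_loss = ce(model(x_bd), y_bd)
        return main_loss + lam * backdoor_loss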
  4. Deep neural networks (DNNs) provide excellent performance across a wide range of classification tasks, but their training requires high computational resources and is often outsourced to third parties. Recent work has shown that outsourced training introduces the risk that a malicious trainer will return a backdoored DNN that behaves normally on most inputs but causes targeted misclassifications or degrades the accuracy of the network when a trigger known only to the attacker is present. In this paper, we provide the first effective defenses against backdoor attacks on DNNs. We implement three backdoor attacks from prior work and use them to investigate two promising defenses, pruning and fine-tuning. We show that neither, by itself, is sufficient to defend against sophisticated attackers. We then evaluate fine-pruning, a combination of pruning and fine-tuning, and show that it successfully weakens or even eliminates the backdoors, i.e., in some cases reducing the attack success rate to 0% with only a 0.4% drop in accuracy for clean (non-triggering) inputs. Our work provides the first step toward defenses against backdoor attacks in deep neural networks. 
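To make the pruning half of fine-pruning concrete, here is a rough PyTorch sketch under simplifying assumptions: rank the channels of one convolutional layer by their mean activation on clean data, zero out the least-active filters (where a dormant backdoor is presumed to hide), and then fine-tune the pruned model on clean data. The layer choice, pruning fraction, and data-loader interface are placeholders, not the paper's exact procedure.

    import torch

    def prune_dormant_channels(model, conv_layer, clean_loader, frac=0.2):
        """Zero out the conv filters that are least active on clean data (pruning step only)."""
        model.eval()
        activations = []
        handle = conv_layer.register_forward_hook(
            lambda mod, inp, out: activations.append(out.detach().mean(dim=(0, 2, 3)))
        )
        with torch.no_grad():
            for x, _ in clean_loader:
                model(x)
        handle.remove()
        mean_act = torch.stack(activations).mean(dim=0)        # per-channel mean activation
        n_prune = int(frac * mean_act.numel())
        prune_idx = torch.argsort(mean_act)[:n_prune]          # least-active channels
        with torch.no_grad():
            conv_layer.weight[prune_idx] = 0.0                 # zero their filters
            if conv_layer.bias is not None:
                conv_layer.bias[prune_idx] = 0.0
        return prune_idx
    # After pruning, fine-tune `model` on clean data with a standard training loop.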
  5. A backdoor data poisoning attack is an adversarial attack wherein the attacker injects several watermarked, mislabeled training examples into a training set. The watermark does not impact the test-time performance of the model on typical data; however, the model reliably errs on watermarked examples. To gain a better foundational understanding of backdoor data poisoning attacks, we present a formal theoretical framework within which one can discuss backdoor data poisoning attacks for classification problems. We then use this to analyze important statistical and computational issues surrounding these attacks. On the statistical front, we identify a parameter we call the memorization capacity that captures the intrinsic vulnerability of a learning problem to a backdoor attack. This allows us to argue about the robustness of several natural learning problems to backdoor attacks. Our results favoring the attacker involve presenting explicit constructions of backdoor attacks, and our robustness results show that some natural problem settings cannot yield successful backdoor attacks. From a computational standpoint, we show that under certain assumptions, adversarial training can detect the presence of backdoors in a training set. We then show that under similar assumptions, two closely related problems we call backdoor filtering and robust generalization are nearly equivalent. This implies that it is both asymptotically necessary and sufficient to design algorithms that can identify watermarked examples in the training set in order to obtain a learning algorithm that both generalizes well to unseen data and is robust to backdoors. 
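The attack model in the first two sentences above can be illustrated with a tiny NumPy sketch: copy a few clean training points, add a fixed watermark pattern, relabel them with an attacker-chosen target class, and append them to the training set. The additive watermark, the counts, and the array-based data format are assumptions made only for illustration; the paper's explicit constructions are different and problem-specific.

    import numpy as np

    rng = np.random.default_rng(0)

    def poison_training_set(X, y, watermark, target_label, n_poison=50):
        """Append watermarked, mislabeled copies of a few clean points to the training set."""
        idx = rng.choice(len(X), size=n_poison, replace=False)
        X_wm = X[idx] + watermark                     # watermarked copies of clean examples
        y_wm = np.full(n_poison, target_label)        # mislabeled with the attacker's target class
        return np.concatenate([X, X_wm]), np.concatenate([y, y_wm])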