Title: Evaluating Robustness of Sequence-based Deepfake Detector Models by Adversarial Perturbation
Deepfake videos are getting better in quality and can be used for dangerous disinformation campaigns. The pressing need to detect these videos has motivated researchers to develop different types of detection models. Among them, the models that utilize temporal information (i.e., sequence-based models) are more effective at detection than the ones that only detect intra-frame discrepancies. Recent work has shown that the latter detection models can be fooled with adversarial examples, leveraging the rich literature on crafting adversarial (still) images. It is less clear, however, how well these attacks will work on sequence-based models that operate on information taken over multiple frames. In this paper, we explore the effectiveness of the Fast Gradient Sign Method (FGSM) and the Carlini-Wagner L2-norm attack to fool sequence-based deepfake detector models in both the white-box and black-box settings. The experimental results show that the attacks are effective with a maximum success rate of 99.72% and 67.14% in the white-box and black-box attack scenarios, respectively. This highlights the importance of developing more robust sequence-based deepfake detectors and opens up directions for future research.
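For intuition, FGSM perturbs an input along the sign of the loss gradient, x_adv = x + ε·sign(∇_x J(θ, x, y)); applied to a sequence-based detector, the same step is taken over an entire clip rather than a single frame. The sketch below illustrates this under assumed names and shapes (a PyTorch `detector` mapping a clip tensor of shape (1, T, C, H, W) to real/fake logits); it is not the authors' implementation and omits the Carlini-Wagner L2 attack.

```python
import torch
import torch.nn.functional as F

def fgsm_on_clip(detector, clip, label, epsilon):
    """One FGSM step on a whole video clip (illustrative sketch).

    detector: assumed model mapping (1, T, C, H, W) clips to logits
    clip:     pixel tensor with values in [0, 1]
    label:    ground-truth class tensor, e.g. torch.tensor([1]) for "fake"
    epsilon:  L-infinity perturbation budget
    """
    clip = clip.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(detector(clip), label)
    loss.backward()
    # Every frame is shifted by epsilon along the sign of its gradient.
    adv_clip = clip + epsilon * clip.grad.sign()
    return adv_clip.clamp(0.0, 1.0).detach()
```

In the black-box setting the target's gradients are unavailable, so such perturbations are typically crafted on a substitute model and transferred to the target detector.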
Award ID(s):
2040209
NSF-PAR ID:
10354760
Journal Name:
Proceedings of the 1st Workshop on Security Implications of Deepfakes and Cheapfakes
Page Range / eLocation ID:
13 to 18
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Deep Neural Networks (DNN) are vulnerable to adversarial perturbations: small, deliberately crafted changes to the input that mislead the model into wrong predictions. Adversarial attacks can have disastrous consequences for critical applications empowered by deep learning. Existing defense and detection techniques require extensive knowledge of the model, testing inputs, and even execution details. They are not viable for general deep learning implementations where the model internals are unknown, a common ‘black-box’ scenario for model users. Inspired by the fact that electromagnetic (EM) emanations of a model inference depend on both operations and data and may contain footprints of different input classes, we propose a framework, EMShepherd, to capture EM traces of model execution, process the traces, and exploit them for adversarial detection. Only benign samples and their EM traces are used to train the adversarial detector: a set of EM classifiers and class-specific unsupervised anomaly detectors. When the victim model system is under attack by an adversarial example, the model execution differs from executions for the known classes, and so does the EM trace. We demonstrate that our air-gapped EMShepherd can effectively detect different adversarial attacks on a commonly used FPGA deep learning accelerator for both the Fashion MNIST and CIFAR-10 datasets. It achieves detection rates on most types of adversarial samples that are comparable to those of state-of-the-art ‘white-box’ software-based detectors.
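As a rough illustration of the detection step described above, the sketch below trains a toy per-class anomaly detector on benign EM traces and flags an input whose trace scores as anomalous for the class the EM classifier predicts. The class name, the distance-based detector, and the threshold are assumptions for illustration, not the EMShepherd implementation.

```python
import numpy as np

class BenignTraceDetector:
    """Toy per-class anomaly detector: mean absolute z-score of a trace
    against benign traces of one class (a stand-in for the unsupervised
    detectors described in the abstract)."""

    def fit(self, benign_traces):                  # benign_traces: (n, trace_len)
        self.mean = benign_traces.mean(axis=0)
        self.std = benign_traces.std(axis=0) + 1e-8
        return self

    def score(self, trace):
        return float(np.abs((trace - self.mean) / self.std).mean())


def flag_adversarial(em_trace, predicted_class, detectors, threshold=3.0):
    # A trace that looks unlike benign executions of the predicted class
    # suggests the input may be adversarial.
    return detectors[predicted_class].score(em_trace) > threshold
```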
  2. As machine learning is deployed in more settings, including in security-sensitive applications such as malware detection, the risks posed by adversarial examples that fool machine-learning classifiers have become magnified. Black-box attacks are especially dangerous, as they only require the attacker to have the ability to query the target model and observe the labels it returns, without knowing anything else about the model. Current black-box attacks either have low success rates, require a high number of queries, produce adversarial images that are easily distinguishable from their sources, or are not flexible in controlling the outcome of the attack. In this paper, we present AdversarialPSO (code available: https://github.com/rhm6501/AdversarialPSOImages), a black-box attack that uses few queries to create adversarial examples with high success rates. AdversarialPSO is based on Particle Swarm Optimization, a gradient-free evolutionary search algorithm, with special adaptations to make it effective for the black-box setting. It is flexible in balancing the number of queries submitted to the target against the quality of the adversarial examples. We evaluated AdversarialPSO on CIFAR-10, MNIST, and ImageNet, achieving success rates of 94.9%, 98.5%, and 96.9%, respectively, while submitting numbers of queries comparable to prior work. Our results show that black-box attacks can be adapted to favor fewer queries or higher-quality adversarial images, while still maintaining high success rates.
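The core loop of a particle-swarm, query-only attack can be sketched as below. This is a generic PSO skeleton under assumed names (a `query_fn` oracle returning the target's predicted label, an L-infinity budget `eps`), not the released AdversarialPSO code, which adds further adaptations for query efficiency.

```python
import numpy as np

def pso_blackbox_attack(query_fn, x, true_label, eps=0.1, n_particles=20,
                        iters=50, w=0.5, c1=1.5, c2=1.5, seed=0):
    """Generic PSO search for a label-flipping perturbation (sketch).

    query_fn: assumed black-box oracle, returns the predicted label for one input
    x:        clean image as a float array with values in [0, 1]
    """
    rng = np.random.default_rng(seed)

    def fitness(p):
        # Reward perturbations that flip the predicted label; among those,
        # mildly prefer smaller perturbations.
        adv = np.clip(x + p, 0.0, 1.0)
        flipped = query_fn(adv) != true_label
        return (1.0 if flipped else 0.0) - 1e-3 * float(np.abs(p).mean())

    pos = rng.uniform(-eps, eps, size=(n_particles,) + x.shape)  # particle perturbations
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()

    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -eps, eps)
        for i, p in enumerate(pos):
            f = fitness(p)
            if f > pbest_fit[i]:
                pbest_fit[i], pbest[i] = f, p.copy()
        gbest = pbest[pbest_fit.argmax()].copy()

    return np.clip(x + gbest, 0.0, 1.0)
```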
  3. In a black-box setting, the adversary only has API access to the target model and each query is expensive. Prior work on black-box adversarial examples follows one of two main strategies: (1) transfer attacks use white-box attacks on local models to find candidate adversarial examples that transfer to the target model, and (2) optimization-based attacks use queries to the target model and apply optimization techniques to search for adversarial examples. We propose hybrid attacks that combine both strategies, using candidate adversarial examples from local models as starting points for optimization-based attacks and using labels learned in optimization-based attacks to tune local models for finding transfer candidates. We empirically demonstrate on the MNIST, CIFAR10, and ImageNet datasets that our hybrid attack strategy reduces cost and improves success rates, and in combination with our seed prioritization strategy, enables batch attacks that can efficiently find adversarial examples with only a handful of queries. 
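The combination can be summarized in a few lines. The sketch below uses placeholder functions (`local_attack`, `blackbox_attack`, `query_label`) to show the flow: a transfer candidate from local models either succeeds outright or seeds the query-based optimization. The names and roles are illustrative, not the paper's code.

```python
def hybrid_attack(local_models, x, y, local_attack, blackbox_attack, query_label):
    """Sketch of the hybrid strategy: try a transfer candidate first, then
    run the optimization-based attack seeded from that candidate."""
    candidate = local_attack(local_models, x, y)      # white-box attack on local models
    if query_label(candidate) != y:                   # candidate transferred directly
        return candidate
    # Otherwise, start the optimization-based attack from the candidate
    # instead of the clean input, which typically reduces query cost.
    # (Labels observed during this search can in turn be used to tune the
    # local models for future transfer candidates.)
    return blackbox_attack(query_label, start=candidate, true_label=y)
```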
  4. Recent advancements in Deep Neural Networks (DNNs) have enabled widespread deployment in multiple security-sensitive domains. The need for resource-intensive training and the use of valuable domain-specific training data have made these models the top intellectual property (IP) for model owners. One of the major threats to DNN privacy is model extraction attacks, where adversaries attempt to steal sensitive information from DNN models. In this work, we propose an advanced model extraction framework, DeepSteal, that for the first time steals DNN weights remotely with the aid of a memory side-channel attack. Our proposed DeepSteal comprises two key stages. First, we develop a new weight bit information extraction method, called HammerLeak, which adopts the rowhammer-based fault technique as the information leakage vector. HammerLeak leverages several novel system-level techniques tailored for DNN applications to enable fast and efficient weight stealing. Second, we propose a novel substitute model training algorithm with a Mean Clustering weight penalty, which effectively leverages the partially leaked bit information and generates a substitute prototype of the target victim model. We evaluate the proposed model extraction framework on three popular image datasets (e.g., CIFAR-10/100/GTSRB) and four DNN architectures (e.g., ResNet-18/34/Wide-ResNet/VGG-11). The extracted substitute model has successfully achieved more than 90% test accuracy on deep residual networks for the CIFAR-10 dataset. Moreover, our extracted substitute model could also generate effective adversarial input samples to fool the victim model. Notably, it achieves performance (i.e., ~1-2% test accuracy under attack) similar to white-box adversarial input attacks (e.g., PGD/TRADES).
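As a rough illustration of the second stage, a substitute model can be trained with an extra penalty that pulls partially leaked weights toward values consistent with the recovered bits. The sketch below is an assumed, simplified form of such a term (a masked squared distance to per-weight target means); the paper's Mean Clustering penalty may differ in detail.

```python
import torch

def mean_clustering_penalty(weights, leaked_targets, leak_mask, lam=1e-3):
    """Simplified penalty sketch: weights whose bits were partially
    recovered (leak_mask == 1) are pulled toward the mean value
    consistent with those recovered bits (leaked_targets).

    weights, leaked_targets, leak_mask: tensors of identical shape.
    """
    return lam * ((weights - leaked_targets) ** 2 * leak_mask).sum()


# Illustrative use inside a training step (names assumed):
# loss = task_loss + mean_clustering_penalty(layer.weight, targets, mask)
```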
  5. Host-based Intrusion Detection Systems (HIDS) automatically detect events that indicate compromise by adversarial applications. HIDS are generally formulated as analyses of sequences of system events such as bash commands or system calls. Anomaly-based approaches to HIDS leverage models of normal (a.k.a. baseline) system behavior to detect and report abnormal events and have the advantage of being able to detect novel attacks. In this article, we develop a new method for anomaly-based HIDS using deep learning predictions of sequence-to-sequence behavior in system calls. Our proposed method, called the ALAD algorithm, aggregates predictions at the application level to detect anomalies. We investigate the use of several deep learning architectures, including WaveNet and several recurrent networks. We show that ALAD empowered with deep learning significantly outperforms previous approaches. We train and evaluate our models using an existing dataset, ADFA-LD, and a new dataset of our own construction, PLAID. As deep learning models are black box in nature, we use an alternate approach, allotaxonographs, to characterize and understand differences in baseline vs. attack sequences in HIDS datasets such as PLAID. 
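To make the aggregation step concrete, the sketch below pools per-call "surprise" from a baseline sequence model over all traces of one application and flags the application when the pooled score crosses a threshold. The function names, mean negative-log-likelihood pooling, and threshold are illustrative assumptions, not the ALAD algorithm as published.

```python
import numpy as np

def application_anomaly_score(per_call_probs):
    """per_call_probs: list of sequences, each a list of probabilities the
    baseline model assigned to the system call that actually occurred."""
    nll = [-np.log(max(p, 1e-12)) for seq in per_call_probs for p in seq]
    return float(np.mean(nll))

def flag_application(per_call_probs, threshold=2.0):
    # An application whose observed calls the baseline model consistently
    # finds improbable is reported as anomalous.
    return application_anomaly_score(per_call_probs) > threshold
```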