Explanation-based Adversarial Detection with Noise Reduction

Su, Juntao; Yang, Zhou; Ren, Zexin; Jin, Fang

doi:10.1109/BigData62323.2024.10825913

This content will become publicly available on December 15, 2025

Deep Neural Networks (DNNs) have achieved tremendous success in various tasks. However, DNNs exhibit uncertainty and unreliability when faced with well-designed adversarial examples, leading to misclassification. To address this, a variety of methods have been proposed to improve the robustness of DNNs by detecting adversarial attacks. In this paper, we combine model explanation techniques with adversarial models to enhance adversarial detection in real-world scenarios. Specifically, we develop a novel adversary-resistant detection framework called EXPLAINER, which utilizes explanation results extracted from explainable learning models. The explanation model in EXPLAINER generates an explanation map that identifies the relevance of input variables to the model’s classification result. Consequently, adversarial examples can be effectively detected by comparing the explanation results of a given sample with its denoised version, without relying on any prior knowledge of attacks. The proposed framework is thoroughly evaluated against different adversarial attacks, and experimental results demonstrate that our approach achieves promising results in white-box attack scenarios.
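The detection idea in the abstract (compare the explanation map of a sample with the explanation map of its denoised version) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the linear softmax classifier, the gradient-times-input explanation, the moving-average denoiser, the cosine-distance score, the threshold value, and all function names are hypothetical stand-ins, not the EXPLAINER implementation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def explanation_map(W, b, x):
    """Gradient-times-input relevance for the predicted class.

    Assumed toy model: p = softmax(W @ x + b). This stands in for
    whatever explanation model the paper actually uses.
    """
    p = softmax(W @ x + b)
    c = int(np.argmax(p))
    # Gradient of p_c w.r.t. x for a linear softmax model:
    # p_c * (W_c - sum_k p_k W_k)
    grad = p[c] * (W[c] - p @ W)
    return grad * x

def denoise(x, k=3):
    """Moving-average filter over a 1-D input (assumed stand-in denoiser)."""
    kernel = np.ones(k) / k
    return np.convolve(x, kernel, mode="same")

def detection_score(W, b, x, k=3):
    """Cosine distance between the explanation maps of x and denoised x."""
    e_raw = explanation_map(W, b, x)
    e_den = explanation_map(W, b, denoise(x, k))
    denom = np.linalg.norm(e_raw) * np.linalg.norm(e_den) + 1e-12
    return 1.0 - float(e_raw @ e_den) / denom

def is_adversarial(W, b, x, threshold=0.5, k=3):
    """Flag the input when its explanation changes a lot after denoising.

    The threshold is a hypothetical value; in practice it would be
    calibrated on held-out clean data.
    """
    return detection_score(W, b, x, k) > threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 8))   # 3 classes, 8 input features
    b = np.zeros(3)
    x = rng.normal(size=8)
    print("score:", detection_score(W, b, x))
    print("flagged:", is_adversarial(W, b, x))
```

The sketch needs no knowledge of the attack: the score depends only on the input, the model, and the denoiser, which mirrors the abstract's claim that detection works without prior knowledge of attacks. How well such a score separates clean from adversarial inputs depends on the chosen explanation method and denoiser, which this toy example does not evaluate.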

Award ID(s): 2238700

PAR ID: 10608608

Author(s) / Creator(s): Su, Juntao; Yang, Zhou; Ren, Zexin; Jin, Fang

Publisher / Repository: IEEE

Date Published: 2024-12-15

ISBN: 979-8-3503-6248-0

Page Range / eLocation ID: 6374 to 6378

Subject(s) / Keyword(s): Adversarial detection; Model explanation; Noise reduction

Format(s): Medium: X

Location: Washington, DC, USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
https://doi.org/10.1109/BigData62323.2024.10825913
