SEER: Backdoor Detection for Vision-Language Models through Searching Target Text and Image Trigger Jointly

Zhu, Liuwan; Ning, Rui; Li, Jiang; Xin, Chunsheng; Wu, Hongyi

doi:10.1609/aaai.v38i7.28611

This paper proposes SEER, a novel backdoor detection algorithm for vision-language models, addressing a gap in the literature on multi-modal backdoor detection. While backdoor detection in single-modal models has been well studied, such defenses remain largely unexplored for multi-modal models. Existing backdoor defense mechanisms cannot be directly applied to multi-modal settings because of the added complexity and the explosion of the search space. SEER detects backdoors in vision-language models by jointly searching for image triggers and malicious target texts in the feature space shared by the vision and language modalities. Our extensive experiments demonstrate that SEER achieves a detection rate of over 92% on backdoored vision-language models in various settings, without access to training data or knowledge of downstream tasks.
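
The joint search described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear encoder `W`, the random candidate text embeddings, the function names, and the step sizes are all assumptions made for illustration. The sketch alternates between picking the candidate text whose embedding is closest to the triggered images and taking a gradient step on an additive trigger to raise that cosine similarity. It covers only the search step; deciding whether a model is backdoored would need a further test that the abstract does not detail.

```python
import numpy as np


def encode(W, images):
    """Toy linear image encoder with L2 normalization (hypothetical)."""
    feats = images @ W.T
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)


def mean_similarity(W, images, trigger, text_embs):
    """Mean cosine similarity of triggered images to each candidate text."""
    return (encode(W, images + trigger) @ text_embs.T).mean(axis=0)


def joint_search(W, images, text_embs, eps=0.5, lr=0.05, steps=200):
    """Alternate between choosing a target text and updating the trigger."""
    trigger = np.zeros(images.shape[1])
    sims = mean_similarity(W, images, trigger, text_embs)
    k = int(sims.argmax())
    baseline = float(sims[k])  # best similarity with no trigger applied
    best = {"score": baseline, "baseline": baseline,
            "trigger": trigger.copy(), "text_index": k}
    for _ in range(steps):
        t = text_embs[k]
        feats = (images + trigger) @ W.T
        norms = np.linalg.norm(feats, axis=1, keepdims=True)
        # Gradient of cos(feats, t) with respect to the unnormalized features.
        g_feat = t / norms - feats * (feats @ t)[:, None] / norms**3
        # Chain rule through the linear encoder, averaged over images.
        grad = (g_feat @ W).mean(axis=0)
        step = lr * grad / (np.linalg.norm(grad) + 1e-12)
        # Keep the additive trigger inside an assumed per-pixel budget.
        trigger = np.clip(trigger + step, -eps, eps)
        # Re-select the candidate text that best matches the triggered images.
        sims = mean_similarity(W, images, trigger, text_embs)
        k = int(sims.argmax())
        if sims[k] > best["score"]:
            best.update(score=float(sims[k]), trigger=trigger.copy(),
                        text_index=k)
    return best


# Synthetic stand-ins for a model and data; none of this comes from the paper.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))             # hypothetical encoder weights
images = rng.uniform(size=(32, 64))       # flattened toy "images"
text_embs = rng.normal(size=(5, 16))      # hypothetical text embeddings
text_embs /= np.linalg.norm(text_embs, axis=1, keepdims=True)

result = joint_search(W, images, text_embs)
print(result["text_index"], result["baseline"], result["score"])
```

In a real setting the encoders would be the model's own image and text towers, and the candidate texts would come from a vocabulary search rather than a fixed random set.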

Award ID(s): 2413009; 2320999

PAR ID: 10608893

Publisher / Repository: AAAI-2024-2

Date Published: 2024-03-25

Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence

Volume: 38

Issue: 7

ISSN: 2159-5399

Page Range / eLocation ID: 7766-7774

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1609/aaai.v38i7.28611