MiRAGeNews: Multimodal Realistic AI-Generated News Detection

Huang, Runsheng; Dugan, Liam; Yang, Yue; Callison-Burch, Chris

doi:10.18653/v1/2024.findings-emnlp.959

Citation Details

Inflammatory or misleading “fake” news content has become increasingly common in recent years. Simultaneously, it has become easier than ever to use AI tools to generate photorealistic images depicting any scene imaginable. The combination of the two, AI-generated fake news content, is particularly potent and dangerous. To combat the spread of AI-generated fake news, we propose the MiRAGeNews Dataset, a dataset of 12,500 high-quality real and AI-generated image-caption pairs from state-of-the-art generators. We find that our dataset poses a significant challenge to humans (60% F-1) and state-of-the-art multi-modal LLMs (< 24% F-1). Using our dataset, we train a multi-modal detector (MiRAGe) that improves by +5.1% F-1 over state-of-the-art baselines on image-caption pairs from out-of-domain image generators and news publishers. We release our code and data to aid future work on detecting AI-generated content.

Award ID(s): 1928474

PAR ID: 10563495

Author(s) / Creator(s): Huang, Runsheng; Dugan, Liam; Yang, Yue; Callison-Burch, Chris

Publisher / Repository: Association for Computational Linguistics

Date Published: 2024-01-01

Page Range / eLocation ID: 16436 to 16448

Subject(s) / Keyword(s): Detection of AI Generated News

Format(s): Medium: X

Location: Miami, Florida, USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.18653/v1/2024.findings-emnlp.959
