Real-time Factuality Assessment from Adversarial Feedback

Chen, Sanxing; Huang, Yukun; Dhingra, Bhuwan

This content will become publicly available on July 1, 2026

We show that existing evaluations for assessing the factuality of news from conventional sources, such as claims on fact-checking websites, yield high accuracy for LLM-based detectors over time, even on claims that postdate the detectors' knowledge cutoffs. This suggests that recent popular false information from such sources can be easily identified, either because it is likely present in pre-training or retrieval corpora or because salient yet shallow patterns emerge in these datasets. Instead, we argue that a proper factuality evaluation dataset should test a model’s ability to reason about current events by retrieving and reading related evidence. To this end, we develop a novel pipeline that leverages natural language feedback from a RAG-based detector to iteratively modify real-time news into deceptive variants that challenge LLMs. Our iterative rewrite decreases the binary classification ROC-AUC by an absolute 17.5 percent for a strong RAG-based GPT-4o detector. Our experiments reveal the important role of RAG in both evaluating and generating challenging news examples: retrieval-free LLM detectors are vulnerable to unseen events and adversarial attacks, while feedback from RAG-based evaluation helps discover more deceitful patterns.
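The feedback loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Detector` and `Rewriter` interfaces, the `adversarial_rewrite` function, the stopping threshold, the round limit, and the toy stand-ins used at the end are all hypothetical names and choices made for illustration. In the paper's setting the detector would be a RAG-based LLM and the rewriter an LLM prompted with the detector's feedback.

```python
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
# a detector returns (score that the article is fake, natural language rationale);
# a rewriter takes the current article plus that rationale and returns a new variant.
Detector = Callable[[str], Tuple[float, str]]
Rewriter = Callable[[str, str], str]


def adversarial_rewrite(article: str, detect: Detector, rewrite: Rewriter,
                        max_rounds: int = 3, threshold: float = 0.5) -> Tuple[str, float]:
    """Rewrite `article` with detector feedback until its fake score falls below `threshold`."""
    best_text = rewrite(article, "")           # first variant, no feedback available yet
    best_score, feedback = detect(best_text)
    for _ in range(max_rounds):
        if best_score < threshold:             # the detector is already fooled
            break
        candidate = rewrite(best_text, feedback)
        score, new_feedback = detect(candidate)
        if score < best_score:                 # keep only variants that are harder to detect
            best_text, best_score, feedback = candidate, score, new_feedback
    return best_text, best_score


# Toy stand-ins so the sketch runs end to end; a real setup would call LLMs here.
def toy_detect(text: str) -> Tuple[float, str]:
    n = text.count("!")
    return min(1.0, 0.3 * n), ("too sensational" if n else "looks plausible")


def toy_rewrite(text: str, feedback: str) -> str:
    # Remove one exclamation mark per round when the detector complained.
    return text.replace("!", "", 1) if feedback else text


final_text, final_score = adversarial_rewrite("Shock! Mayor resigns!!", toy_detect, toy_rewrite)
```

Keeping a candidate only when its score improves is one possible design choice; it prevents a poor rewrite from undoing earlier progress, at the cost of reusing the same feedback in the next round.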

Award ID(s): 2211526

PAR ID: 10623898

Author(s) / Creator(s): Chen, Sanxing; Huang, Yukun; Dhingra, Bhuwan

Publisher / Repository: Association for Computational Linguistics

Date Published: 2025-07-01

Format(s): Medium: X

Location: Vienna, Austria

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
