Verifying political claims is a challenging task, as politicians can use various tactics to subtly misrepresent the facts for their agenda. Existing automatic fact-checking systems fall short here, and their predictions like "half-true" are not very useful in isolation, since it is unclear which parts of a claim are true or false. In this work, we focus on decomposing a complex claim into a comprehensive set of yes-no subquestions whose answers influence the veracity of the claim. We present CLAIMDECOMP, a dataset of decompositions for over 1000 claims. Given a claim and its verification paragraph written by fact-checkers, our trained annotators write subquestions covering both explicit propositions of the original claim and its implicit facets, such as additional political context that changes our view of the claim's veracity. We study whether state-of-the-art pre-trained models can learn to generate such subquestions. Our experiments show that these models generate reasonable questions, but predicting implied subquestions based only on the claim (without consulting other evidence) remains challenging. Nevertheless, we show that predicted subquestions can help identify relevant evidence to fact-check the full claim and derive the veracity through their answers, suggesting that claim decomposition can be a useful piece of a fact-checking pipeline.
This content will become publicly available on December 17, 2023
Comparative Reasoning for Knowledge Graph Fact Checking
Knowledge graphs have been widely used in fact checking because they provide crucial background knowledge that helps verify claims. Traditional fact-checking work mainly focuses on analyzing a single claim and has largely ignored the semantic consistency of pairs of claims, despite its importance in real-world applications, e.g., multimodal fake news detection. This paper proposes INSPECTOR, a graph neural network based model for pair-wise fact checking. Given a pair of claims, INSPECTOR aims to detect potential semantic inconsistency between them. The main idea of INSPECTOR is to use a graph attention neural network to learn a graph embedding for each claim in the pair, then use a tensor neural network to classify the pair as consistent or inconsistent. The experimental results show that the algorithm outperforms state-of-the-art methods, with higher accuracy and lower variance.
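The second stage described above, scoring a pair of claim embeddings with a tensor layer, can be sketched as follows. This is a minimal illustration and not the INSPECTOR implementation: the abstract does not specify the layer, so the sketch assumes a standard neural-tensor-style form (bilinear slices plus a linear term, followed by a sigmoid). The claim embeddings are assumed to come from an upstream graph attention network, the parameters are random and untrained, and all names (`tensor_layer_score`, `random_params`, `claim_a`, `claim_b`) are hypothetical.

```python
import math
import random

def bilinear(e1, W, e2):
    """Compute e1^T W e2 for a matrix W given as a list of rows."""
    return sum(e1[i] * W[i][j] * e2[j]
               for i in range(len(e1)) for j in range(len(e2)))

def tensor_layer_score(e1, e2, params):
    """Score a pair of claim embeddings with a neural-tensor-style layer.

    Assumed form: sigmoid(u . tanh(e1^T W_k e2 + V_k [e1; e2] + b_k)).
    Returns a value in (0, 1); this sketch reads higher as "consistent".
    """
    concat = e1 + e2  # list concatenation, i.e. [e1; e2]
    hidden = []
    for W_k, v_k, b_k in zip(params["W"], params["V"], params["b"]):
        linear = sum(v * x for v, x in zip(v_k, concat))
        hidden.append(math.tanh(bilinear(e1, W_k, e2) + linear + b_k))
    logit = sum(u * h for u, h in zip(params["u"], hidden))
    return 1.0 / (1.0 + math.exp(-logit))

def random_params(dim, slices, seed=0):
    """Untrained placeholder parameters (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    r = lambda: rng.uniform(-0.5, 0.5)
    return {
        "W": [[[r() for _ in range(dim)] for _ in range(dim)] for _ in range(slices)],
        "V": [[r() for _ in range(2 * dim)] for _ in range(slices)],
        "b": [r() for _ in range(slices)],
        "u": [r() for _ in range(slices)],
    }

# Stand-ins for graph embeddings of two claims (assumed given).
claim_a = [0.2, -0.1, 0.7]
claim_b = [0.1, 0.0, 0.6]
params = random_params(dim=3, slices=2)
p = tensor_layer_score(claim_a, claim_b, params)
label = "consistent" if p >= 0.5 else "inconsistent"
print(label, round(p, 3))
```

In a trained model the parameters would be learned from labeled claim pairs; with the random values used here, the printed label carries no meaning.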
- NSF-PAR ID: 10428926
- Date Published:
- Journal Name: 2022 IEEE International Conference on Big Data (Big Data)
- Page Range / eLocation ID: 2309 to 2312
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Knowledge graphs are ubiquitous and play an important role in many real-world applications, including recommender systems, question answering, and fact-checking. However, most knowledge graphs are incomplete, which can hamper their practical use. Knowledge graph completion (KGC) can mitigate this problem by inferring missing edges in the knowledge graph from the existing information. In this paper, we propose a novel KGC method named ABM (Attention-Based Message passing), which focuses on predicting the relation between any two entities in a knowledge graph. ABM consists of three integral parts: (1) context embedding, (2) structure embedding, and (3) path embedding. In the context embedding, ABM generalizes the existing message passing neural network to update the node embedding and the edge embedding so that they assimilate the knowledge of nodes' neighbors, which captures the relative role information of the edge to be predicted. In the structure embedding, the method overcomes a shortcoming of existing GNN methods (most ignore the structural similarity between nodes) by assigning different attention weights to different nodes during aggregation. Path embedding generates paths between any two entities and treats these paths as sequences. These sequences are then used as input to a Transformer, which updates the embedding of the knowledge graph to capture the global role of the missing edges. By combining these three mutually complementary strategies, ABM captures both local and global information, which in turn leads to superb performance. Experimental results show that ABM outperforms baseline methods on a wide range of datasets.
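The idea of assigning different attention weights to different nodes during aggregation can be illustrated with a small sketch. This is not ABM's actual formulation, which the abstract does not give: the sketch assumes plain dot-product similarity followed by a softmax, and the names `attention_aggregate`, `node`, and `neighbors` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(node, neighbors):
    """Aggregate neighbor embeddings with attention weights.

    Assumption: the weight of each neighbor is the softmax of its
    dot-product similarity to the target node embedding.
    Returns (aggregated_embedding, weights).
    """
    scores = [sum(a * b for a, b in zip(node, nb)) for nb in neighbors]
    weights = softmax(scores)
    aggregated = [
        sum(w * nb[d] for w, nb in zip(weights, neighbors))
        for d in range(len(node))
    ]
    return aggregated, weights

# Toy embeddings: the first neighbor points in the same direction as the node.
node = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aggregated, weights = attention_aggregate(node, neighbors)
print([round(w, 3) for w in weights])
```

Under this assumption, neighbors that are more similar to the target node contribute more to its updated embedding, in contrast to a uniform mean over neighbors.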
-
Knowledge graph reasoning plays a pivotal role in many real-world applications, such as network alignment, computational fact-checking, recommendation, and many more. Among these applications, knowledge graph completion (KGC) and multi-hop question answering over knowledge graphs (multi-hop KGQA) are two representative reasoning tasks. In the vast majority of existing work, the two tasks are considered separately, with different models or algorithms. However, we envision that KGC and multi-hop KGQA are closely related, so the two tasks will benefit from each other if they are approached appropriately. In this work, we propose a neural model named BiNet that handles KGC and multi-hop KGQA jointly, and we formulate this as a multi-task learning problem. Specifically, the proposed model leverages a shared embedding space and an answer scoring module, which allows the two tasks to automatically share latent features and learn the interactions between the natural language question decoder and the answer scoring module. Compared to existing methods, BiNet addresses both multi-hop KGQA and KGC simultaneously with superior performance. Experimental results show that BiNet outperforms state-of-the-art methods on a wide range of KGQA and KGC benchmark datasets.
-
We extend evidence-aware claim verification to the context of positive-unlabeled (PU) learning. Existing works assume that the truth and falsity of claims are known for training and formulate the task as a supervised learning problem. However, this assumption underestimates the difficulty of collecting false claims; we argue that claim verification is more challenging in the absence of negative labels. We consider a more practical setting, where only a comparatively small number of true claims are labeled and more claims remain unlabeled. Thus, we formulate the claim verification task as a PU learning problem. We decouple representation learning for claim-evidence pairs from PU learning and adopt a pre-trained universal language model to encode claim-evidence pairs. We further propose to use a generative adversarial network (GAN) to capture the latent alignment between the encoded claim-evidence pair and its truthfulness. We incorporate verification into the GAN by extending previous GAN-based PU learning. We show that the proposed model achieves the best performance with a small amount of labeled data and is robust to the truthfulness prior estimation. We conduct a thorough analysis of the model selection. The proposed approach performs best under two practical scenarios: (i) there is more unlabeled data than labeled data; and (ii) there is more unlabeled positive data than unlabeled negative data.
-
With the spread of SARS-CoV-2, enormous amounts of information about the pandemic are disseminated through social media platforms such as Twitter. Social media posts often leverage the trust readers have in prestigious news agencies and cite news articles as a way of gaining credibility. Nevertheless, the cited article does not always support the claim made in the social media post. We present a cross-genre ad hoc pipeline to identify whether the information in a Twitter post (i.e., a “Tweet”) is indeed supported by the cited news article. Our approach is empirically based on a corpus of over 46.86 million Tweets and is divided into two tasks: (i) developing models to detect Tweets that contain claims worth fact-checking and (ii) verifying whether the claims made in a Tweet are supported by the newswire article it cites. Unlike previous studies that detect unsubstantiated information by post hoc analysis of the patterns of propagation, we seek to identify reliable support (or the lack of it) before the misinformation begins to spread. We discover that nearly half of the Tweets (43.4%) are not factual and hence not worth checking, a significant filter given the sheer volume of social media posts on a platform such as Twitter. Moreover, we find that among the Tweets that contain a seemingly factual claim while citing a news article as supporting evidence, at least 1% are not actually supported by the cited news and are hence misleading.