skip to main content

Title: Weak Supervision for Fake News Detection via Reinforcement Learning
Today social media has become the primary source for news. Via social media platforms, fake news travel at unprecedented speeds, reach global audiences and put users and communities at great risk. Therefore, it is extremely important to detect fake news as early as possible. Recently, deep learning based approaches have shown improved performance in fake news detection. However, the training of such models requires a large amount of labeled data, but manual annotation is time-consuming and expensive. Moreover, due to the dynamic nature of news, annotated samples may become outdated quickly and cannot represent the news articles on newly emerged events. Therefore, how to obtain fresh and high-quality labeled samples is the major challenge in employing deep learning models for fake news detection. In order to tackle this challenge, we propose a reinforced weakly-supervised fake news detection framework, i.e., WeFEND, which can leverage users' reports as weak supervision to enlarge the amount of training data for fake news detection. The proposed framework consists of three main components: the annotator, the reinforced selector and the fake news detector. The annotator can automatically assign weak labels for unlabeled news based on users' reports. The reinforced selector using reinforcement learning techniques chooses high-quality samples from the weakly labeled data and filters out those low-quality ones that may degrade the detector's prediction performance. The fake news detector aims to identify fake news based on the news content. We tested the proposed framework on a large collection of news articles published via WeChat official accounts and associated user reports. Extensive experiments on this dataset show that the proposed WeFEND model achieves the best performance compared with the state-of-the-art methods.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Page Range / eLocation ID:
516 to 523
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Obradovic, Zoran (Ed.)
    An efficient fake news detector becomes essential as the accessibility of social media platforms increases rapidly. Previous studies mainly focused on designing the models solely based on individual data sets and might suffer from degradable performance. Therefore, developing a robust model for a combined data set with diverse knowledge becomes crucial. However, designing the model with a combined data set requires extensive training time and sequential workload to obtain optimal performance without having some prior knowledge about the model's parameters. The presented study here will help solve these issues by introducing the unified training strategy to have a base structure for the classifier and all hyperparameters from individual models using a pretrained transformer model. The performance of the proposed model is noted using three publicly available data sets, namely ISOT and others from the Kaggle website. The results indicate that the proposed unified training strategy surpassed the existing models, such as Random Forests, convolutional neural networks, and long short-term memory, with 97% accuracy and achieved the F1 score of 0.97. Furthermore, there was a significant reduction in training time by almost 1.5 to 1.8 × by removing words lower than three letters from the input samples. We also did extensive performance analysis by varying the number of encoder blocks to build compact models and trained on the combined data set. We justify that reducing encoder blocks resulted in lower performance from the obtained results. 
    more » « less
  2. Obeid, Iyad Selesnick (Ed.)
    The Temple University Hospital EEG Corpus (TUEG) [1] is the largest publicly available EEG corpus of its type and currently has over 5,000 subscribers (we currently average 35 new subscribers a week). Several valuable subsets of this corpus have been developed including the Temple University Hospital EEG Seizure Corpus (TUSZ) [2] and the Temple University Hospital EEG Artifact Corpus (TUAR) [3]. TUSZ contains manually annotated seizure events and has been widely used to develop seizure detection and prediction technology [4]. TUAR contains manually annotated artifacts and has been used to improve machine learning performance on seizure detection tasks [5]. In this poster, we will discuss recent improvements made to both corpora that are creating opportunities to improve machine learning performance. Two major concerns that were raised when v1.5.2 of TUSZ was released for the Neureka 2020 Epilepsy Challenge were: (1) the subjects contained in the training, development (validation) and blind evaluation sets were not mutually exclusive, and (2) high frequency seizures were not accurately annotated in all files. Regarding (1), there were 50 subjects in dev, 50 subjects in eval, and 592 subjects in train. There was one subject common to dev and eval, five subjects common to dev and train, and 13 subjects common between eval and train. Though this does not substantially influence performance for the current generation of technology, it could be a problem down the line as technology improves. Therefore, we have rebuilt the partitions of the data so that this overlap was removed. This required augmenting the evaluation and development data sets with new subjects that had not been previously annotated so that the size of these subsets remained approximately the same. Since these annotations were done by a new group of annotators, special care was taken to make sure the new annotators followed the same practices as the previous generations of annotators. Part of our quality control process was to have the new annotators review all previous annotations. This rigorous training coupled with a strict quality control process where annotators review a significant amount of each other’s work ensured that there is high interrater agreement between the two groups (kappa statistic greater than 0.8) [6]. In the process of reviewing this data, we also decided to split long files into a series of smaller segments to facilitate processing of the data. Some subscribers found it difficult to process long files using Python code, which tends to be very memory intensive. We also found it inefficient to manipulate these long files in our annotation tool. In this release, the maximum duration of any single file is limited to 60 mins. This increased the number of edf files in the dev set from 1012 to 1832. Regarding (2), as part of discussions of several issues raised by a few subscribers, we discovered some files only had low frequency epileptiform events annotated (defined as events that ranged in frequency from 2.5 Hz to 3 Hz), while others had events annotated that contained significant frequency content above 3 Hz. Though there were not many files that had this type of activity, it was enough of a concern to necessitate reviewing the entire corpus. An example of an epileptiform seizure event with frequency content higher than 3 Hz is shown in Figure 1. Annotating these additional events slightly increased the number of seizure events. In v1.5.2, there were 673 seizures, while in v1.5.3 there are 1239 events. One of the fertile areas for technology improvements is artifact reduction. Artifacts and slowing constitute the two major error modalities in seizure detection [3]. This was a major reason we developed TUAR. It can be used to evaluate artifact detection and suppression technology as well as multimodal background models that explicitly model artifacts. An issue with TUAR was the practicality of the annotation tags used when there are multiple simultaneous events. An example of such an event is shown in Figure 2. In this section of the file, there is an overlap of eye movement, electrode artifact, and muscle artifact events. We previously annotated such events using a convention that included annotating background along with any artifact that is present. The artifacts present would either be annotated with a single tag (e.g., MUSC) or a coupled artifact tag (e.g., MUSC+ELEC). When multiple channels have background, the tags become crowded and difficult to identify. This is one reason we now support a hierarchical annotation format using XML – annotations can be arbitrarily complex and support overlaps in time. Our annotators also reviewed specific eye movement artifacts (e.g., eye flutter, eyeblinks). Eye movements are often mistaken as seizures due to their similar morphology [7][8]. We have improved our understanding of ocular events and it has allowed us to annotate artifacts in the corpus more carefully. In this poster, we will present statistics on the newest releases of these corpora and discuss the impact these improvements have had on machine learning research. We will compare TUSZ v1.5.3 and TUAR v2.0.0 with previous versions of these corpora. We will release v1.5.3 of TUSZ and v2.0.0 of TUAR in Fall 2021 prior to the symposium. ACKNOWLEDGMENTS Research reported in this publication was most recently supported by the National Science Foundation’s Industrial Innovation and Partnerships (IIP) Research Experience for Undergraduates award number 1827565. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the official views of any of these organizations. REFERENCES [1] I. Obeid and J. Picone, “The Temple University Hospital EEG Data Corpus,” in Augmentation of Brain Function: Facts, Fiction and Controversy. Volume I: Brain-Machine Interfaces, 1st ed., vol. 10, M. A. Lebedev, Ed. Lausanne, Switzerland: Frontiers Media S.A., 2016, pp. 394 398. [2] V. Shah et al., “The Temple University Hospital Seizure Detection Corpus,” Frontiers in Neuroinformatics, vol. 12, pp. 1–6, 2018. [3] A. Hamid et, al., “The Temple University Artifact Corpus: An Annotated Corpus of EEG Artifacts.” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1-3. [4] Y. Roy, R. Iskander, and J. Picone, “The NeurekaTM 2020 Epilepsy Challenge,” NeuroTechX, 2020. [Online]. Available: [Accessed: 01-Dec-2021]. [5] S. Rahman, A. Hamid, D. Ochal, I. Obeid, and J. Picone, “Improving the Quality of the TUSZ Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–5. [6] V. Shah, E. von Weltin, T. Ahsan, I. Obeid, and J. Picone, “On the Use of Non-Experts for Generation of High-Quality Annotations of Seizure Events,” Available: https://www.isip.picone [Accessed: 01-Dec-2021]. [7] D. Ochal, S. Rahman, S. Ferrell, T. Elseify, I. Obeid, and J. Picone, “The Temple University Hospital EEG Corpus: Annotation Guidelines,” Philadelphia, Pennsylvania, USA, 2020. [8] D. Strayhorn, “The Atlas of Adult Electroencephalography,” EEG Atlas Online, 2014. [Online]. Availabl 
    more » « less
  3. null (Ed.)
    Abstract Social media have emerged as increasingly popular means and environments for information gathering and propagation. This vigorous growth of social media contributed not only to a pandemic (fast-spreading and far-reaching) of rumors and misinformation, but also to an urgent need for text-based rumor detection strategies. To speed up the detection of misinformation, traditional rumor detection methods based on hand-crafted feature selection need to be replaced by automatic artificial intelligence (AI) approaches. AI decision making systems require to provide explanations in order to assure users of their trustworthiness. Inspired by the thriving development of generative adversarial networks (GANs) on text applications, we propose a GAN-based layered model for rumor detection with explanations. To demonstrate the universality of the proposed approach, we demonstrate its benefits on a gene classification with mutation detection case study. Similarly to the rumor detection, the gene classification can also be formulated as a text-based classification problem. Unlike fake news detection that needs a previously collected verified news database, our model provides explanations in rumor detection based on tweet-level texts only without referring to a verified news database. The layered structure of both generative and discriminative models contributes to the outstanding performance. The layered generators produce rumors by intelligently inserting controversial information in non-rumors, and force the layered discriminators to detect detailed glitches and deduce exactly which parts in the sentence are problematic. On average, in the rumor detection task, our proposed model outperforms state-of-the-art baselines on PHEME dataset by $$26.85\%$$ 26.85 % in terms of macro-f1. The excellent performance of our model for textural sequences is also demonstrated by the gene mutation case study on which it achieves $$72.69\%$$ 72.69 % macro-f1 score. 
    more » « less
  4. Physiological and behavioral data collected from wearable or mobile sensors have been used to estimate self-reported stress levels. Since stress annotation usually relies on self-reports during the study, a limited amount of labeled data can be an obstacle to developing accurate and generalized stress-predicting models. On the other hand, the sensors can continuously capture signals without annotations. This work investigates leveraging unlabeled wearable sensor data for stress detection in the wild. We propose a two-stage semi-supervised learning framework that leverages wearable sensor data to help with stress detection. The proposed structure consists of an auto-encoder pre-training method for learning information from unlabeled data and the consistency regularization approach to enhance the robustness of the model. Besides, we propose a novel active sampling method for selecting unlabeled samples to avoid introducing redundant information to the model. We validate these methods using two datasets with physiological signals and stress labels collected in the wild, as well as four human activity recognition (HAR) datasets to evaluate the generality of the proposed method. Our approach demonstrated competitive results for stress detection, improving stress classification performance by approximately 7% to 10% on the stress detection datasets compared to the baseline supervised learning models. Furthermore, the ablation study we conducted for the HAR tasks supported the effectiveness of our methods. Our approach showed comparable performance to state-of-the-art semi-supervised learning methods for both stress detection and HAR tasks. 
    more » « less
  5. The spread of unwanted or malicious content through social me- dia has become a major challenge. Traditional examples of this include social network spam, but an important new concern is the propagation of fake news through social media. A common ap- proach for mitigating this problem is by using standard statistical classi cation to distinguish malicious (e.g., fake news) instances from benign (e.g., actual news stories). However, such an approach ignores the fact that malicious instances propagate through the network, which is consequential both in quantifying consequences (e.g., fake news di using through the network), and capturing de- tection redundancy (bad content can be detected at di erent nodes). An additional concern is evasion attacks, whereby the generators of malicious instances modify the nature of these to escape detection. We model this problem as a Stackelberg game between the defender who is choosing parameters of the detection model, and an attacker, who is choosing both the node at which to initiate malicious spread, and the nature of malicious entities. We develop a novel bi-level programming approach for this problem, as well as a novel solution approach based on implicit function gradients, and experimentally demonstrate the advantage of our approach over alternatives which ignore network structure. 
    more » « less