Title: Adversarial Classification on Social Networks
The spread of unwanted or malicious content through social media has become a major challenge. Traditional examples of this include social network spam, but an important new concern is the propagation of fake news through social media. A common approach for mitigating this problem is to use standard statistical classification to distinguish malicious (e.g., fake news) instances from benign (e.g., actual news stories). However, such an approach ignores the fact that malicious instances propagate through the network, which is consequential both in quantifying consequences (e.g., fake news diffusing through the network), and capturing detection redundancy (bad content can be detected at different nodes). An additional concern is evasion attacks, whereby the generators of malicious instances modify the nature of these to escape detection. We model this problem as a Stackelberg game between the defender, who is choosing parameters of the detection model, and an attacker, who is choosing both the node at which to initiate malicious spread, and the nature of malicious entities. We develop a novel bi-level programming approach for this problem, as well as a novel solution approach based on implicit function gradients, and experimentally demonstrate the advantage of our approach over alternatives which ignore network structure.
Award ID(s):
1649972 1640624 1526860 1905558
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
International Conference on Autonomous Agents and Multiagent Systems
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The spread of fake news related to COVID-19 is an infodemic that leads to a public health crisis. Therefore, detecting fake news is crucial for an effective management of the COVID-19 pandemic response. Studies have shown that machine learning models can detect COVID-19 fake news based on the content of news articles. However, the use of biomedical information, which is often featured in COVID-19 news, has not been explored in the development of these models. We present a novel approach for predicting COVID-19 fake news by leveraging biomedical information extraction (BioIE) in combination with machine learning models. We analyzed 1164 COVID-19 news articles and used advanced BioIE algorithms to extract 158 novel features. These features were then used to train 15 machine learning classifiers to predict COVID-19 fake news. Among the 15 classifiers, the random forest model achieved the best performance with an area under the ROC curve (AUC) of 0.882, which is 12.36% to 31.05% higher compared to models trained on traditional features. Furthermore, incorporating BioIE-based features improved the performance of a state-of-the-art multi-modality model (AUC 0.914 vs. 0.887). Our study suggests that incorporating biomedical information into fake news detection models improves their performance, and thus could be a valuable tool in the fight against the COVID-19 infodemic.

  2. As the scourge of “fake news” continues to plague our information environment, attention has turned toward devising automated solutions for detecting problematic online content. But, in order to build reliable algorithms for flagging “fake news,” we will need to go beyond broad definitions of the concept and identify distinguishing features that are specific enough for machine learning. With this objective in mind, we conducted an explication of “fake news” that, as a concept, has ballooned to include more than simply false information, with partisans weaponizing it to cast aspersions on the veracity of claims made by those who are politically opposed to them. We identify seven different types of online content under the label of “fake news” (false news, polarized content, satire, misreporting, commentary, persuasive information, and citizen journalism) and contrast them with “real news” by introducing a taxonomy of operational indicators in four domains—message, source, structure, and network—that together can help disambiguate the nature of online news content.

  3. Today social media has become the primary source for news. Via social media platforms, fake news travels at unprecedented speeds, reaches global audiences and puts users and communities at great risk. Therefore, it is extremely important to detect fake news as early as possible. Recently, deep learning based approaches have shown improved performance in fake news detection. However, the training of such models requires a large amount of labeled data, but manual annotation is time-consuming and expensive. Moreover, due to the dynamic nature of news, annotated samples may become outdated quickly and cannot represent the news articles on newly emerged events. Therefore, how to obtain fresh and high-quality labeled samples is the major challenge in employing deep learning models for fake news detection. In order to tackle this challenge, we propose a reinforced weakly-supervised fake news detection framework, i.e., WeFEND, which can leverage users' reports as weak supervision to enlarge the amount of training data for fake news detection. The proposed framework consists of three main components: the annotator, the reinforced selector and the fake news detector. The annotator can automatically assign weak labels for unlabeled news based on users' reports. The reinforced selector, using reinforcement learning techniques, chooses high-quality samples from the weakly labeled data and filters out those low-quality ones that may degrade the detector's prediction performance. The fake news detector aims to identify fake news based on the news content. We tested the proposed framework on a large collection of news articles published via WeChat official accounts and associated user reports. Extensive experiments on this dataset show that the proposed WeFEND model achieves the best performance compared with the state-of-the-art methods.
  4. A fundamental challenge in networked systems is detection and removal of suspected malicious nodes. In reality, detection is always imperfect, and the decision about which potentially malicious nodes to remove must trade off false positives (erroneously removing benign nodes) and false negatives (mistakenly failing to remove malicious nodes). However, in network settings this conventional tradeoff must now account for node connectivity. In particular, malicious nodes may exert malicious influence, so that mistakenly leaving some of these in the network may cause damage to spread. On the other hand, removing benign nodes causes direct harm to these, and indirect harm to their benign neighbors who would wish to communicate with them. We formalize the problem of removing potentially malicious nodes from a network under uncertainty through an objective that takes connectivity into account. We show that optimally solving the resulting problem is NP-Hard. We then propose a tractable solution approach based on a convex relaxation of the objective. Finally, we experimentally demonstrate that our approach significantly outperforms both a simple baseline that ignores network structure, as well as a state-of-the-art approach for a related problem, on both synthetic and real-world datasets. 
  5. We introduce here a multi-type bootstrap percolation model, which we call T-Bootstrap Percolation (T-BP), and apply it to study information propagation in social networks. In this model, a social network is represented by a graph G whose vertices have different labels corresponding to the type of role the person plays in the network (e.g. a student, an educator etc.). Once an initial set of vertices of G is randomly selected to be carrying a gossip (e.g. to be infected), the gossip propagates to a new vertex provided it is transmitted by a minimum threshold of vertices with different labels. By considering random graphs, which have been shown to closely represent social networks, we study different properties of the T-BP model through numerical simulations, and describe its implications when applied to rumour spread, fake news and marketing strategies.
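The propagation rule in item 5 (T-BP) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on one reading of the abstract: an uninfected vertex becomes infected once its already-infected neighbours carry at least `threshold` distinct labels. The function name `t_bootstrap_percolation`, the adjacency-dictionary representation, and the synchronous update rounds are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set


def t_bootstrap_percolation(
    adj: Dict[Hashable, List[Hashable]],
    labels: Dict[Hashable, str],
    seeds: Iterable[Hashable],
    threshold: int,
) -> Set[Hashable]:
    """Return the final infected set under an assumed T-BP rule.

    Assumption: a vertex is infected when the infected vertices among its
    neighbours carry at least `threshold` distinct labels. Updates are
    applied in synchronous rounds until no new vertex is infected.
    """
    infected: Set[Hashable] = set(seeds)
    while True:
        newly_infected: Set[Hashable] = set()
        for vertex, neighbours in adj.items():
            if vertex in infected:
                continue
            # Distinct roles (labels) among neighbours already carrying the gossip.
            roles = {labels[u] for u in neighbours if u in infected}
            if len(roles) >= threshold:
                newly_infected.add(vertex)
        if not newly_infected:
            return infected
        infected |= newly_infected


# Hypothetical toy network: vertex 2 hears the gossip from a student and an
# educator, while vertex 3 is connected only to vertex 2.
example_adj = {0: [2], 1: [2], 2: [0, 1, 3], 3: [2]}
example_labels = {0: "student", 1: "educator", 2: "student", 3: "educator"}
final = t_bootstrap_percolation(example_adj, example_labels, seeds=[0, 1], threshold=2)
print(sorted(final))
```

With a threshold of 2, vertex 2 is infected because its two infected neighbours have different roles, whereas vertex 3 stays uninfected because only one role reaches it. Other readings of "minimum threshold of vertices with different labels" (for example, separate thresholds per label) would need a different predicate inside the loop.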