skip to main content

Title: Detection of Fraudulent Tweets: An Empirical Investigation Using Network Analysis and Deep Learning Technique
Social media has become a powerful and efficient platform for information diffusion. The increasing pervasiveness of social media use, however, has brought about the problems of fraudulent accounts that are intended to diffuse misinformation or malicious contents. Twitter recently released comprehensive archives of fraudulent tweets that are possibly connected to a propaganda effort of Internet Research Agency (IRA) on the 2016 U.S. presidential election. To understand information diffusion in fraudulent networks, we analyze structural properties of the IRA retweet network, and develop deep neural network models to detect fraudulent tweets. The structure analysis reveals key characteristics of the fraudulent network. The experiment results demonstrate the superior performance of the deep learning technique to a traditional classification method in detecting fraudulent tweets. The findings have potential implications for curbing online misinformation.
Authors:
; ;
Award ID(s):
1912898
Publication Date:
NSF-PAR ID:
10095445
Journal Name:
IEEE International Conference on Intelligence and Security Informatics
Sponsoring Org:
National Science Foundation
More Like this
  1. During COVID-19, misinformation on social media affects the adoption of appropriate prevention behaviors. It is urgent to suppress the misinformation to prevent negative public health consequences. Although an array of studies has proposed misinformation suppression strategies, few have investigated the role of predominant credible information during crises. None has examined its effect quantitatively using longitudinal social media data. Therefore, this research investigates the temporal correlations between credible information and misinformation, and whether predominant credible information can suppress misinformation for two prevention measures (i.e. topics), i.e. wearing masks and social distancing using tweets collected from February 15 to June 30, 2020.more »We trained Support Vector Machine classifiers to retrieve relevant tweets and classify tweets containing credible information and misinformation for each topic. Based on cross-correlation analyses of credible and misinformation time series for both topics, we find that the previously predominant credible information can lead to the decrease of misinformation (i.e. suppression) with a time lag. The research findings provide empirical evidence for suppressing misinformation with credible information in complex online environments and suggest practical strategies for future information management during crises and emergencies.« less
  2. Social media is being increasingly utilized to spread breaking news and updates during disasters of all magnitudes. Unfortunately, due to the unmoderated nature of social media platforms such as Twitter, rumors and misinformation are able to propagate widely. Given this, a surfeit of research has studied rumor diffusion on social media, especially during natural disasters. In many studies, researchers manually code social media data to further analyze the patterns and diffusion dynamics of users and misinformation. This method requires many human hours, and is prone to significant incorrect classifications if the work is not checked over by another individual. Inmore »our studies, we fill the research gap by applying seven different machine learning algorithms to automatically classify misinformed Twitter data that is spread during disaster events. Due to the unbalanced nature of the data, three different balancing algorithms are also applied and compared. We collect and drive the classifiers with data from the Manchester Arena bombing (2017), Hurricane Harvey (2017), the Hawaiian incoming missile alert (2018), and the East Coast US tsunami alert (2018). Over 20,000 tweets are classified based on the veracity of their content as either true, false, or neutral, with overall accuracies exceeding 89%.« less
  3. The ongoing pandemic has heightened the need for developing tools to flag COVID-19-related misinformation on the internet, specifically on social media such as Twitter. However, due to novel language and the rapid change of information, existing misinformation detection datasets are not effective for evaluating systems designed to detect misinformation on this topic. Misinformation detection can be divided into two sub-tasks: (i) retrieval of misconceptions relevant to posts being checked for veracity, and (ii) stance detection to identify whether the posts Agree, Disagree, or express No Stance towards the retrieved misconceptions. To facilitate research on this task, we release COVIDLies (https://ucinlp.github.io/covid19more »), a dataset of 6761 expert-annotated tweets to evaluate the performance of misinformation detection systems on 86 different pieces of COVID-19 related misinformation. We evaluate existing NLP systems on this dataset, providing initial benchmarks and identifying key challenges for future models to improve upon.« less
  4. As the internet and social media continue to become increasingly used for sharing break- ing news and important updates, it is with great motivation to study the behaviors of online users during crisis events. One of the biggest issues with obtaining information online is the veracity of such content. Given this vulnerability, misinformation becomes a very danger- ous and real threat when spread online. This study investigates misinformation debunking efforts and fills the research gap on cross-platform information sharing when misinforma- tion is spread during disasters. The false rumor “immigration status is checked at shelters” spread in both Hurricane Harveymore »and Hurricane Irma in 2017 and was analyzed in this paper based on a collection of 12,900 tweets. By studying the rumor control efforts made by thousands of accounts, we found that Twitter users respond and interact the most with tweets from verified Twitter accounts, and especially government organizations. Results on sourcing analysis show that the majority of Twitter users who utilize URLs in their post- ings are employing the information in the URLs to help debunk the false rumor. The most frequently cited information comes from news agencies when analyzing both URLs and domains. This paper provides novel insights into rumor control efforts made through social media during natural disasters and also the information sourcing and sharing behaviors that users exhibit during the debunking of false rumors.« less
  5. Struggling to curb misinformation, social media platforms are experimenting with design interventions to enhance consumption of credible news on their platforms. Some of these interventions, such as the use of warning messages, are examples of nudges---a choice-preserving technique to steer behavior. Despite their application, we do not know whether nudges could steer people into making conscious news credibility judgments online and if they do, under what constraints. To answer, we combine nudge techniques with heuristic based information processing to design NudgeCred--a browser extension for Twitter. NudgeCred directs users' attention to two design cues: authority of a source and other users'more »collective opinion on a report by activating three design nudges---Reliable, Questionable, and Unreliable, each denoting particular levels of credibility for news tweets. In a controlled experiment, we found that NudgeCred significantly helped users (n=430) distinguish news tweets' credibility, unrestricted by three behavioral confounds---political ideology, political cynicism, and media skepticism. A five-day field deployment with twelve participants revealed that NudgeCred improved their recognition of news items and attention towards all of our nudges, particularly towards Questionable. Among other considerations, participants proposed that designers should incorporate heuristics that users' would trust. Our work informs nudge-based system design approaches for online media.« less