
Title: A Multi-Algorithm Approach for Classifying Misinformed Twitter Data during Crisis Events
Social media is increasingly used to spread breaking news and updates during disasters of all magnitudes. Unfortunately, due to the unmoderated nature of social media platforms such as Twitter, rumors and misinformation can propagate widely. Consequently, a large body of research has studied rumor diffusion on social media, especially during natural disasters. In many studies, researchers manually code social media data to analyze the patterns and diffusion dynamics of users and misinformation. This method requires many human hours and is prone to significant misclassification unless the work is verified by a second coder. In our studies, we fill this research gap by applying seven different machine learning algorithms to automatically classify misinformed Twitter data spread during disaster events. Because the data are unbalanced, three different balancing algorithms are also applied and compared. We collect and train the classifiers with data from the Manchester Arena bombing (2017), Hurricane Harvey (2017), the Hawaiian incoming missile alert (2018), and the East Coast US tsunami alert (2018). Over 20,000 tweets are classified by the veracity of their content as true, false, or neutral, with overall accuracies exceeding 89%.
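The abstract does not name the three balancing algorithms it compares, so as a purely illustrative sketch, the snippet below balances a toy three-class tweet set (the tweets and labels are invented) with random oversampling, one common approach to class imbalance:

```python
import random
from collections import Counter, defaultdict

def random_oversample(samples, labels, seed=0):
    """Balance a labeled dataset by duplicating minority-class
    examples (sampling with replacement) until every class matches
    the majority-class count."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = rng.choices(xs, k=target - len(xs))
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Toy tweet set mirroring the true/false/neutral labels in the paper
tweets = ["t1", "t2", "t3", "t4", "t5", "t6"]
labels = ["true", "true", "true", "false", "false", "neutral"]
bx, by = random_oversample(tweets, labels)
print(Counter(by))  # each class now has 3 examples
```

In practice the balancing step is applied only to the training split; oversampling before the train/test split would leak duplicated examples into evaluation.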
Award ID(s):
1762807 1760586
NSF-PAR ID:
10096628
Journal Name:
Proceedings of the 2019 IISE Annual Conference
Sponsoring Org:
National Science Foundation
More Like this
  1. Introduction Social media has created opportunities for children to gather social support online (Blackwell et al., 2016; Gonzales, 2017; Jackson, Bailey, & Foucault Welles, 2018; Khasawneh, Rogers, Bertrand, Madathil, & Gramopadhye, 2019; Ponathil, Agnisarman, Khasawneh, Narasimha, & Madathil, 2017). However, social media also has the potential to expose children and adolescents to undesirable behaviors. Research has shown that social media can be used to harass, discriminate (Fritz & Gonzales, 2018), dox (Wood, Rose, & Thompson, 2018), and socially disenfranchise children (Page, Wisniewski, Knijnenburg, & Namara, 2018). Other research proposes that social media use might be correlated with the significant increase in suicide rates and depressive symptoms among children and adolescents over the past ten years (Mitchell, Wells, Priebe, & Ybarra, 2014). Evidence-based research suggests that suicidal and unwanted behaviors can be promulgated through social contagion effects, which model, normalize, and reinforce self-harming behavior (Hilton, 2017). These harmful behaviors and social contagion effects may occur more frequently through repetitive exposure and modelling via social media, especially when such content goes "viral" (Hilton, 2017). One example of viral self-harming behavior that has generated significant media attention is the Blue Whale Challenge (BWC). The hearsay about this challenge is that individuals of all ages are persuaded to participate in self-harm and eventually kill themselves (Mukhra, Baryah, Krishan, & Kanchan, 2017). Research is needed specifically concerning BWC's ethical concerns, the effects the game may have on teenagers, and potential governmental interventions. To address this gap in the literature, the current study uses qualitative and content analysis research techniques to illustrate the risk of self-harm and suicide contagion through the portrayal of BWC in YouTube and Twitter posts.
The purpose of this study is to analyze the portrayal of BWC on YouTube and Twitter in order to identify the themes presented in YouTube and Twitter posts that share and discuss BWC. In addition, we explore the extent to which YouTube videos comply with the safe and effective suicide messaging guidelines proposed by the Suicide Prevention Resource Center (SPRC). Method Two social media websites were used to gather the data: 60 videos and 1,112 comments from YouTube and 150 posts from Twitter. The common themes of the YouTube videos, the comments on those videos, and the Twitter posts were identified using grounded, thematic content analysis on the collected data (Padgett, 2001). Three codebooks were built, one for each type of data. The data for each site were analyzed, and the common themes were identified. A deductive coding analysis was conducted on the YouTube videos based on the nine SPRC safe and effective messaging guidelines (Suicide Prevention Resource Center, 2006). The analysis explored the number of videos that violated these guidelines and which guidelines were violated most often. The inter-rater reliabilities between the coders ranged from 0.61 to 0.81 based on Cohen's kappa. The coders then conducted consensus coding. Results & Findings Three common themes were identified across the posts from the social media platforms included in this study. The first theme included posts where social media users were trying to raise awareness and warn parents about this dangerous phenomenon in order to reduce the risk of any potential participation in BWC. This was the most common theme in the videos and posts. Additionally, the posts claimed that more than 100 people worldwide have played BWC and provided detailed descriptions of what each individual did while playing the game. These videos also described the tasks and different names of the game.
Only a few videos provided recommendations to teenagers who might be playing or thinking of playing the game, and fewer still mentioned that the provided statistics were not confirmed by reliable sources. The second theme included posts from people who either criticized the teenagers who participated in BWC or mocked them, for one of two reasons: they agreed with the purported purpose of BWC of "cleaning the society of people with mental issues," or they misunderstood why teenagers participate in these kinds of challenges, thinking, for example, that they mainly participate due to peer pressure or to "show off". The last theme we identified was that most of these users tend to speak in detail about someone who has already participated in BWC. These videos and posts provided information about the participants' demographics and interviews with their parents or acquaintances, who provided further details about the participants' personal lives. The evaluation of the videos based on the SPRC safe messaging guidelines showed that 37% of the YouTube videos met fewer than 3 of the 9 safe messaging guidelines. Around 50% of them met only 4 to 6 of the guidelines, while the remaining 13% met 7 or more. Discussion This study is the first to systematically investigate the quality, portrayal, and reach of BWC on social media. Based on our findings from the emerging themes and the evaluation against the SPRC safe messaging guidelines, we suggest that these videos could contribute to the spread of these deadly challenges (or of suicide in general, since the game might be a hoax) instead of raising awareness. This conclusion parallels findings from similar studies of the portrayal of suicide in traditional media (Fekete & Macsai, 1990; Fekete & Schmidtke, 1995). Most posts on social media romanticized people who have died by following this challenge, and younger vulnerable teens may see the victims as role models, leading them to end their lives in the same way (Fekete & Schmidtke, 1995).
The videos presented statistics about the number of suicides believed to be related to this challenge in a way that made suicide seem common (Cialdini, 2003). In addition, the videos presented extensive personal information about the people who died by suicide while playing BWC. These videos also provided detailed descriptions of the final task, including pictures of self-harm, material that may encourage vulnerable teens to consider ending their lives and provide them with methods for doing so (Fekete & Macsai, 1990). At the same time, these videos failed both to emphasize prevention by highlighting effective treatments for mental health problems and to encourage teenagers with mental health problems to seek help or to provide information on where to find it. YouTube and Twitter are capable of influencing a large number of teenagers (Khasawneh, Ponathil, Firat Ozkan, & Chalil Madathil, 2018; Pater & Mynatt, 2017). We suggest that it is urgent to monitor social media posts related to BWC and similar self-harm challenges (e.g., the Momo Challenge). Additionally, the SPRC should properly educate social media users, particularly those with more influence (e.g., celebrities), on elements that boost negative contagion effects. While some doubt the veracity of these challenges, posting about them in unsafe ways can contribute to contagion regardless of the challenges' true nature.
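The inter-rater reliabilities reported above (Cohen's kappa of 0.61 to 0.81) can be reproduced directly from two coders' label sequences. A minimal sketch with hypothetical labels, not the study's data:

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement between two raters,
    corrected for the agreement expected by chance."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: sum over labels of the product of each
    # rater's marginal label frequencies
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to six posts
a = ["warn", "warn", "mock", "detail", "warn", "mock"]
b = ["warn", "mock", "mock", "detail", "warn", "warn"]
print(round(cohens_kappa(a, b), 2))  # → 0.45
```

Values near the study's 0.61 to 0.81 range are conventionally read as substantial agreement, which is why the coders followed up with consensus coding.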
  2. As the internet and social media continue to be increasingly used for sharing breaking news and important updates, there is strong motivation to study the behaviors of online users during crisis events. One of the biggest issues with obtaining information online is the veracity of such content. Given this vulnerability, misinformation becomes a very dangerous and real threat when spread online. This study investigates misinformation debunking efforts and fills the research gap on cross-platform information sharing when misinformation is spread during disasters. The false rumor "immigration status is checked at shelters" spread during both Hurricane Harvey and Hurricane Irma in 2017 and was analyzed in this paper based on a collection of 12,900 tweets. By studying the rumor control efforts made by thousands of accounts, we found that Twitter users respond and interact the most with tweets from verified Twitter accounts, especially government organizations. Results of the sourcing analysis show that the majority of Twitter users who include URLs in their postings employ the information in those URLs to help debunk the false rumor. The most frequently cited information comes from news agencies, for both URLs and domains. This paper provides novel insights into rumor control efforts made through social media during natural disasters, as well as the information sourcing and sharing behaviors that users exhibit during the debunking of false rumors.
  3. Social media platforms are accused repeatedly of creating environments in which women are bullied and harassed. We argue that online aggression toward women aims to reinforce traditional feminine norms and stereotypes. In a mixed methods study, we find that this type of aggression on Twitter is common and extensive and that it can spread far beyond the original target. We locate over 2.9 million tweets in one week that contain instances of gendered insults (e.g., "bitch," "cunt," "slut," or "whore"), averaging 419,000 sexist slurs per day. The vast majority of these tweets are negative in sentiment. We analyze the social networks of the conversations that ensue in several cases and demonstrate how the use of "replies," "retweets," and "likes" can further victimize a target. Additionally, we develop a sentiment classifier that we use in a regression analysis to compare the negativity of sexist messages. We find that words in a message that reinforce feminine stereotypes inflate the negative sentiment of tweets to a significant and sizeable degree. These terms include those insulting someone's appearance (e.g., "ugly"), intellect (e.g., "stupid"), sexual experience (e.g., "promiscuous"), mental stability (e.g., "crazy"), and age (e.g., "old"). Messages enforcing beauty norms tend to be particularly negative. In sum, hostile, sexist tweets are strategic in nature. They aim to promote traditional, cultural beliefs about femininity, such as beauty ideals, and they shame victims by accusing them of falling short of these standards. Harassment on social media constitutes an everyday, routine occurrence, with researchers finding 9,764,583 messages referencing bullying on Twitter over the span of two years (Bellmore et al. 2015). In other words, Twitter users post over 13,000 bullying-related messages on a daily basis. Forms of online aggression also carry with them serious, negative consequences.
Research repeatedly documents that bullying victims suffer from a host of deleterious outcomes, such as low self-esteem (Hinduja and Patchin 2010), emotional and psychological distress (Ybarra et al. 2006), and negative emotions (Faris and Felmlee 2014; Juvonen and Gross 2008). Compared to those who have not been attacked, victims also tend to report more incidents of suicide ideation and attempted suicide (Hinduja and Patchin 2010). Several studies document that the targets of cyberbullying are disproportionately women (Backe et al. 2018; Felmlee and Faris 2016; Hinduja and Patchin 2010; Pew Research Center 2017), although there are exceptions depending on definitions and venues. Yet we know little about the content or pattern of cyber aggression directed toward women in online forums. The purpose of the present research, therefore, is to examine in detail the practice of aggressive messaging that targets women and femininity within the social media venue of Twitter. Using both qualitative and quantitative analyses, we investigate the role of gender norm regulation in these patterns of cyber aggression.
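As an aside on the regression described above: with a single binary predictor (a tweet either contains a stereotype-reinforcing term or it does not), the OLS slope reduces to the difference in the groups' mean negativity. A toy sketch with invented scores, not the study's classifier or data:

```python
def ols_slope(x, y):
    """OLS slope of y on x; for a binary x this equals the
    difference in the group means of y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var

# Hypothetical negativity scores for tweets with (1) and
# without (0) a stereotype-reinforcing term such as "ugly"
has_term = [1, 1, 1, 0, 0, 0]
negativity = [0.9, 0.8, 0.7, 0.4, 0.3, 0.5]
print(round(ols_slope(has_term, negativity), 2))  # prints 0.4
```

A positive slope corresponds to the paper's finding that stereotype-reinforcing terms inflate a tweet's negative sentiment; the actual analysis would control for additional covariates.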
  4. The 2030 Global Sustainable Development Agenda of the United Nations highlighted the critical importance of understanding the integrated nature of enhancing infrastructure resilience and facilitating social equity. Social equity is defined as equal opportunities provided to different people by infrastructure; it addresses disparities and the unequal distribution of goods, services, and amenities. Infrastructure resilience is defined as the ability of infrastructure to withstand, adapt to, and quickly recover from disasters. Existing research shows that infrastructure resilience and social equity are closely related to each other. However, there is a lack of research that explicitly examines the complex relationships between infrastructure resilience and social equity. To address this gap, this study aims to examine such interrelationships using social media data. Social media data are increasingly used by researchers and have proven to be a reliable source of valuable information for understanding human activities and behaviors in a disaster setting. The spatiotemporal distribution of disaster-related messages helps with real-time, rapid assessment of the impact of disasters on infrastructure and human society across different regions. Using social media data also offers the advantage of saving time and cost compared to traditional data collection methods. As a first step of this study, this paper presents our work on collecting and analyzing Twitter activity during 2018 Hurricane Michael in the disaster-affected counties of the Florida Panhandle area. The collected Twitter data was organized based on the geolocations of affected counties and compared against the infrastructure resilience and social equity data of those counties.
The results of the analysis indicate that (1) Twitter activities can be used as an important indicator of infrastructure resilience conditions, (2) socially vulnerable populations are not as active as general populations on social media in a disaster setting, and (3) vulnerable populations require a longer time for disaster recovery.
  5. The global spread of the novel coronavirus is affected by the spread of related misinformation, the so-called COVID-19 Infodemic, which makes populations more vulnerable to the disease through resistance to mitigation efforts. Here, we analyze the prevalence and diffusion of links to low-credibility content about the pandemic across two major social media platforms, Twitter and Facebook. We characterize cross-platform similarities and differences in popular sources, diffusion patterns, influencers, coordination, and automation. Comparing the two platforms, we find divergence among the prevalence of popular low-credibility sources and suspicious videos. A minority of accounts and pages exert a strong influence on each platform. These misinformation "superspreaders" are often associated with the low-credibility sources and tend to be verified by the platforms. On both platforms, there is evidence of coordinated sharing of Infodemic content. The overt nature of this manipulation points to the need for societal-level solutions in addition to mitigation strategies within the platforms. However, we highlight limits imposed by inconsistent data-access policies on our capability to study harmful manipulations of information ecosystems.