skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, July 12 until 9:00 AM ET on Saturday, July 13 due to maintenance. We apologize for the inconvenience.

Title: Assessing the stability of Tweet corpora for hurricane events over time: A mixed methods approach
When natural disasters occur, various organizations and agencies turn to social media to understand who needs help and how they have been affected. The purpose of this study is twofold: first, to evaluate whether hurricane-related tweets have some consistency over time, and second, whether Twitter-derived content is thematically similar to other private social media data. Through a unique method of using Twitter data gathered from six different hurricanes, alongside private data collected from qualitative interviews conducted in the immediate aftermath of Hurricane Harvey, we hypothesize that there is some level of stability across hurricane-related tweet content over time that could be used for better real-time processing of social media data during natural disasters. We use latent Dirichlet allocation (LDA) to derive topics, and, using Hellinger distance as a metric, find that there is a detectable connection among hurricane topics. By uncovering some persistent thematic areas and topics in disaster-related tweets, we hope these findings can help first responders and government agencies discover urgent content in tweets more quickly and reduce the amount of human intervention needed.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Social media + society
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Social media has been increasingly utilized to spread breaking news and risk communications during disasters of all magnitudes. Unfortunately, due to the unmoderated nature of social media platforms such as Twitter, rumors and misinformation are able to propagate widely. Given this, a surfeit of research has studied false rumor diffusion on Twitter, especially during natural disasters. Within this domain, studies have also focused on the misinformation control efforts from government organizations and other major agencies. A prodigious gap in research exists in studying the monitoring of misinformation on social media platforms in times of disasters and other crisis events. Such studies would offer organizations and agencies new tools and ideologies to monitor misinformation on platforms such as Twitter, and make informed decisions on whether or not to use their resources in order to debunk. In this work, we fill the research gap by developing a machine learning framework to predict the veracity of tweets that are spread during crisis events. The tweets are tracked based on the veracity of their content as either true, false, or neutral. We conduct four separate studies, and the results suggest that our framework is capable of tracking multiple cases of misinformation simultaneously, with scores exceeding 87%. In the case of tracking a single case of misinformation, our framework reaches an score of 83%. We collect and drive the algorithms with 15,952 misinformation‐related tweets from the Boston Marathon bombing (2013), Manchester Arena bombing (2017), Hurricane Harvey (2017), Hurricane Irma (2017), and the Hawaii ballistic missile false alert (2018). This article provides novel insights on how to efficiently monitor misinformation that is spread during disasters.

    more » « less
  2. As the internet and social media continue to become increasingly used for sharing break- ing news and important updates, it is with great motivation to study the behaviors of online users during crisis events. One of the biggest issues with obtaining information online is the veracity of such content. Given this vulnerability, misinformation becomes a very danger- ous and real threat when spread online. This study investigates misinformation debunking efforts and fills the research gap on cross-platform information sharing when misinforma- tion is spread during disasters. The false rumor “immigration status is checked at shelters” spread in both Hurricane Harvey and Hurricane Irma in 2017 and was analyzed in this paper based on a collection of 12,900 tweets. By studying the rumor control efforts made by thousands of accounts, we found that Twitter users respond and interact the most with tweets from verified Twitter accounts, and especially government organizations. Results on sourcing analysis show that the majority of Twitter users who utilize URLs in their post- ings are employing the information in the URLs to help debunk the false rumor. The most frequently cited information comes from news agencies when analyzing both URLs and domains. This paper provides novel insights into rumor control efforts made through social media during natural disasters and also the information sourcing and sharing behaviors that users exhibit during the debunking of false rumors. 
    more » « less
  3. Timely delivery of the right information to the right first responders can help improve the outcomes of their efforts and save lives. With social media communications (Twitter, Facebook, etc.) being increasingly used to send and get information during disasters, forwarding them to the right first responders in a timely manner can be very helpful. We use Natural Language Processing and Machine Learning, to steer the social media posts to the most appropriate first responder.An important goal is to retrieve and deliver only the critical, actionable information to first responders in real-time. We examine the overall pipeline starting from retrieving tweets from the social media platforms, to their classification, and dissemination to first responders.We propose improvements in the area of data retrieval, relevance prediction and prioritizing information sent to the first responders by fusing NLP and ML classification techniques thus improving emergency response. We demonstrate the effectiveness of our proposed approach in retrieving and extracting 37,295 actionable tweets related to the IDA hurricane that occurred in the US in Aug.–Sep, 2021. 
    more » « less
  4. Social media is being increasingly utilized to spread breaking news and updates during disasters of all magnitudes. Unfortunately, due to the unmoderated nature of social media platforms such as Twitter, rumors and misinformation are able to propagate widely. Given this, a surfeit of research has studied rumor diffusion on social media, especially during natural disasters. In many studies, researchers manually code social media data to further analyze the patterns and diffusion dynamics of users and misinformation. This method requires many human hours, and is prone to significant incorrect classifications if the work is not checked over by another individual. In our studies, we fill the research gap by applying seven different machine learning algorithms to automatically classify misinformed Twitter data that is spread during disaster events. Due to the unbalanced nature of the data, three different balancing algorithms are also applied and compared. We collect and drive the classifiers with data from the Manchester Arena bombing (2017), Hurricane Harvey (2017), the Hawaiian incoming missile alert (2018), and the East Coast US tsunami alert (2018). Over 20,000 tweets are classified based on the veracity of their content as either true, false, or neutral, with overall accuracies exceeding 89%. 
    more » « less
  5. The objective of this paper is to propose and test a system analytics framework based on social sensing and text mining to detect topic evolution associated with the performance of infrastructure systems in disasters. Social media, like Twitter, as active channels of communication and information dissemination, provide insights into real-time information and first-hand experience from affected areas in mass emergencies. While the existing studies show the importance of social sensing in improving situational awareness and emergency response in disasters, the use of social sensing for detection and analysis of infrastructure systems and their resilience performance has been rather limited. This limitation is due to the lack of frameworks to model the events and topics (e.g., grid interruption and road closure) evolution associated with infrastructure systems (e.g., power, highway, airport, and oil) in times of disasters. The proposed framework detects infrastructure-related topics of the tweets posted in disasters and their evolutions by integrating searching relevant keywords, text lemmatization, Part-of-Speech (POS) tagging, TF-IDF vectorization, topic modeling by using Latent Dirichlet Allocation (LDA), and K-Means clustering. The application of the proposed framework was demonstrated in a study of infrastructure systems in Houston during Hurricane Harvey. In this case study, more than sixty thousand tweets were retrieved from 150-mile radius in Houston over 39 days. The analysis of topic detection and evolution from user-generated data were conducted, and the clusters of tweets pertaining to certain topics were mapped in networks over time. The results show that the proposed framework enables to summarize topics and track the movement of situations in different disaster phases. The analytics elements of the proposed framework can improve the recognition of infrastructure performance through text-based representation and provide evidence for decision-makers to take actionable measurements. 
    more » « less