Title: Multimodal Semi-supervised Learning for Disaster Tweet Classification
During natural disasters, people often use social media platforms, such as Twitter, to post information about casualties and damage caused by disasters. This information can help relief authorities gain situational awareness in near real time and enable them to quickly distribute resources where they are most needed. However, annotating data for this purpose can be burdensome, subjective, and expensive. In this paper, we investigate how to leverage the copious amounts of unlabeled data generated on social media by eyewitnesses and affected individuals during disaster events. To this end, we propose a semi-supervised learning approach to improve the performance of neural models on several multimodal disaster tweet classification tasks. Our approach yields significant gains: up to 7.7% in F1 in low-data regimes and 1.9% when using the entire training data. We make our code and data publicly available at https://github.com/iustinsirbu13/multimodal-ssl-for-disaster-tweet-classification.
Award ID(s):
1741345
PAR ID:
10388181
Author(s) / Creator(s):
Date Published:
Journal Name:
The 29th International Conference on Computational Linguistics (COLING 2022)
Page Range / eLocation ID:
2711–2723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Radianti, Jaziar; Dokas, Ioannis; Lalone, Nicolas; Khazanchi, Deepak (Ed.)
    Real-time information about natural disasters shared on social media platforms like Twitter and Facebook plays a critical role in informing volunteers, emergency managers, and response organizations. However, supervised learning models for monitoring disaster events require large amounts of annotated data, making them unrealistic for real-time use in disaster events. To address this challenge, we present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting, where only a small amount of annotated data is required. Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using little labeled data and large amounts of unlabeled data, mimicking the early stage of a disaster. By integrating effective semi-supervised learning ideas and incorporating TextMixUp, CrisisMatch achieves an average performance improvement of 11.2% on two disaster datasets. Further analyses cover the influence of the amount of labeled data and out-of-domain results.
  2. Firms’ public communication on social media during disasters can benefit both disaster response efficiency and the perception of the corporate image. Despite its importance, limited guidelines are available to inform firms’ disaster communication strategies. The current study examines firms’ communication on social media in various disasters and how it impacts public engagement. We employ a novel natural language processing (NLP) approach, Semantic Projection with Active Retrieval (SPAR), to analyze Facebook posts made by Russell 3000 firms between 2009 and 2022 concerning various disasters. We show that firm communication can be measured based on two dimensions derived from the Competing Values Framework (CVF): internal versus external and stable versus flexible. We find that social media messages that emphasize operational continuity (internal/stable-oriented) are more popular during biological disasters. By contrast, messages that stress innovations and adaptations to disasters (external/flexible-oriented) elicit more engagement in weather-related disasters. The study offers a framework to characterize and guide firms’ design of disaster communication on social media in different disaster contexts. Our SPAR method is also available to firms to analyze their social media data and uncover the underlying patterns in communication across different contexts. 
  3. Global social media use during natural disasters has been well documented (Murthy et al., 2017). In the U.S., public social media platforms are often a primary venue for those affected by disasters. Some disaster victims believe first responders will see their public posts and that the 9-1-1 telephone system becomes overloaded during crises. Moreover, some feel that the accuracy and utility of information on social media are likely higher than those of traditional media sources. However, sifting through content during a disaster is often difficult due to the high volume of ‘non-relevant’ content. In addition, text is studied more than images posted on Twitter, leaving a potential gap in understanding disaster experiences. Images posted on social media during disasters have a high level of complexity (Murthy et al., 2016). Our study responds to O’Neal et al.’s (2017) call-to-action that social media images posted during disasters should be studied using machine learning.
  4. This article seeks to go beyond traditional GIS methods for creating disaster response maps, which commonly focus on the disaster extent. Instead, it uses social media data collected from Twitter to explore how people communicate during disaster events, how online communities form and evolve, and how communication methods can improve. This study collected Twitter data during the 2015 Nepal earthquake and applied a spatiotemporal analysis to find patterns that reveal shadows or gaps in local communities’ communication channels. Linkages in social media can be used to understand how people communicate, how quickly they diffuse information, and how social networks form online during disasters. These can improve communication throughout disaster phases. This study offers a deeper understanding of the kinds of spatiotemporal patterns and spatial social networks that can be observed during disaster events. Better communication during disaster events is imperative for better disaster management, increasing community resilience, and saving lives.
  5. During disasters, it is critical to deliver emergency information to appropriate first responders. Name-based information delivery provides efficient, timely dissemination of relevant content to first responder teams assigned to different incident response roles. People increasingly depend on social media for communicating vital information, using free-form text. Thus, a method that delivers these social media posts to the right first responders can significantly improve outcomes. In this paper, we propose FLARE, a framework using 'Social Media Engines' (SMEs) to map social media posts (SMPs), such as tweets, to the right names. SMEs perform natural language processing-based classification and exploit several machine learning capabilities, in an online real-time manner. To reduce the manual labeling effort required for learning during the disaster, we leverage active learning, complemented by dispatchers with specific domain-knowledge performing limited labeling. We also leverage federated learning across various public-safety departments with specialized knowledge to handle notifications related to their roles in a cooperative manner. We implement three different classifiers: for incident relevance, organization, and fine-grained role prediction. Each class is associated with a specific subset of the namespace graph. The novelty of our system is the integration of the namespace with federated active learning and inference procedures to identify and deliver vital SMPs to the right first responders in a distributed multi-organization environment, in real-time. Our experiments using real-world data, including tweets generated by citizens during the wildfires in California in 2018, show our approach outperforming both a simple keyword-based classification and several existing NLP-based classification techniques.