Social media data have been used to improve geographic situation awareness in the past decade. Although such data are free and openly available, only a small proportion is related to situation awareness, and their reliability or trustworthiness remains a challenge. A credibility framework is proposed for Twitter data in the context of disaster situation awareness. The framework is derived from a principle of crowdsourcing, which states that errors propagated in volunteered information decrease as the number of contributors increases. In the proposed framework, credibility is hierarchically assessed on two tweet levels. The framework was tested using Hurricane Harvey Twitter data, in which situation-awareness-related tweets were extracted using a set of predefined keywords including power, shelter, damage, casualty, and flood. For each tweet, text messages and associated URLs were integrated to enhance information completeness. Events were identified by aggregating tweets based on their topics and spatiotemporal characteristics. Credibility for events was calculated and analyzed against the spatial, temporal, and social impacting scales. This framework has the potential to calculate the evolving credibility in real time, providing users with insight into the most important and trustworthy events.
A System Analytics Framework for Detecting Infrastructure-Related Topics in Disasters Using Social Sensing
The objective of this paper is to propose and test a system analytics framework based on social sensing and text mining to detect topic evolution associated with the performance of infrastructure systems in disasters. Social media platforms such as Twitter, as active channels of communication and information dissemination, provide insights into real-time information and first-hand experience from affected areas in mass emergencies. While existing studies show the importance of social sensing in improving situational awareness and emergency response in disasters, the use of social sensing for detection and analysis of infrastructure systems and their resilience performance has been rather limited. This limitation is due to the lack of frameworks for modeling the evolution of events and topics (e.g., grid interruption and road closure) associated with infrastructure systems (e.g., power, highway, airport, and oil) in times of disasters. The proposed framework detects infrastructure-related topics of the tweets posted in disasters, and their evolution, by integrating relevant-keyword search, text lemmatization, Part-of-Speech (POS) tagging, TF-IDF vectorization, topic modeling using Latent Dirichlet Allocation (LDA), and K-Means clustering. The application of the proposed framework was demonstrated in a study of infrastructure systems in Houston during Hurricane Harvey. In this case study, more than sixty thousand tweets were retrieved from within a 150-mile radius of Houston over 39 days. Topic detection and evolution were analyzed from the user-generated data, and the clusters of tweets pertaining to certain topics were mapped in networks over time. The results show that the proposed framework makes it possible to summarize topics and track how situations evolve in different disaster phases. The analytics elements of the proposed framework can improve the recognition of infrastructure performance through text-based representation and provide evidence for decision-makers to take actionable measures.
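As a rough illustration of the first stages of the pipeline described in the abstract (keyword search followed by TF-IDF vectorization), a minimal standard-library sketch might look like the following. The keyword set is taken from the infrastructure examples named in the abstract, but the tweets, the tokenizer, and the exact TF-IDF weighting (`tf * log(N / df)`) are assumptions made for illustration, not the authors' implementation. Lemmatization, POS tagging, LDA, and K-Means are omitted here.

```python
import math
from collections import Counter

# Keywords echo the infrastructure examples in the abstract; the list is illustrative only.
KEYWORDS = {"power", "highway", "airport", "oil"}

def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation (a toy tokenizer)."""
    tokens = (w.strip(".,!?#:;").lower() for w in text.split())
    return [t for t in tokens if t]

def filter_by_keywords(tweets, keywords):
    """Keep only tweets that contain at least one keyword."""
    return [t for t in tweets if keywords & set(tokenize(t))]

def tfidf(docs):
    """Return one {term: weight} dict per document, weighting by tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical tweets, not drawn from the study's dataset.
tweets = [
    "Power outage downtown, no power since noon",
    "Highway closed due to flooding",
    "Stay safe everyone",
    "Airport reopening tomorrow, power restored",
]
relevant = filter_by_keywords(tweets, KEYWORDS)
vectors = tfidf(relevant)
```

In a fuller pipeline, vectors like these would feed the topic modeling and clustering steps; a term that appears in every retained tweet receives a weight of zero under this weighting.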
- Award ID(s): 1759537
- PAR ID: 10075880
- Date Published:
- Journal Name: 25th International Workshop on Intelligent Computing in Engineering (EG-ICE)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Timely and reliable sensing of infrastructure conditions is critical in disaster management for planning effective infrastructure restorations. Social media, a near real-time information source, has been widely used in disasters for forming timely situational awareness. Yet, using social media to sense electricity infrastructure conditions has not been explored. This study aims to address the research gap by mining public topics from social media. To this end, we propose a systematic and customized approach wherein (1) electricity-related social media data are extracted by a classifier based on Bidirectional Encoder Representations from Transformers (BERT); and (2) public topics are modeled with unigrams, bigrams, and trigrams to incorporate the formulaic expressions of infrastructure conditions in social media. Electricity infrastructures in Florida impacted by Hurricane Irma are studied for illustration and demonstration. Results show that the proposed approach is capable of sensing the temporal evolutions and geographic differences of electricity infrastructure conditions.
-
Background: As a number of vaccines for COVID-19 are given emergency use authorization by local health agencies and are being administered in multiple countries, it is crucial to gain public trust in these vaccines to ensure herd immunity through vaccination. One way to gauge public sentiment regarding vaccines, with the goal of increasing vaccination rates, is by analyzing social media such as Twitter. Objective: The goal of this research was to understand public sentiment toward COVID-19 vaccines by analyzing discussions about the vaccines on social media for a period of 60 days when vaccination began in the United States. Using a combination of topic detection and sentiment analysis, we identified different types of concerns regarding vaccines that were expressed by different groups of the public on social media. Methods: To better understand public sentiment, we collected tweets for exactly 60 days starting from December 16, 2020 that contained hashtags or keywords related to COVID-19 vaccines. We detected and analyzed different topics of discussion in these tweets as well as their emotional content. Vaccine topics were identified by nonnegative matrix factorization, and emotional content was identified using the Valence Aware Dictionary and sEntiment Reasoner sentiment analysis library as well as by using sentence bidirectional encoder representations from transformer embeddings and comparing the embedding to different emotions using cosine similarity. Results: After removing all duplicates and retweets, 7,948,886 tweets were collected during the 60-day time period. Topic modeling resulted in 50 topics; of those, we selected 12 topics with the highest volume of tweets for analysis. Administration and access to vaccines were some of the major concerns of the public. Additionally, we classified the tweets in each topic into 1 of the 5 emotions and found fear to be the leading emotion in the tweets, followed by joy. Conclusions: This research focused not only on negative emotions that may have led to vaccine hesitancy but also on positive emotions toward the vaccine. By identifying both positive and negative emotions, we were able to identify the public's response to the vaccines overall and to news events related to the vaccines. These results are useful for developing plans for disseminating authoritative health information and for better communication to build understanding and trust.
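The cosine-similarity step mentioned in the Methods above can be sketched as follows. This is a minimal illustration under stated assumptions: the two-dimensional vectors are toy stand-ins for real sentence embeddings, and only the two emotions named in the abstract (fear and joy) are included, since the remaining three are not listed there. The function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def closest_emotion(tweet_vec, emotion_vecs):
    """Return the emotion label whose vector is most similar to the tweet vector."""
    return max(emotion_vecs, key=lambda name: cosine_similarity(tweet_vec, emotion_vecs[name]))

# Toy vectors standing in for sentence embeddings (assumption, not real model output).
EMOTIONS = {"fear": [1.0, 0.0], "joy": [0.0, 1.0]}
tweet_embedding = [0.9, 0.3]
label = closest_emotion(tweet_embedding, EMOTIONS)
```

With real embeddings, each emotion vector would come from embedding a word or phrase describing that emotion with the same model used for the tweets, so that the two live in the same vector space.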
-
Climate change has led to a variety of disasters that have caused damage to infrastructure and the economy, with societal impacts on human life. Understanding people’s emotions and stressors during disaster times will enable preparation strategies for mitigating further consequences. In this paper, we mine emotions and stressors encountered by people and shared on Twitter during Hurricane Harvey in 2017 as a showcase. We acquired a dataset of tweets from Twitter on Hurricane Harvey from 20 August 2017 to 30 August 2017. The dataset consists of around 400,000 tweets and is available on Kaggle. Next, a BERT-based model is employed to predict emotions associated with tweets posted by users. Then, natural language processing (NLP) techniques are applied to negative-emotion tweets to explore the trends and prevalence of the topics discussed during the disaster event. Using Latent Dirichlet Allocation (LDA) topic modeling, we identified themes, enabling us to manually extract stressors termed climate-change-related stressors. Results show that 20 climate-change-related stressors were extracted and that emotions peaked during the deadliest phase of the disaster. This indicates that tracking emotions may be a useful approach for studying environmentally determined well-being outcomes in light of understanding climate change impacts.
-
During COVID-19, misinformation on social media affects the adoption of appropriate prevention behaviors. It is urgent to suppress the misinformation to prevent negative public health consequences. Although an array of studies has proposed misinformation suppression strategies, few have investigated the role of predominant credible information during crises. None has examined its effect quantitatively using longitudinal social media data. Therefore, this research investigates the temporal correlations between credible information and misinformation, and whether predominant credible information can suppress misinformation for two prevention measures (i.e., topics), wearing masks and social distancing, using tweets collected from February 15 to June 30, 2020. We trained Support Vector Machine classifiers to retrieve relevant tweets and classify tweets containing credible information and misinformation for each topic. Based on cross-correlation analyses of credible information and misinformation time series for both topics, we find that previously predominant credible information can lead to a decrease in misinformation (i.e., suppression) with a time lag. The research findings provide empirical evidence for suppressing misinformation with credible information in complex online environments and suggest practical strategies for future information management during crises and emergencies.
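A lagged cross-correlation of the kind described above can be sketched with the standard library alone. This is a minimal illustration, assuming daily counts and a Pearson correlation computed at each non-negative lag; the series below are synthetic and constructed so that the second falls two steps after the first rises. The function names are hypothetical and this is not the authors' analysis code.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] for a non-negative lag."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])

def most_negative_lag(x, y, max_lag):
    """Return the lag in 0..max_lag at which the correlation is most negative."""
    return min(range(max_lag + 1), key=lambda k: lagged_correlation(x, y, k))

# Synthetic daily counts (assumption): misinformation mirrors credible volume two days later.
credible = [1, 5, 2, 6, 1, 4, 2, 7, 3, 5]
misinfo = [5, 5] + [10 - c for c in credible[:-2]]

lag = most_negative_lag(credible, misinfo, max_lag=3)
```

A strongly negative correlation at a positive lag is consistent with credible information preceding a drop in misinformation, though correlation alone does not establish that one causes the other.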

