Title: Social Media as an Alternative to Surveys of Opinions About the Economy
There is interest in using social media content to supplement or even substitute for survey data. In one of the first studies to test the feasibility of this idea, O’Connor, Balasubramanyan, Routledge, and Smith report reasonably high correlations between the sentiment of tweets containing the word “jobs” and survey-based measures of consumer confidence in 2008–2009. Other researchers report a similar relationship through 2011, but after that time it is no longer observed, suggesting such tweets may not be as promising an alternative to survey responses as originally hoped. It is possible, however, that with the right analytic techniques the sentiment of “jobs” tweets might still be an acceptable alternative. To explore this, we first classify “jobs” tweets into categories whose content is either related to employment or not, to see whether the sentiment of the former correlates more highly with a survey-based measure of consumer sentiment. We then compare the relationship when sentiment is determined with traditional dictionary-based methods versus newer machine learning-based tools developed for Twitter-like texts. We also calculate daily sentiment in three different ways and use a measure of association that is less sensitive to outliers than correlation. None of these approaches improves the size of the relationship in the original or more recent data. We find that the many micro-decisions these analyses require, such as the size of the smoothing interval and the length of the lag between the two series, can significantly affect the outcomes. In the end, despite the earlier promise of tweets as an alternative to survey responses, we find no evidence that the original relationship in these data was more than a chance occurrence.
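The sensitivity to smoothing interval and lag described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two series are made-up toy numbers, the trailing moving average and the choice of Spearman rank correlation as the outlier-resistant measure are stand-ins (the abstract does not name the measure the authors used), and the function names are hypothetical.

```python
from statistics import mean


def moving_average(values, window):
    """Trailing moving average; the first window - 1 days are dropped."""
    return [mean(values[i - window + 1 : i + 1])
            for i in range(window - 1, len(values))]


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def lagged_association(sentiment, survey, window, lag):
    """Smooth sentiment, then pair sentiment on day t with survey on day t + lag."""
    smoothed = moving_average(sentiment, window)
    aligned_survey = survey[window - 1:]  # same days as the smoothed series
    if lag > 0:
        smoothed, aligned_survey = smoothed[:-lag], aligned_survey[lag:]
    return spearman(smoothed, aligned_survey)


# Toy data (hypothetical): daily tweet sentiment and a daily survey index.
sentiment = [0.2, 0.5, 0.1, 0.4, 0.6, 0.3, 0.7, 0.5, 0.8, 0.6, 0.9, 0.4, 0.7, 0.8]
survey = [60, 61, 59, 62, 63, 62, 64, 66, 65, 67, 68, 66, 69, 70]

# The same two series give a different association for each analytic choice.
for window in (1, 3, 5):
    for lag in (0, 2):
        rho = lagged_association(sentiment, survey, window, lag)
        print(f"window={window} lag={lag} spearman={rho:.3f}")
```

Each combination of window and lag yields its own coefficient from identical inputs, which is why reporting only the best-looking combination can overstate a relationship.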
Award ID(s):
1646108
PAR ID:
10285325
Journal Name:
Social Science Computer Review
Volume:
39
Issue:
4
ISSN:
0894-4393
Page Range / eLocation ID:
489 to 508
Sponsoring Org:
National Science Foundation
More Like this
  1. An important means of disseminating information on social media platforms is including URLs that point to external sources in user posts. On Twitter, we estimate that about 21% of the daily stream of English-language tweets contain URLs. We notice that NLP tools make little attempt at understanding the relationship between the content of the URL and the text surrounding it in a tweet. In this work, we study the structure of tweets with URLs relative to the content of the Web documents pointed to by the URLs. We identify several segment classes that may appear in a tweet with URLs, such as the title of a Web page and the user's original content. Our goals in this paper are to introduce, define, and analyze the segmentation problem of tweets with URLs, to develop an effective algorithm to solve it, and to show that our solution can benefit sentiment analysis on Twitter. We also show that the problem is an instance of the block edit distance problem, and thus an NP-hard problem.
  2. Background: As a number of vaccines for COVID-19 are given emergency use authorization by local health agencies and are being administered in multiple countries, it is crucial to gain public trust in these vaccines to ensure herd immunity through vaccination. One way to gauge public sentiment regarding vaccines, with the goal of increasing vaccination rates, is to analyze social media such as Twitter. Objective: The goal of this research was to understand public sentiment toward COVID-19 vaccines by analyzing discussions about the vaccines on social media over a 60-day period beginning when vaccinations started in the United States. Using a combination of topic detection and sentiment analysis, we identified different types of concerns regarding vaccines that were expressed by different groups of the public on social media. Methods: To better understand public sentiment, we collected tweets for exactly 60 days starting from December 16, 2020 that contained hashtags or keywords related to COVID-19 vaccines. We detected and analyzed different topics of discussion in these tweets as well as their emotional content. Vaccine topics were identified by nonnegative matrix factorization, and emotional content was identified using the Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis library as well as by using sentence bidirectional encoder representations from transformers embeddings and comparing each embedding to different emotions using cosine similarity. Results: After removing all duplicates and retweets, 7,948,886 tweets were collected during the 60-day time period. Topic modeling resulted in 50 topics; of those, we selected the 12 topics with the highest volume of tweets for analysis. Administration of and access to vaccines were among the major concerns of the public. Additionally, we classified the tweets in each topic into 1 of 5 emotions and found fear to be the leading emotion in the tweets, followed by joy. Conclusions: This research focused not only on negative emotions that may have led to vaccine hesitancy but also on positive emotions toward the vaccine. By identifying both positive and negative emotions, we were able to identify the public's response to the vaccines overall and to news events related to the vaccines. These results are useful for developing plans for disseminating authoritative health information and for better communication to build understanding and trust.
  3. In this paper, we propose and apply a method to analyze the activeness of an event based on related tweets. The method characterizes and measures the activeness of an event by a set of indicators. The indicators proposed in this paper are original tweet count, retweet count, follower count, positive sentiment, negative sentiment, daily change in user count, and sparseness of the user community. We present procedures to compute the last two indicators. All indicators are used collectively to determine the activeness of an event. This approach is used to analyze tweets related to the Syrian refugee crisis. Its generality is demonstrated by applying it to “immigration”-related tweets.
  4. Social media platforms are frequently used to share information and opinions around vaccinations. The more often a message is reshared, the wider the reach of the message and potential influence it may have on shaping people’s opinions to get vaccinated or not. We used a negative binomial regression to investigate whether a message’s linguistic characteristics (degree of concreteness, emotional arousal, and sentiment) and user characteristics (political ideology and number of followers) may influence users’ decisions to reshare tweets related to the COVID-19 vaccine. We analyzed US English-language tweets related to the COVID-19 vaccine between May 2020 and October 2021 (N = 236,054).

    Tweets with positive and high-arousal words were more often retweeted than negative, low-arousal tweets. Tweets with abstract words were more often retweeted than tweets with concrete words. In addition, while Liberal users were more likely to have tweets with a positive sentiment reshared, Conservative users were more likely to have tweets with a negative sentiment reshared. Our results can inform public health messaging on how to best phrase vaccine information to impact engagement and information resharing, and potentially persuade a wider set of people to get vaccinated.

     
  5. Social media platforms are accused repeatedly of creating environments in which women are bullied and harassed. We argue that online aggression toward women aims to reinforce traditional feminine norms and stereotypes. In a mixed methods study, we find that this type of aggression on Twitter is common and extensive and that it can spread far beyond the original target. We locate over 2.9 million tweets in one week that contain instances of gendered insults (e.g., “bitch,” “cunt,” “slut,” or “whore”)—averaging 419,000 sexist slurs per day. The vast majority of these tweets are negative in sentiment. We analyze the social networks of the conversations that ensue in several cases and demonstrate how the use of “replies,” “retweets,” and “likes” can further victimize a target. Additionally, we develop a sentiment classifier that we use in a regression analysis to compare the negativity of sexist messages. We find that words in a message that reinforce feminine stereotypes inflate the negative sentiment of tweets to a significant and sizeable degree. These terms include those insulting someone’s appearance (e.g., “ugly”), intellect (e.g., “stupid”), sexual experience (e.g., “promiscuous”), mental stability (e.g., “crazy”), and age (“old”). Messages enforcing beauty norms tend to be particularly negative. In sum, hostile, sexist tweets are strategic in nature. They aim to promote traditional, cultural beliefs about femininity, such as beauty ideals, and they shame victims by accusing them of falling short of these standards. Harassment on social media constitutes an everyday, routine occurrence, with researchers finding 9,764,583 messages referencing bullying on Twitter over the span of two years (Bellmore et al. 2015). In other words, Twitter users post over 13,000 bullying-related messages on a daily basis. Forms of online aggression also carry with them serious, negative consequences. 
Repeated research documents that bullying victims suffer from a host of deleterious outcomes, such as low self-esteem (Hinduja and Patchin 2010), emotional and psychological distress (Ybarra et al. 2006), and negative emotions (Faris and Felmlee 2014; Juvonen and Gross 2008). Compared to those who have not been attacked, victims also tend to report more incidents of suicide ideation and attempted suicide (Hinduja and Patchin 2010). Several studies document that the targets of cyberbullying are disproportionately women (Backe et al. 2018; Felmlee and Faris 2016; Hinduja and Patchin 2010; Pew Research Center 2017), although there are exceptions depending on definitions and venues. Yet, we know little about the content or pattern of cyber aggression directed toward women in online forums. The purpose of the present research, therefore, is to examine in detail the practice of aggressive messaging that targets women and femininity within the social media venue of Twitter. Using both qualitative and quantitative analyses, we investigate the role of gender norm regulation in these patterns of cyber aggression. 