
Title: Topics and Sentiments of Public Concerns Regarding COVID-19 Vaccines: Social Media Trend Analysis
Background As a number of vaccines for COVID-19 are given emergency use authorization by local health agencies and are being administered in multiple countries, it is crucial to gain public trust in these vaccines to ensure herd immunity through vaccination. One way to gauge public sentiment regarding vaccines, with the goal of increasing vaccination rates, is to analyze social media such as Twitter.
Objective The goal of this research was to understand public sentiment toward COVID-19 vaccines by analyzing discussions about the vaccines on social media over the 60 days after vaccination began in the United States. Using a combination of topic detection and sentiment analysis, we identified different types of concerns regarding vaccines that were expressed by different groups of the public on social media.
Methods To better understand public sentiment, we collected tweets for exactly 60 days starting from December 16, 2020 that contained hashtags or keywords related to COVID-19 vaccines. We detected and analyzed the topics of discussion in these tweets as well as their emotional content. Vaccine topics were identified by nonnegative matrix factorization, and emotional content was identified using the Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis library as well as by using sentence bidirectional encoder representations from transformers embeddings and comparing each embedding to different emotions using cosine similarity.
Results After removing all duplicates and retweets, 7,948,886 tweets remained from the 60-day collection period. Topic modeling resulted in 50 topics; of those, we selected the 12 topics with the highest volume of tweets for analysis. Administration of and access to vaccines were among the major concerns of the public. Additionally, we classified the tweets in each topic into 1 of 5 emotions and found fear to be the leading emotion in the tweets, followed by joy.
Conclusions This research focused not only on negative emotions that may have led to vaccine hesitancy but also on positive emotions toward the vaccine. By identifying both positive and negative emotions, we were able to identify the public's response to the vaccines overall and to news events related to the vaccines. These results are useful for developing plans for disseminating authoritative health information and for better communication to build understanding and trust.
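The emotion-matching step described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the three-dimensional vectors below are made-up stand-ins for real sentence embeddings, the names `cosine_similarity`, `classify_emotion`, and `EMOTION_VECTORS` are hypothetical, and only the two emotions named in the abstract (fear and joy) are included because the other three are not listed there.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def classify_emotion(tweet_vector, emotion_vectors):
    """Pick the emotion whose reference vector is most similar to the tweet."""
    return max(
        emotion_vectors,
        key=lambda name: cosine_similarity(tweet_vector, emotion_vectors[name]),
    )


# Toy reference vectors (assumption): in practice each would be the
# embedding of an emotion word or phrase from a sentence-embedding model.
EMOTION_VECTORS = {
    "fear": [0.9, 0.1, 0.0],
    "joy": [0.1, 0.9, 0.2],
}

# Toy tweet embedding (assumption): points mostly along the "fear" direction.
tweet_vector = [0.8, 0.2, 0.1]
label = classify_emotion(tweet_vector, EMOTION_VECTORS)
print(label)
```

Because cosine similarity depends only on direction, a long tweet and a short tweet with proportionally scaled embeddings would receive the same label in this sketch.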
Journal Name:
Journal of Medical Internet Research
Sponsoring Org:
National Science Foundation
More Like this
  1. Risk perception and risk averting behaviors of public agencies in the emergence and spread of COVID-19 can be retrieved through online social media (Twitter), and such interactions can be echoed in other information outlets. This study collected time-sensitive online social media data and analyzed patterns of health risk communication of public health and emergency agencies in the emergence and spread of novel coronavirus using data-driven methods. The major focus is toward understanding how policy-making agencies communicate risk and response information through social media during a pandemic and influence community response (ie, timing of lockdown, timing of reopening, etc.) and disease outbreak indicators (ie, number of confirmed cases and number of deaths). Twitter data of six major public organizations (1,000-4,500 tweets per organization) were collected from February 21, 2020 to June 6, 2020. Several machine learning algorithms, including dynamic topic modeling and sentiment analysis, were applied over time to identify the topic dynamics over the specific timeline of the pandemic. Organizations emphasized various topics at different points in the timeline, eg, importance of wearing face masks, home quarantine, understanding the symptoms, social distancing and contact tracing, emerging community transmission, lack of personal protective equipment, COVID-19 testing and medical supplies, effect of tobacco, pandemic stress management, increasing hospitalization rate, upcoming hurricane season, use of convalescent plasma for COVID-19 treatment, maintaining hygiene, and the role of healthcare podcasts. The findings can help emergency management, policymakers, and public health agencies identify targeted information dissemination policies for a public with diverse needs, based on how local, federal, and international agencies reacted to COVID-19.
  2. Background The COVID-19 pandemic has caused several disruptions in personal and collective lives worldwide. The uncertainties surrounding the pandemic have also led to multifaceted mental health concerns, which can be exacerbated with precautionary measures such as social distancing and self-quarantining, as well as societal impacts such as economic downturn and job loss. Despite noting this as a “mental health tsunami”, the psychological effects of the COVID-19 crisis remain unexplored at scale. Consequently, public health stakeholders are currently limited in identifying ways to provide timely and tailored support during these circumstances. Objective Our study aims to provide insights regarding people’s psychosocial concerns during the COVID-19 pandemic by leveraging social media data. We aim to study the temporal and linguistic changes in symptomatic mental health and support expressions in the pandemic context. Methods We obtained about 60 million Twitter streaming posts originating from the United States from March 24 to May 24, 2020, and compared these with about 40 million posts from a comparable period in 2019 to attribute the effect of COVID-19 on people’s social media self-disclosure. Using these data sets, we studied people’s self-disclosure on social media in terms of symptomatic mental health concerns and expressions of support. We employed transfer learning classifiers that identified the social media language indicative of mental health outcomes (anxiety, depression, stress, and suicidal ideation) and support (emotional and informational support). We then examined the changes in psychosocial expressions over time and language, comparing the 2020 and 2019 data sets.
Results We found that all of the examined psychosocial expressions increased significantly during the COVID-19 crisis: mental health symptomatic expressions increased by about 14%, and support expressions increased by about 5%, both thematically related to COVID-19. We also observed a steady decline and eventual plateauing in these expressions during the COVID-19 pandemic, which may have been due to habituation or due to supportive policy measures enacted during this period. Our language analyses highlighted that people express concerns that are specific to and contextually related to the COVID-19 crisis. Conclusions We studied the psychosocial effects of the COVID-19 crisis by using social media data from 2020, finding that people’s mental health symptomatic and support expressions significantly increased during the COVID-19 period as compared to similar data from 2019. However, this effect gradually lessened over time, suggesting that people adapted to the circumstances and their “new normal.” Our linguistic analyses revealed that people expressed mental health concerns regarding personal and professional challenges, health care and precautionary measures, and pandemic-related awareness. This study shows the potential to provide insights to mental health care and stakeholders and policy makers in planning and implementing measures to mitigate mental health risks amid the health crisis.
  3. A novel coronavirus emerged in December of 2019 (COVID-19), causing a pandemic that inflicted unprecedented public health and economic burden in all nooks and corners of the world. Although the control of COVID-19 initially focused largely on the use of basic public health measures (primarily non-pharmaceutical interventions, such as quarantine, isolation, social-distancing, face mask usage, and community lockdowns), three safe and highly-effective vaccines (by AstraZeneca Inc., Moderna Inc., and Pfizer Inc.) were approved for use in humans in December 2020. We present a new mathematical model for assessing the population-level impact of these vaccines on curtailing the burden of COVID-19. The model stratifies the total population into two subgroups, based on whether or not they habitually wear face masks in public. The resulting multigroup model, which takes the form of a deterministic system of nonlinear differential equations, is fitted and parameterized using COVID-19 cumulative mortality data for the third wave of the COVID-19 pandemic in the United States. Conditions for the asymptotic stability of the associated disease-free equilibrium, as well as an expression for the vaccine-derived herd immunity threshold, are rigorously derived. Numerical simulations of the model show that the size of the initial proportion of individuals in the mask-wearing group, together with positive change in behavior from the non-mask wearing group (as well as those in the mask-wearing group, who do not abandon their mask-wearing habit) play a crucial role in effectively curtailing the COVID-19 pandemic in the United States. This study further shows that the prospect of achieving vaccine-derived herd immunity (required for COVID-19 elimination) in the U.S., using the Pfizer or Moderna vaccine, is quite promising. In particular, our study shows that herd immunity can be achieved in the U.S. if at least 60% of the population are fully vaccinated.
Furthermore, the prospect of eliminating the pandemic in the U.S. in the year 2021 is significantly enhanced if the vaccination program is complemented with non-pharmaceutical interventions at moderately increased levels of compliance (in relation to their baseline compliance). The study further suggests that, while the waning of natural and vaccine-derived immunity against COVID-19 induces only a marginal increase in the burden and projected time-to-elimination of the pandemic, adding the impacts of therapeutic benefits of the vaccines into the model resulted in a dramatic reduction in the burden and time-to-elimination of the pandemic.
  4. Background Internet data can be used to improve infectious disease models. However, the representativeness and individual-level validity of internet-derived measures are largely unexplored as this requires ground truth data for study. Objective This study sought to identify relationships between Web-based behaviors and/or conversation topics and health status using a ground truth, survey-based dataset. Methods This study leveraged a unique dataset of self-reported surveys, microbiological laboratory tests, and social media data from the same individuals toward understanding the validity of individual-level constructs pertaining to influenza-like illness in social media data. Logistic regression models were used to identify illness in Twitter posts using user posting behaviors and topic model features extracted from users’ tweets. Results Of 396 original study participants, only 81 met the inclusion criteria for this study. Of these participants’ tweets, we identified only two instances that were related to health and occurred within 2 weeks (before or after) of a survey indicating symptoms. It was not possible to predict when participants reported symptoms using features derived from topic models (area under the curve [AUC]=0.51; P=.38), though it was possible using behavior features, albeit with a very small effect size (AUC=0.53; P≤.001). Individual symptoms were also generally not predictable. The study sample and a random sample from Twitter are predictably different on held-out data (AUC=0.67; P≤.001), meaning that the content posted by people who participated in this study was predictably different from that posted by random Twitter users. Individuals in the random sample and the GoViral sample used Twitter with similar frequencies (similar @ mentions, number of tweets, and number of retweets; AUC=0.50; P=.19).
Conclusions To our knowledge, this is the first instance of an attempt to use a ground truth dataset to validate infectious disease observations in social media data. The lack of signal, the lack of predictability among behaviors or topics, and the demonstrated volunteer bias in the study population are important findings for the large and growing body of disease surveillance using internet-sourced data.
  5. Abstract Containment measures have been applied throughout the world to halt the COVID-19 pandemic. In the United States, several forms of lockdown have been adopted in different parts of the country, leading to heterogeneous epidemiological, social, and economic effects. Here, we present a spatio-temporal analysis of a Twitter dataset comprising 1.3 million geo-localized Tweets about lockdown, from January to May 2020. Through sentiment analysis, we classified Tweets as expressing positive or negative emotions about lockdown, demonstrating a change in perception during the course of the pandemic modulated by socio-economic factors. A transfer entropy analysis of the time series of Tweets unveiled that the emotions in different parts of the country did not evolve independently. Rather, they were mediated by spatial interactions, which were also related to socio-economic factors and, arguably, to political orientations. This study constitutes a first, necessary step toward isolating the mechanisms underlying the acceptance of public health interventions from highly resolved online datasets.
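
As a reference point for the transfer entropy analysis mentioned in item 5, the standard first-order definition is sketched below. The abstract does not say which estimator, history length, or discretization the authors used, so this is the textbook form only; the symbols $X$ and $Y$ (two time series, for example sentiment in two regions) and $x_t$, $y_t$ (their values at time $t$) are notation introduced here for illustration.

```latex
T_{Y \to X} \;=\; \sum_{x_{t+1},\, x_t,\, y_t}
  p(x_{t+1}, x_t, y_t)\,
  \log \frac{p(x_{t+1} \mid x_t, y_t)}{p(x_{t+1} \mid x_t)}
```

The quantity is zero when the past of $Y$ adds no information about the next value of $X$ beyond what the past of $X$ already provides, and it is generally asymmetric, so $T_{Y \to X} \neq T_{X \to Y}$.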