Title: Mining Social Media Data for Biomedical Signals and Health-Related Behavior
Social media data have been increasingly used to study biomedical and health-related phenomena. From cohort-level discussions of a condition to population-level analyses of sentiment, social media have provided scientists with unprecedented amounts of data to study human behavior associated with a variety of health conditions and medical treatments. Here we review recent work in mining social media for information on biomedical, epidemiological, and social phenomena relevant to the multilevel complexity of human health. We pay particular attention to topics where social media data analysis has shown the most progress, including pharmacovigilance and sentiment analysis, especially for mental health. We also discuss a variety of innovative uses of social media data for health-related applications as well as important limitations of social media data access and use.
Journal Name: Annual Review of Biomedical Data Science
Page Range / eLocation ID: 433 to 458
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Research shows that certain external factors can affect the mental health of many people in a community. Moreover, the importance of mental health has significantly increased in recent years due to the COVID-19 pandemic. Many people communicate and express their emotions through social media platforms, which give researchers opportunities to gain insight into their opinions and mental state. While social sensing studies using social media data have flourished in the last decade, many studies that use such data to detect and predict mental health status have focused on the individual level. In this study, we aim to generate a social sensing index for mental health to monitor emotional well-being, which is closely related to mental health, and to identify daily trends in negative emotions at the city level. We conduct sentiment analysis on Twitter data and compute the entropy of the degree of sentiment change to develop the index. We observe that sentiment trends fluctuate significantly in response to unusual events. We find that the social sensing index for mental health reflects both city-wide and local events that trigger negative emotions, as well as areas where negative emotions persist. The study contributes to the growing body of research that uses social media data to examine mental health. We focus on the city level rather than the individual level, which provides a broader perspective on the mental health of a population. The social sensing index for mental health allows public health professionals to monitor and identify persistent negative sentiments and potential areas where mental health issues may emerge.
  2. COVID-19 created health and logistical challenges for many sectors of the American economy, including the trucking industry. This study examined how the pandemic affected the trucking industry, focusing on company operations and on the health and stress of trucking industry employees. Data were collected from three sources: surveys, focus groups, and social media posts. Individuals at multiple organizational levels of trucking companies (i.e., supervisors, upper-level management, and drivers) completed an online survey and participated in online focus groups. Data from focus groups were coded using a thematic analysis approach. Publicly available social media posts from Twitter were analyzed using a sentiment analysis framework to assess changes in public sentiment about the trucking industry before and during COVID-19. Two themes emerged from the focus groups: (1) trucking company business strategies and adaptations and (2) truck driver experiences and workplace safety. Participants reported supply chain disruptions and new consumer buying trends as having larger industry-wide impacts. Company adaptability emerged due to freight variability, leading organizations to pivot business models and create solutions to reduce operational costs. Companies responded to COVID-19 by accommodating employees’ concerns and implementing safety measures. Truck drivers noted an increase in positive public perception of truck drivers, but job quality factors worsened due to closed amenities and decreased social interaction. Social media sentiment analysis also illustrated an increase in positive public sentiment towards the trucking industry during COVID-19. The pandemic resulted in multi-level economic, health, and social impacts on the trucking industry, including economic impacts on companies and economic, social, and health impacts on employees. Further research can expand on this study to provide an understanding of the long-term impacts of the pandemic on trucking companies and on segments of the trucking industry workforce.
  3. Social media platforms provide users with various ways of interacting with each other, such as commenting, reacting to posts, sharing content, and uploading pictures. Facebook is one of the most popular platforms, and its users frequently share and reshare posts, including research articles. Moreover, the reactions feature on Facebook allows users to express their feelings towards the content they view, providing valuable data for analysis. This study aims to predict the emotional impact of Facebook posts relating to research articles. We collected data on Facebook posts related to various scientific research domains, including Health Sciences, Social Sciences, Dentistry, Arts, and Humanities. We observed Facebook users’ reactions towards research articles and posts and found that ‘Like’ reactions were the most common. We also noticed that research articles from the Dentistry research domain received many ‘Haha’ reactions. We used machine learning models to predict the sentiment of Facebook posts related to research articles, with features such as the research article’s title sentiment, abstract sentiment, abstract length, author count, and research domain. We used five classifiers: Random Forest, Decision Tree, K-Nearest Neighbors, Logistic Regression, and Naïve Bayes. The models were evaluated using accuracy, precision, recall, and F1 score. The Random Forest classifier was the best model for two- and three-class labels, achieving accuracies of 86% and 66%, respectively. We also evaluated feature importance for the Random Forest model and found that the sentiment of the research article’s title is crucial in predicting the sentiment of the Facebook post. This study has substantial implications for public engagement in science-related messages. The emotional reactions of Facebook users towards research articles and posts can provide valuable insights into public engagement in science, and predicting the emotional impact of Facebook posts related to research articles can help researchers understand how the public perceives scientific research. The findings of the study can aid researchers in effectively communicating their research and engaging the public in scientific discourse.
  4. Background: As a number of vaccines for COVID-19 are given emergency use authorization by local health agencies and are being administered in multiple countries, it is crucial to gain public trust in these vaccines to ensure herd immunity through vaccination. One way to gauge public sentiment regarding vaccines, with the goal of increasing vaccination rates, is to analyze social media such as Twitter. Objective: The goal of this research was to understand public sentiment toward COVID-19 vaccines by analyzing discussions about the vaccines on social media over a period of 60 days when vaccination began in the United States. Using a combination of topic detection and sentiment analysis, we identified different types of concerns regarding vaccines that were expressed by different groups of the public on social media. Methods: To better understand public sentiment, we collected tweets for exactly 60 days starting from December 16, 2020 that contained hashtags or keywords related to COVID-19 vaccines. We detected and analyzed the topics of discussion in these tweets as well as their emotional content. Vaccine topics were identified by nonnegative matrix factorization. Emotional content was identified using the Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis library, and also by computing sentence bidirectional encoder representations from transformers (Sentence-BERT) embeddings and comparing each embedding to different emotions using cosine similarity. Results: After removing all duplicates and retweets, 7,948,886 tweets were collected during the 60-day time period. Topic modeling resulted in 50 topics; of those, we selected the 12 topics with the highest volume of tweets for analysis. Administration of and access to vaccines were among the major concerns of the public. Additionally, we classified the tweets in each topic into 1 of 5 emotions and found fear to be the leading emotion in the tweets, followed by joy. Conclusions: This research focused not only on negative emotions that may have led to vaccine hesitancy but also on positive emotions toward the vaccine. By identifying both positive and negative emotions, we were able to identify the public's response to the vaccines overall and to news events related to the vaccines. These results are useful for developing plans for disseminating authoritative health information and for better communication to build understanding and trust.
  5. Public sentiment toward the COVID-19 vaccine as expressed on social media can interfere with communication by public health agencies on the importance of getting vaccinated. We investigated Twitter data to understand differences in sentiment, moral values, and language use between political ideologies on the COVID-19 vaccine. We estimated political ideology, conducted a sentiment analysis, and, guided by the tenets of moral foundations theory (MFT), analyzed 262,267 English-language tweets from the United States containing COVID-19 vaccine-related keywords between May 2020 and October 2021. We applied the Moral Foundations Dictionary and used topic modeling and Word2Vec to understand moral values and the context of words central to the vaccine debate. A quadratic trend showed that extreme ideologies of both Liberals and Conservatives expressed a higher negative sentiment than Moderates, with Conservatives expressing more negative sentiment than Liberals. Compared to Conservative tweets, we found the expression of Liberal tweets to be rooted in a wider set of moral values, associated with moral foundations of care (getting the vaccine for protection), fairness (having access to the vaccine), liberty (related to the vaccine mandate), and authority (trusting the vaccine mandate imposed by the government). Conservative tweets were found to be associated with harm (around safety of the vaccine) and oppression (around the government mandate). Furthermore, political ideology was associated with the expression of different meanings for the same words, e.g., “science” and “death.” Our results inform public health outreach communication strategies to best tailor vaccine information to different groups.
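
As a rough illustration of the embedding-comparison step mentioned in item 4 above (assigning a tweet to the emotion whose embedding is closest by cosine similarity), the following is a minimal sketch and not the authors' implementation. The function names and the tiny 3-dimensional vectors are hypothetical stand-ins for real sentence embeddings, and only the two emotions named in the abstract (fear and joy) are included.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def closest_emotion(text_vec, emotion_vecs):
    """Return the emotion label whose vector is most similar to text_vec."""
    return max(
        emotion_vecs,
        key=lambda label: cosine_similarity(text_vec, emotion_vecs[label]),
    )


# Hypothetical toy vectors; a real pipeline would obtain these from a
# sentence-embedding model rather than writing them by hand.
EMOTION_VECS = {
    "fear": [1.0, 0.1, 0.0],
    "joy": [0.0, 1.0, 0.2],
}

tweet_vec = [0.9, 0.2, 0.1]  # stand-in for an embedded tweet
print(closest_emotion(tweet_vec, EMOTION_VECS))
```

Cosine similarity compares direction rather than magnitude, so under this scheme a long tweet and a short tweet with similar content would tend to receive the same label.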