- NSF-PAR ID: 10404540
- Date Published:
- Journal Name: JMIR Public Health and Surveillance
- Volume: 8
- Issue: 12
- ISSN: 2369-2960
- Page Range / eLocation ID: e24938
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Wren, Jonathan (Ed.) Abstract Motivation Substance abuse constitutes one of the major contemporary health epidemics. Recently, the use of social media platforms has garnered interest as a novel source of data for drug addiction epidemiology. Often, however, the language used in such forums comprises slang and jargon. Currently, there are no publicly available resources to automatically analyse the esoteric language use in the social media drug-use sub-culture. This lacuna introduces critical challenges for interpreting, sensemaking and modeling of addiction epidemiology using social media. Results Drug-Use Insights (DUI) is a public and open-source web application to address the aforementioned deficiency. DUI is underpinned by a hierarchical taxonomy encompassing 108 different addiction-related categories consisting of over 9,000 terms, where each category encompasses a set of semantically related terms. These categories and terms were established by utilizing thematic analysis in conjunction with term embeddings generated from 7,472,545 Reddit posts made by 1,402,017 redditors. Given post(s) from social media forums such as Reddit and Twitter, DUI can be used foremost to identify constituent terms related to drug use. Furthermore, the DUI categories and integrated visualization tools can be leveraged for semantic and exploratory analysis. To the best of our knowledge, DUI utilizes the largest number of substance use and recovery social media posts used in a study and represents the first significant online taxonomy of drug abuse terminology. Availability The DUI web server and source code are available at: http://haddock9.sfsu.edu/insight/ Supplementary information Supplementary data are available at Bioinformatics online.
-
Background: Addiction to drugs and alcohol constitutes one of the significant factors underlying the decline in life expectancy in the US. Several context-specific reasons influence drug use and recovery. In particular, emotional distress, physical pain, relationships, and self-development efforts are known to be some of the factors associated with addiction recovery. Unfortunately, many of these factors are not directly observable, and quantifying and assessing their impact can be difficult. Based on social media posts of users engaged in substance use and recovery on the forum Reddit, we employed two psycholinguistic tools, Linguistic Inquiry and Word Count and Empath, together with the activities of substance users on various Reddit sub-forums, to analyze behavior underlying addiction recovery and relapse. We then employed a statistical analysis technique called structural equation modeling to assess the effects of these latent factors on recovery and relapse. Results: We found that both emotional distress and physical pain significantly influence addiction recovery behavior. Self-development activities and social relationships of the substance users were also found to enable recovery. Furthermore, within the context of self-development activities, those that were related to influencing the mental and physical well-being of substance users were found to be positively associated with addiction recovery. We also determined that lack of social activities and physical exercise can enable a relapse. Moreover, geography, especially life in rural areas, appears to have a greater correlation with addiction relapse. Conclusions: The paper describes how observable variables can be extracted from social media and then be used to model important latent constructs that impact addiction recovery and relapse. We also report factors that impact self-induced addiction recovery and relapse. To the best of our knowledge, this paper represents the first use of structural equation modeling of social media data with the goal of analyzing factors influencing addiction recovery.
-
Background The COVID-19 pandemic has resulted in heightened levels of depression, anxiety, and other mental health issues due to sudden changes in daily life, such as economic stress, social isolation, and educational irregularity. Accurately assessing emotional and behavioral changes in response to the pandemic can be challenging, but it is essential to understand the evolving emotions, themes, and discussions surrounding the impact of COVID-19 on mental health.
Objective This study aims to understand the evolving emotions and themes associated with the impact of COVID-19 on mental health support groups (eg, r/Depression and r/Anxiety) on Reddit (Reddit Inc) during the initial phase and after the peak of the pandemic using natural language processing techniques and statistical methods.
Methods This study used data from the r/Depression and r/Anxiety Reddit communities, which consisted of posts contributed by 351,409 distinct users over a period spanning from 2019 to 2022. Topic modeling and Word2Vec embedding models were used to identify key terms associated with the targeted themes within the data set. A range of trend and thematic analysis techniques, including time-to-event analysis, heat map analysis, factor analysis, regression analysis, and k-means clustering analysis, were used to analyze the data.
Results The time-to-event analysis revealed that the first 28 days following a major event could be considered a critical window for mental health concerns to become more prominent. The theme trend analysis revealed key themes such as economic stress, social stress, suicide, and substance use, with varying trends and impacts in each community. The factor analysis highlighted pandemic-related stress, economic concerns, and social factors as primary themes during the analyzed period. Regression analysis showed that economic stress consistently demonstrated the strongest association with the suicide theme, whereas the substance theme had a notable association in both data sets. Finally, the k-means clustering analysis showed that in r/Depression, the number of posts related to the “depression, anxiety, and medication” cluster decreased after 2020, whereas the “social relationships and friendship” cluster showed a steady decrease. In r/Anxiety, the “general anxiety and feelings of unease” cluster peaked in April 2020 and remained high, whereas the “physical symptoms of anxiety” cluster showed a slight increase.
Conclusions This study sheds light on the impact of COVID-19 on mental health and the related themes discussed in 2 web-based communities during the pandemic. The results offer valuable insights for developing targeted interventions and policies to support individuals and communities in similar crises.
-
Background Internet data can be used to improve infectious disease models. However, the representativeness and individual-level validity of internet-derived measures are largely unexplored, as this requires ground truth data for study. Objective This study sought to identify relationships between Web-based behaviors and/or conversation topics and health status using a ground truth, survey-based dataset. Methods This study leveraged a unique dataset of self-reported surveys, microbiological laboratory tests, and social media data from the same individuals toward understanding the validity of individual-level constructs pertaining to influenza-like illness in social media data. Logistic regression models were used to identify illness in Twitter posts using user posting behaviors and topic model features extracted from users’ tweets. Results Of 396 original study participants, only 81 met the inclusion criteria for this study. Of these participants’ tweets, we identified only two instances that were related to health and occurred within 2 weeks (before or after) of a survey indicating symptoms. It was not possible to predict when participants reported symptoms using features derived from topic models (area under the curve [AUC]=0.51; P=.38), though it was possible using behavior features, albeit with a very small effect size (AUC=0.53; P≤.001). Individual symptoms were generally not predictable either. The study sample and a random sample from Twitter are predictably different on held-out data (AUC=0.67; P≤.001), meaning that the content posted by people who participated in this study was predictably different from that posted by random Twitter users. Individuals in the random sample and the GoViral sample used Twitter with similar frequencies (similar @ mentions, number of tweets, and number of retweets; AUC=0.50; P=.19). Conclusions To our knowledge, this is the first instance of an attempt to use a ground truth dataset to validate infectious disease observations in social media data. The lack of signal, the lack of predictability among behaviors or topics, and the demonstrated volunteer bias in the study population are important findings for the large and growing body of disease surveillance using internet-sourced data.
-
Lin, Chung-Ying (Ed.) Background University students are increasingly recognized as a vulnerable population, suffering from higher levels of anxiety, depression, substance abuse, and disordered eating compared to the general population. Therefore, when the nature of their educational experience radically changes, such as sheltering in place during the COVID-19 pandemic, the burden on the mental health of this vulnerable population is amplified. The objectives of this study are to 1) identify the array of psychological impacts COVID-19 has on students, 2) develop profiles to characterize students' anticipated levels of psychological impact during the pandemic, and 3) evaluate potential risk factors (sociodemographic, lifestyle-related, and awareness of people infected with COVID-19) that could make students more likely to experience these impacts. Methods Cross-sectional data were collected through web-based questionnaires from seven U.S. universities. Representative and convenience sampling was used to invite students to complete the questionnaires in mid-March to early-May 2020, when most coronavirus-related sheltering in place orders were in effect. We received 2,534 completed responses, of which 61% were from women, 79% from non-Hispanic Whites, and 20% from graduate students. Results Exploratory factor analysis on close-ended responses resulted in two latent constructs, which we used to identify profiles of students with latent profile analysis, including high (45% of sample), moderate (40%), and low (14%) levels of psychological impact. Bivariate associations showed students who were women, were non-Hispanic Asian, in fair/poor health, of below-average relative family income, or who knew someone infected with COVID-19 experienced higher levels of psychological impact. Students who were non-Hispanic White, were of above-average social class, spent at least two hours outside, or spent less than eight hours on electronic screens were likely to experience lower levels of psychological impact. Multivariate modeling (mixed-effects logistic regression) showed that being a woman, having fair/poor general health status, being 18 to 24 years old, spending 8 or more hours on screens daily, and knowing someone infected predicted higher levels of psychological impact when risk factors were considered simultaneously. Conclusion Inadequate efforts to recognize and address college students’ mental health challenges, especially during a pandemic, could have long-term consequences on their health and education.