Title: Drug Abuse Ontology to Harness Web-Based Data for Substance Use Epidemiology Research: Ontology Development Study
Background: Web-based resources and social media platforms play an increasingly important role in health-related knowledge and experience sharing. There is growing interest in using these novel data sources for epidemiological surveillance of substance use behaviors and trends.

Objective: The key aims were to describe the development and application of the drug abuse ontology (DAO) as a framework for analyzing web-based and social media data to inform public health and substance use research in the following areas: determining user knowledge, attitudes, and behaviors related to nonmedical use of buprenorphine and illicitly manufactured opioids through the analysis of web forum data (Prescription Drug Abuse Online Surveillance); analyzing patterns and trends of cannabis product use in the context of evolving cannabis legalization policies in the United States through the analysis of Twitter and web forum data (eDrugTrends); assessing trends in the availability of novel synthetic opioids through the analysis of cryptomarket data (eDarkTrends); and analyzing COVID-19 pandemic trends in social media data related to 13 states in the United States as per Mental Health America reports.

Methods: The domain and scope of the DAO were defined using competency questions from a popular ontology methodology (101 ontology development). The 101 method includes determining the domain and scope of the ontology, reusing existing knowledge, enumerating important terms in the ontology, defining the classes and their properties, and creating instances of the classes. The quality of the ontology was evaluated using a set of tools and best practices recognized by the semantic web community and by the artificial intelligence community engaged in natural language processing.

Results: The current version of the DAO comprises 315 classes, 31 relationships, and 814 instances among the classes. The ontology is flexible and can easily accommodate new concepts. Integrating the ontology with machine learning algorithms dramatically decreased the false alarm rate by adding external knowledge to the machine learning process. The ontology is regularly updated to capture evolving concepts in different contexts and is applied to analyze data from social media and dark web marketplaces.

Conclusions: The DAO provides a powerful framework and a useful resource that can be expanded and adapted to a wide range of substance use and mental health domains to help advance big data analytics of web-based data for substance use epidemiology research.
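The steps of the 101 method named in the abstract (enumerate terms, define classes and properties, create instances) can be illustrated with a toy example. The sketch below is only an assumption-laden illustration: the class names, the `slangTermFor` property, the sample terms, and the `annotate` helper are all hypothetical and are not taken from the DAO itself.

```python
import re

# Hypothetical mini-ontology; none of these names come from the DAO.
# Classes: each class maps to its parent class (None marks the root).
classes = {
    "Substance": None,
    "Opioid": "Substance",
    "Cannabis": "Substance",
}

# Instances: surface terms, including slang, mapped to the class they belong to.
instances = {
    "buprenorphine": "Opioid",
    "bupe": "Opioid",
    "fentanyl": "Opioid",
    "dabs": "Cannabis",
}

# Properties (relationships) stored as subject-predicate-object triples.
triples = [("bupe", "slangTermFor", "buprenorphine")]


def lineage(cls):
    """Return the class followed by all of its ancestors, root last."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = classes[cls]
    return chain


def annotate(text):
    """Tag each known term in the text with its class lineage."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [(tok, instances[tok], lineage(instances[tok]))
            for tok in tokens if tok in instances]


if __name__ == "__main__":
    for hit in annotate("Anyone tried bupe after fentanyl?"):
        print(hit)
```

Annotations of this kind are one plausible way external knowledge could be handed to a downstream classifier as extra features; the abstract does not specify how the integration was actually done.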
Award ID(s):
1761931 1956009
PAR ID:
10404540
Journal Name:
JMIR Public Health and Surveillance
Volume:
8
Issue:
12
ISSN:
2369-2960
Page Range / eLocation ID:
e24938
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Wren, Jonathan (Ed.)
    Motivation: Substance abuse constitutes one of the major contemporary health epidemics. Recently, the use of social media platforms has garnered interest as a novel source of data for drug addiction epidemiology. Often, however, the language used in such forums comprises slang and jargon. Currently, there are no publicly available resources to automatically analyse the esoteric language use in the social media drug-use subculture. This lacuna introduces critical challenges for interpreting, sensemaking, and modeling of addiction epidemiology using social media. Results: Drug-Use Insights (DUI) is a public and open-source web application that addresses this deficiency. DUI is underpinned by a hierarchical taxonomy encompassing 108 different addiction-related categories consisting of over 9,000 terms, where each category encompasses a set of semantically related terms. These categories and terms were established by utilizing thematic analysis in conjunction with term embeddings generated from 7,472,545 Reddit posts made by 1,402,017 redditors. Given post(s) from social media forums such as Reddit and Twitter, DUI can be used foremost to identify constituent terms related to drug use. Furthermore, the DUI categories and integrated visualization tools can be leveraged for semantic and exploratory analysis. To the best of our knowledge, DUI utilizes the largest number of substance use and recovery social media posts used in a study and represents the first significant online taxonomy of drug abuse terminology. Availability: The DUI web server and source code are available at: http://haddock9.sfsu.edu/insight/ Supplementary information: Supplementary data are available at Bioinformatics online.
  2. The way media portray public health problems influences the public’s perception of problems and related solutions. Social media allows users to engage with news and to collectively construct meaning. This paper examined news in comparison to user-generated content related to opioids to understand the role of second-level agenda-setting in public health. We analyzed 162,760 tweets about the opioid crisis, and compared the main topics and their sentiments with 2998 opioid stories from The New York Times online. Evidence from this study suggests that second-level agenda setting on social media is different from the news; public communication about opioids on X/Twitter highlights attributes that are different from those highlighted in the news. The findings suggest that public health communication should strategically utilize social media data, including obtaining consumer insight from personal tweets, listening to diverse views and warning signs from issue tweets, and tuning in to the media for policy trends. 
  3. The outbreak of the novel coronavirus, COVID-19, has become one of the most severe pandemics in human history. In this paper, we propose to leverage social media users as social sensors to simultaneously predict the pandemic trends and suggest potential risk factors for public health experts to understand spread situations and recommend proper interventions. More precisely, we develop novel deep learning models to recognize important entities and their relations over time, thereby establishing dynamic heterogeneous graphs to describe the observations of social media users. A dynamic graph neural network model can then forecast the trends (e.g. newly diagnosed cases and death rates) and identify high-risk events from social media. Based on the proposed computational method, we also develop a web-based system that domain experts without any computer science background can easily interact with. We conduct extensive experiments on large-scale datasets of COVID-19 related tweets provided by Twitter, which show that our method can precisely predict the new cases and death rates. We also demonstrate the robustness of our web-based pandemic surveillance system and its ability to retrieve essential knowledge and derive accurate predictions across a variety of circumstances. Our system is also available at http://scaiweb.cs.ucla.edu/covidsurveiller/ . This article is part of the theme issue ‘Data science approaches to infectious disease surveillance’.
  4. Background: Addiction to drugs and alcohol constitutes one of the significant factors underlying the decline in life expectancy in the US. Several context-specific reasons influence drug use and recovery. In particular, emotional distress, physical pain, relationships, and self-development efforts are known to be some of the factors associated with addiction recovery. Unfortunately, many of these factors are not directly observable, and quantifying and assessing their impact can be difficult. Based on social media posts of users engaged in substance use and recovery on the forum Reddit, we employed two psycholinguistic tools, Linguistic Inquiry and Word Count and Empath, together with the activities of substance users on various Reddit sub-forums, to analyze behavior underlying addiction recovery and relapse. We then employed a statistical analysis technique called structural equation modeling to assess the effects of these latent factors on recovery and relapse. Results: We found that both emotional distress and physical pain significantly influence addiction recovery behavior. Self-development activities and social relationships of the substance users were also found to enable recovery. Furthermore, within the context of self-development activities, those that were related to influencing the mental and physical well-being of substance users were found to be positively associated with addiction recovery. We also determined that lack of social activities and physical exercise can enable a relapse. Moreover, geography, especially life in rural areas, appears to have a greater correlation with addiction relapse. Conclusions: The paper describes how observable variables can be extracted from social media and then be used to model important latent constructs that impact addiction recovery and relapse. We also report factors that impact self-induced addiction recovery and relapse. To the best of our knowledge, this paper represents the first use of structural equation modeling of social media data with the goal of analyzing factors influencing addiction recovery.
  5. In 2016, more than 11 million Americans abused prescription opioids. The National Institute on Drug Abuse considers the opioid crisis a national addiction epidemic, as an increasing number of people are affected each year. Using the framework developed in mathematical modelling of infectious diseases, we create and analyse a compartmental opioid-abuse model consisting of a system of ordinary differential equations. Since 40% of opioid overdoses are caused by prescription opioids, our model includes prescription compartments for the four most commonly prescribed opioids, as well as for the susceptible, addicted and recovered populations. While existing research has focused on drug abuse models in general and opioid models with one prescription compartment, no previous work has compared the roles that the most commonly prescribed opioids have had in the crisis. By combining data from the Substance Abuse and Mental Health Services Administration (which tracked the proportion of people who used or misused one of the four individual opioids) with data from the Centers for Disease Control and Prevention (which counted the total number of prescriptions), we estimate prescription rates and probabilities of addiction for the four most commonly prescribed opioids. Additionally, we perform a sensitivity analysis and reallocate prescriptions to determine which opioid has the largest impact on the epidemic. Our results indicate that oxycodone prescriptions are both the most likely to lead to addiction and have the largest impact on the size of the epidemic, while hydrocodone prescriptions had the smallest impact.
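The compartmental structure described in item 5 (susceptible, several prescription compartments, addicted, recovered) can be sketched as a small system of ordinary differential equations. What follows is a minimal illustration under stated assumptions, not the model from that paper: the flows between compartments, the placeholder opioid names, every rate value, and the forward Euler integrator are all hypothetical choices made for the example.

```python
# Illustrative compartmental sketch. All rates are made-up placeholders,
# not estimates from any study.

# Per-opioid rates (hypothetical names and values):
#   alpha: S -> P_i  (new prescriptions)
#   eps:   P_i -> S  (prescription ends without addiction)
#   gamma: P_i -> A  (addiction among prescribed users)
OPIOIDS = {
    "opioid_1": {"alpha": 0.0020, "eps": 0.010, "gamma": 0.0010},
    "opioid_2": {"alpha": 0.0015, "eps": 0.012, "gamma": 0.0006},
}
DELTA = 0.005  # A -> R: entry into recovery (hypothetical)
SIGMA = 0.002  # R -> A: relapse (hypothetical)


def simulate(days=365, dt=0.1):
    """Integrate the system with forward Euler; returns (S, P, A, R)."""
    # Compartments hold fractions of a closed population.
    S, A, R = 0.95, 0.01, 0.0
    P = {name: 0.02 for name in OPIOIDS}
    for _ in range(round(days / dt)):
        dS, dA = 0.0, -DELTA * A + SIGMA * R
        dR = DELTA * A - SIGMA * R
        dP = {}
        for name, r in OPIOIDS.items():
            dS += -r["alpha"] * S + r["eps"] * P[name]
            dP[name] = r["alpha"] * S - (r["eps"] + r["gamma"]) * P[name]
            dA += r["gamma"] * P[name]
        S, A, R = S + dt * dS, A + dt * dA, R + dt * dR
        P = {name: P[name] + dt * dP[name] for name in OPIOIDS}
    return S, P, A, R


if __name__ == "__main__":
    S, P, A, R = simulate()
    print(f"S={S:.4f} A={A:.4f} R={R:.4f} P={P}")
```

Because every outflow from one compartment is an inflow to another, the compartment fractions always sum to one. Varying a single opioid's `alpha` or `gamma` and rerunning is a crude stand-in for the kind of sensitivity analysis the abstract mentions.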