

This content will become publicly available on July 8, 2026

Title: Real-Time Detection of Online Health Misinformation using an Integrated Knowledge Graph-LLM Approach
The dramatic surge of health misinformation on social media platforms poses a significant threat to public health, contributing to vaccine hesitancy, delayed medical interventions, and the adoption of untested or harmful treatments. We present a novel, hybrid AI-driven framework designed for the real-time detection of health misinformation on social media platforms while prioritizing user privacy. The framework integrates the strengths of Large Language Models (LLMs), such as DistilBERT, with domain-specific Knowledge Graphs (KGs) to enhance the detection of nuanced and contextually dependent misinformation. LLMs excel at understanding the complexities of human language, while KGs provide a structured representation of medical knowledge that enables factual verification and the identification of inconsistencies. Furthermore, the framework incorporates robust privacy-preserving mechanisms, including differential privacy and secure data pipelines, to address user privacy concerns and comply with healthcare data protection regulations. Our experimental results on a dataset of Reddit posts related to chronic health conditions demonstrate the improved performance of this hybrid approach over text-only and KG-only baselines, highlighting the synergistic effect of combining LLMs and KGs for improved misinformation detection.
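The paper's implementation is not reproduced here, but the fusion idea the abstract describes can be sketched as follows: a DistilBERT [CLS] embedding is concatenated with simple KG-consistency features before a classification head. The example KG triples, the two consistency features, and the linear fusion head are illustrative assumptions, not the authors' released architecture.

```python
# Minimal sketch of text + knowledge-graph fusion for misinformation
# detection. All KG contents and feature definitions are illustrative.
import torch
from transformers import DistilBertModel, DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
encoder = DistilBertModel.from_pretrained("distilbert-base-uncased")

# Hypothetical curated medical KG: (subject, relation, object) triples.
MEDICAL_KG = {
    ("insulin", "treats", "type 1 diabetes"),
    ("vaccines", "do_not_cause", "autism"),
}

def kg_consistency_features(claims):
    """Two toy features: fraction of claims the KG supports, and fraction
    the KG explicitly contradicts. Claim extraction is assumed to be done
    by an upstream relation-extraction step."""
    supported = sum(c in MEDICAL_KG for c in claims)
    contradicted = sum(
        (s, f"do_not_{r}", o) in MEDICAL_KG for (s, r, o) in claims
    )
    n = max(len(claims), 1)
    return torch.tensor([supported / n, contradicted / n])

class HybridDetector(torch.nn.Module):
    """Concatenate the [CLS] text embedding with KG features, then classify."""
    def __init__(self, kg_dim=2):
        super().__init__()
        self.head = torch.nn.Linear(encoder.config.dim + kg_dim, 2)

    def forward(self, text, claims):
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        cls = encoder(**inputs).last_hidden_state[:, 0]    # (1, 768)
        kg = kg_consistency_features(claims).unsqueeze(0)  # (1, 2)
        return self.head(torch.cat([cls, kg], dim=-1))     # 2 class logits

logits = HybridDetector()(
    "Vaccines cause autism.", [("vaccines", "cause", "autism")]
)
```

In this toy setup the KG contradiction feature fires on the example claim, giving the classifier a signal the text embedding alone would not carry; the privacy-preserving pipeline the abstract mentions would then be layered on top, for example via differentially private training such as DP-SGD.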
Award ID(s):
2310844
PAR ID:
10629663
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE International Conference on Digital Health (ICDH) 2025, at IEEE World Congress on Services 2025
Date Published:
Subject(s) / Keyword(s):
Health Misinformation; LLMs; Knowledge Graphs; Digital Health; Privacy-Preservation; Real-time Misinformation Detection
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The rise of e-commerce and social networking platforms has led to an increase in the disclosure of personal health information within user-generated content. This study investigates the application of large language models (LLMs) to detect and sanitize sensitive health data shared by users across platforms such as Amazon, patient.info, and Facebook. We propose a methodology that leverages LLMs to evaluate both the sensitivity of disclosed information and the platform-specific semantics of the content. Through prompt engineering, our method identifies sensitive information and rephrases it to minimize disclosure while preserving content similarity. ChatGPT serves as the LLM in this study due to its versatility. Empirical results suggest that ChatGPT can reliably assign sensitivity scores to user-generated text and generate sanitized versions that effectively preserve the original meaning. 
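As a hedged illustration of the detect-and-rephrase workflow in item 1, the sketch below issues two prompts to an OpenAI chat model: one to score sensitivity and one to rewrite the post. The prompt wording and model name are assumptions; the study's actual prompts and scoring rubric are not reproduced here.

```python
# Illustrative prompt-engineering sketch: score sensitivity, then sanitize.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def sensitivity_score(post: str) -> str:
    prompt = (
        "Rate the health-privacy sensitivity of this post on a 1-5 scale, "
        "where 5 means it discloses identifiable personal health details. "
        f"Reply with the number only.\n\nPost: {post}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

def sanitize(post: str) -> str:
    prompt = (
        "Rewrite this post so it no longer discloses personal health "
        f"details, while preserving its overall meaning.\n\nPost: {post}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

post = "I take 40mg of fluoxetine daily and this tea eases my side effects."
print(sensitivity_score(post))
print(sanitize(post))
```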
  2. Language models have the potential to assess mental health using social media data. By analyzing online posts and conversations, these models can detect patterns indicating mental health conditions such as depression, anxiety, or suicidal thoughts. They examine keywords, language markers, and sentiment to gain insights into an individual's mental well-being. This information is crucial for early detection, intervention, and support, improving mental health care and prevention strategies. However, using language models for mental health assessment from social media has two limitations: (1) they do not compare posts against clinicians' diagnostic processes, and (2) it is challenging to explain language model outputs using concepts that clinicians can understand, i.e., clinician-friendly explanations. In this study, we introduce Process Knowledge-infused Learning (PK-iL), a new learning paradigm that layers clinical process knowledge structures on language model outputs, enabling clinician-friendly explanations of the underlying language model predictions. We rigorously test our methods on existing benchmark datasets, augmented with such clinical process knowledge, and release a new dataset for assessing suicidality. PK-iL performs competitively, achieving 70% agreement with users, while other XAI methods achieve only 47% agreement (average inter-rater agreement of 0.72). Our evaluations demonstrate that PK-iL effectively explains model predictions to clinicians.
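The mechanics of PK-iL are specific to the paper, but the core idea, walking a clinical decision process and letting a language model answer each step so that the step trace doubles as a clinician-readable explanation, can be sketched as below. The screening steps and the stand-in scorer are hypothetical simplifications, not the paper's clinical process knowledge.

```python
# Sketch: layer a clinical process over language-model outputs so the
# trace of process steps explains the prediction. Steps are hypothetical.
from typing import Callable, List, Tuple

# Ordered screening steps: (clinical concept, question put to the model).
PROCESS_STEPS: List[Tuple[str, str]] = [
    ("ideation", "Does the post express thoughts of self-harm?"),
    ("plan", "Does the post describe a plan or method?"),
    ("intent", "Does the post state intent to act?"),
]

def assess(post: str, lm_yes_prob: Callable[[str, str], float]):
    """Walk the process in order; stop at the first step the model rejects.
    Returns (risk_level, trace); the trace is the explanation."""
    trace, level = [], 0
    for concept, question in PROCESS_STEPS:
        p = lm_yes_prob(post, question)
        trace.append(f"{concept}: p(yes)={p:.2f}")
        if p < 0.5:
            break
        level += 1
    return level, trace

# Stand-in scorer; a real system would query a fine-tuned language model.
fake_lm = lambda post, q: 0.8 if "end it" in post and "self-harm" in q else 0.2

risk, why = assess("I just want to end it all", fake_lm)
print(risk, why)  # 1 ['ideation: p(yes)=0.80', 'plan: p(yes)=0.20']
```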
  3. Explainability and Safety engender trust. These require a model to exhibit consistency and reliability. To achieve these, it is necessary to use and analyze data and knowledge with statistical and symbolic AI methods relevant to the AI application; neither alone will do. Consequently, we argue and seek to demonstrate that the NeuroSymbolic AI approach is better suited for making AI a trusted AI system. We present the CREST framework, which shows how Consistency, Reliability, user-level Explainability, and Safety are built on NeuroSymbolic methods that use data and knowledge to support requirements for critical applications such as health and well-being. This article focuses on Large Language Models (LLMs) as the chosen AI system within the CREST framework. LLMs have garnered substantial attention from researchers due to their versatility in handling a broad array of natural language processing (NLP) scenarios. As examples, ChatGPT and Google's MedPaLM have emerged as highly promising platforms for providing information in general and health-related queries, respectively. Nevertheless, these models remain black boxes despite incorporating human feedback and instruction-guided tuning. For instance, ChatGPT can generate unsafe responses despite instituting safety guardrails. CREST presents a plausible approach harnessing procedural and graph-based knowledge within a NeuroSymbolic framework to shed light on the challenges associated with LLMs.
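One concrete form the graph-based knowledge in CREST could take is a symbolic guardrail that checks claims extracted from an LLM answer against curated medical knowledge before the answer is shown. The sketch below is an assumption about how such a check might look, with a placeholder fact table and a pre-extracted triple standing in for a relation-extraction step.

```python
# Hypothetical symbolic guardrail over LLM health answers.
# Curated verdicts: True = supported claim, False = known-false claim.
CURATED_FACTS = {
    ("ibuprofen", "interacts_with", "warfarin"): True,
    ("vitamin c", "cures", "covid-19"): False,
}

def guardrail(answer: str, extracted_triples) -> str:
    """Release the answer only if no extracted claim is known-false and
    at least one claim is verifiable; otherwise block or defer."""
    verdicts = [CURATED_FACTS.get(t) for t in extracted_triples]
    if any(v is False for v in verdicts):
        return "Blocked: answer contradicts curated medical knowledge."
    if all(v is None for v in verdicts):
        return "Deferred: no claim verifiable; routing to a clinician."
    return answer

print(guardrail("Vitamin C cures COVID-19.",
                [("vitamin c", "cures", "covid-19")]))
```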
  4. Misinformation about the COVID-19 pandemic proliferated widely on social media platforms during the health crisis. Experts have speculated that consuming misinformation online can worsen individuals' mental health by causing heightened anxiety, stress, and even suicidal ideation. The present study aims to quantify the causal relationship between sharing misinformation, a strong indicator of consuming misinformation, and experiencing exacerbated anxiety. We conduct a large-scale observational study spanning over 80 million Twitter posts made by 76,985 Twitter users during an 18.5-month period. The results demonstrate that users who shared COVID-19 misinformation experienced approximately twice the increase in anxiety compared to similar users who did not share misinformation. Socio-demographic analysis reveals that women, racial minorities, and individuals with lower levels of education in the United States experienced a disproportionately higher increase in anxiety than other users. These findings shed light on the mental health costs of consuming online misinformation. The work bears practical implications for social media platforms in curbing the adverse psychological impacts of misinformation while also upholding the ethos of an online public sphere.
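The abstract in item 4 compares sharers against "similar users who did not share misinformation" but does not name the estimator; one standard way to operationalize such a comparison is propensity-score matching, sketched below on synthetic data. The covariates, the matching scheme, and the synthetic effect size are assumptions for illustration, not the study's design or results.

```python
# Propensity-score matching sketch on synthetic data: estimate the extra
# anxiety increase among misinformation sharers vs. matched non-sharers.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))                           # pre-period covariates
shared = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # sharing ~ covariates
# Synthetic outcome: change in anxiety score, higher for sharers by design.
anxiety_delta = 0.5 * shared + 0.2 * X[:, 1] + rng.normal(scale=0.5, size=n)

# 1) Model the propensity of sharing given covariates.
ps = LogisticRegression().fit(X, shared).predict_proba(X)[:, 1]

# 2) Match each sharer to the nearest non-sharer on the propensity score.
treated = np.where(shared == 1)[0]
control = np.where(shared == 0)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[control].reshape(-1, 1))
_, idx = nn.kneighbors(ps[treated].reshape(-1, 1))
matched = control[idx.ravel()]

# 3) Average anxiety-change gap across matched pairs (ATT estimate).
att = (anxiety_delta[treated] - anxiety_delta[matched]).mean()
print(f"Estimated extra anxiety increase for sharers: {att:.2f}")
```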
  5. The proliferation of platforms such as e-commerce and social networks has led to an increasing amount of personal health information being disclosed in user-generated content. This study investigates the use of Large Language Models (LLMs) to detect and sanitize sensitive health data disclosures in reviews posted on Amazon. Specifically, we present an approach that uses ChatGPT to evaluate both the sensitivity and informativeness of Amazon reviews. The approach uses prompt engineering to identify sensitive content and rephrase reviews to reduce sensitive disclosures while maintaining informativeness. Empirical results indicate that ChatGPT can reliably assign sensitivity and informativeness scores to user-generated reviews and can be used to generate sanitized reviews that remain informative.
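Item 5 differs from item 1 mainly in scoring informativeness alongside sensitivity; that dual scoring can be folded into a single structured call, as in the hedged sketch below. The JSON schema, prompt wording, and model name are assumptions.

```python
# Sketch: one structured call returning sensitivity, informativeness,
# and a sanitized rewrite for an Amazon review.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def evaluate_review(review: str) -> dict:
    prompt = (
        "For the product review below, return JSON with keys "
        '"sensitivity" (1-5), "informativeness" (1-5), and "sanitized" '
        "(a rewrite that drops personal health details but keeps the "
        f"product information).\n\nReview: {review}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

print(evaluate_review(
    "As a diabetic on metformin, this glucose monitor saved my life."
))
```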