Language models have the potential to assess mental health using social media data. By analyzing online posts and conversations, these models can detect patterns indicating mental health conditions such as depression, anxiety, or suicidal ideation. They examine keywords, language markers, and sentiment to gain insight into an individual's mental well-being. Such information is valuable for early detection, intervention, and support, improving mental health care and prevention strategies. However, using language models for mental health assessment from social media has two limitations: (1) they do not compare posts against clinicians' diagnostic processes, and (2) it is challenging to explain language model outputs using concepts that clinicians understand, i.e., clinician-friendly explanations. In this study, we introduce Process Knowledge-infused Learning (PK-iL), a new learning paradigm that layers clinical process knowledge structures on language model outputs, enabling clinician-friendly explanations of the underlying predictions. We rigorously test our methods on existing benchmark datasets augmented with such clinical process knowledge, and we release a new dataset for assessing suicidality. PK-iL performs competitively, achieving 70% agreement with users, whereas other XAI methods achieve only 47% agreement (average inter-rater agreement of 0.72). Our evaluations demonstrate that PK-iL effectively explains model predictions to clinicians.
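To make the layering idea concrete, the sketch below shows one way clinical process knowledge, modeled as an ordered checklist of clinician-facing criteria, can sit on top of per-concept model scores so that the final prediction is explained in the clinician's own vocabulary. The criterion names, cue lists, keyword scorer, and decision threshold are all invented stand-ins for illustration; this is not the paper's actual PK-iL algorithm or data.

```python
# Minimal sketch of process-knowledge-infused prediction (not the authors'
# exact PK-iL method). Clinical process knowledge is modeled as an ordered
# checklist of criteria; a toy keyword scorer stands in for the language
# model, and the final label is explained via the criteria that fired.

from dataclasses import dataclass

@dataclass
class Criterion:
    name: str    # clinician-facing concept, e.g., a questionnaire item
    cues: tuple  # toy lexical cues standing in for an LM's judgment

# Hypothetical checklist loosely inspired by screening questionnaires.
PROCESS_KNOWLEDGE = [
    Criterion("low mood", ("hopeless", "empty", "down")),
    Criterion("anhedonia", ("no interest", "nothing is fun")),
    Criterion("sleep disturbance", ("can't sleep", "insomnia")),
]

def criterion_score(post: str, criterion: Criterion) -> float:
    """Stand-in for a language model scoring one clinical concept."""
    text = post.lower()
    return 1.0 if any(cue in text for cue in criterion.cues) else 0.0

def predict_with_explanation(post: str, threshold: int = 2):
    fired = [c.name for c in PROCESS_KNOWLEDGE
             if criterion_score(post, c) >= 0.5]
    label = "at-risk" if len(fired) >= threshold else "not flagged"
    # The explanation references clinician-friendly concepts, not raw weights.
    return label, fired

label, evidence = predict_with_explanation(
    "I feel hopeless lately and I can't sleep at night.")
print(label, "| supporting criteria:", evidence)
```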
ProKnow: Process knowledge for safety constrained and explainable question generation for mental health diagnostic assistance
Virtual Mental Health Assistants (VMHAs) are used in health care to provide patient services such as counseling and suggestive care. They are not used for patient diagnostic assistance because they cannot adhere to safety constraints and the specialized clinical process knowledge (ProKnow) used to obtain clinical diagnoses. In this work, we define ProKnow as an ordered set of information that maps to evidence-based guidelines or categories of conceptual understanding for experts in a domain. We also introduce a new dataset of diagnostic conversations guided by the safety constraints and ProKnow that healthcare professionals use (ProKnow-data). We develop a method for natural language question generation (NLG) that collects diagnostic information from the patient interactively (ProKnow-algo). We demonstrate the limitations of using state-of-the-art large-scale language models (LMs) on this dataset. ProKnow-algo incorporates process knowledge by explicitly modeling safety, knowledge capture, and explainability. Because computational metrics for evaluation do not directly translate to clinical settings, we involve expert clinicians in designing evaluation metrics that test four properties: safety, logical coherence, knowledge capture, and explainability, while minimizing the standard cross-entropy loss to preserve distribution-semantics-based similarity to the ground truth. LMs with ProKnow-algo generated 89% safer questions in the depression and anxiety domains (tested property: safety). Further, without ProKnow-algo, generated questions did not adhere to the clinical process knowledge in ProKnow-data (tested property: knowledge capture); in comparison, ProKnow-algo-based generations yield a 96% reduction in our metrics measuring knowledge capture. The explainability of the generated questions is assessed by computing similarity with concepts in depression and anxiety knowledge bases. Overall, irrespective of the type of LM, ProKnow-algo achieved an average 82% improvement over simple pre-trained LMs on safety, explainability, and process-guided question generation. For reproducibility, we will make ProKnow-data and the code repository for ProKnow-algo publicly available upon acceptance.
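As a rough illustration of the ProKnow-algo idea (not the authors' implementation), the sketch below walks an ordered set of information needs, generates candidate follow-up questions for the first unanswered step, and filters them against a safety constraint. The steps, question templates, and safety lexicon are hypothetical; in the paper, candidates come from an LM and the constraints are clinically grounded.

```python
# Minimal sketch of process-guided, safety-constrained question selection
# (a simplification; ProKnow-algo couples these checks to LM decoding).

# Ordered information needs, standing in for ProKnow's evidence-based steps.
PROKNOW_STEPS = ["duration of symptoms", "sleep quality", "appetite changes"]

# Toy candidate generator: in the paper this is an LM; here, templates.
def generate_candidates(step: str):
    return [
        f"Could you tell me about your {step}?",
        f"Why don't you just ignore your {step}?",  # unsafe phrasing
    ]

UNSAFE_MARKERS = ("just ignore", "get over it")  # hypothetical safety lexicon

def is_safe(question: str) -> bool:
    q = question.lower()
    return not any(marker in q for marker in UNSAFE_MARKERS)

def next_question(collected: set):
    """Pick the first unanswered ProKnow step with a safe candidate question."""
    for step in PROKNOW_STEPS:
        if step in collected:
            continue
        safe = [q for q in generate_candidates(step) if is_safe(q)]
        if safe:
            return step, safe[0]
    return None, None

step, question = next_question(collected={"duration of symptoms"})
print(f"[{step}] {question}")
```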
- PAR ID: 10429255
- Journal Name: Frontiers in Big Data
- Volume: 5
- ISSN: 2624-909X
- Sponsoring Org: National Science Foundation
More Like this
Conversational Agents (CAs) powered by deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, CAs have been used to provide informational or therapeutic services (e.g., cognitive behavioral therapy) to patients. However, the utility of CAs to assist in mental health triaging has not been explored in existing work, as it requires the controlled generation of follow-up questions (FQs), which are often initiated and guided by mental health professionals (MHPs) in clinical settings. In the context of depression, our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest-common-subsequence matches to questions in the PHQ-9 dataset, respectively, compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9-based dataset, PRIMATE, in collaboration with MHPs. PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user's initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user's post and which would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs toward controlled FQ generation (with minimal hallucination) suitable for aiding triaging. The dataset created as part of this research can be obtained from https://github.com/primate-mh/Primate2022
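The supervised task described above can be sketched as multi-label text classification: given a post, predict which PHQ-9 items it already answers. The toy example below uses TF-IDF features and one-vs-rest logistic regression on invented posts and labels; PRIMATE itself is far larger, and the paper trains a DLM rather than a linear model.

```python
# Toy sketch of the PRIMATE-style task: given a user's post, predict which
# PHQ-9 items are already answered (multi-label). Posts, labels, and item
# names below are invented for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

posts = [
    "I barely sleep and I have lost all interest in my hobbies.",
    "I feel tired all day and can't concentrate at work.",
    "I sleep fine but I feel worthless most days.",
    "I have no appetite and everything feels pointless.",
]
# Which PHQ-9 items each post already answers (toy labels).
answered = [
    {"phq3_sleep", "phq1_interest"},
    {"phq4_fatigue", "phq7_concentration"},
    {"phq3_sleep", "phq6_worthlessness"},
    {"phq5_appetite"},
]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(answered)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(posts, y)

# Rank items by predicted probability of already being answered.
probs = model.predict_proba(
    ["I can't sleep and nothing interests me anymore."])[0]
top = sorted(zip(mlb.classes_, probs), key=lambda t: -t[1])[:3]
print("most likely already-answered items:", top)
```

Items the model deems unanswered would then be handed to the generative DLM as targets for follow-up questions, which is the controlled-generation setting the paper studies.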
Abstract. Objective: The use of electronic health records (EHRs) for clinical risk prediction is on the rise. However, in many practical settings, the limited availability of task-specific EHR data can restrict the application of standard machine learning pipelines. In this study, we investigate the potential of leveraging language models (LMs) as a means to incorporate supplementary domain knowledge for improving the performance of various EHR-based risk prediction tasks. Methods: We propose two novel LM-based methods, namely "LLaMA2-EHR" and "Sent-e-Med." Our focus is on utilizing the textual descriptions within structured EHRs to make risk predictions about future diagnoses. We conduct a comprehensive comparison with previous approaches across various data types and sizes. Results: Experiments across 6 different methods and 3 separate risk prediction tasks reveal that employing LMs to represent structured EHRs, such as diagnostic histories, results in significant performance improvements when evaluated using standard metrics such as the area under the receiver operating characteristic (ROC) curve and the precision-recall (PR) curve. Additionally, LMs offer benefits such as few-shot learning, the ability to handle previously unseen medical concepts, and adaptability to various medical vocabularies. However, it is noteworthy that outcomes may exhibit sensitivity to the specific prompt. Conclusion: LMs encompass extensive embedded knowledge, making them valuable for the analysis of EHRs in the context of risk prediction. Nevertheless, it is important to exercise caution in their application, as ongoing safety concerns related to LMs persist and require continuous consideration.
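The core move, rendering structured EHR entries as text that a language model can read, can be illustrated as follows. A TF-IDF encoder stands in for the LM embedding (the paper's LLaMA2-EHR and Sent-e-Med models are not reproduced here), and the records, serialization format, and outcome labels are invented.

```python
# Sketch of serializing a coded diagnostic history into text for risk
# prediction. A TF-IDF + logistic regression pipeline stands in for the
# paper's LM-based encoders; all data below is fabricated for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def serialize_history(record: dict) -> str:
    """Turn coded visits into the textual descriptions an LM could read."""
    lines = [f"Patient, age {record['age']}."]
    for visit in record["visits"]:
        lines.append(f"Visit: diagnosed with {', '.join(visit)}.")
    return " ".join(lines)

records = [
    {"age": 67, "visits": [["type 2 diabetes"], ["diabetic nephropathy"]]},
    {"age": 45, "visits": [["seasonal allergies"]]},
    {"age": 71, "visits": [["hypertension"], ["chronic kidney disease"]]},
    {"age": 30, "visits": [["ankle sprain"]]},
]
labels = [1, 0, 1, 0]  # toy future-diagnosis outcomes

texts = [serialize_history(r) for r in records]
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

new = serialize_history(
    {"age": 69, "visits": [["type 2 diabetes"], ["hypertension"]]})
print("predicted risk:", round(model.predict_proba([new])[0, 1], 2))
```

Because the inputs are text, previously unseen diagnosis names still map onto the same representation space, which is one way to read the paper's few-shot and unseen-concept benefits.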
Deep learning models have demonstrated impressive accuracy in predicting acute kidney injury (AKI), a condition affecting up to 20% of ICU patients, yet their black-box nature prevents clinical adoption in high-stakes critical care settings. While existing interpretability methods such as SHAP, LIME, and attention mechanisms can identify important features, they fail to capture the temporal dynamics essential for clinical decision-making and cannot communicate when specific risk factors become critical in a patient's trajectory. This limitation is particularly problematic in the ICU, where the timing of interventions can significantly impact patient outcomes. We present a novel interpretable framework that brings temporal awareness to deep learning predictions for AKI. Our approach introduces three key innovations: (1) a latent convolutional concept bottleneck that learns clinically meaningful patterns from ICU time series without requiring manual concept annotation, leveraging Conv1D layers to capture localized temporal patterns such as sudden physiological changes; (2) Temporal Concept Tracing (TCT), a gradient-based method that identifies not only which risk factors matter but precisely when they become critical, addressing the fundamental question of temporal relevance missing from current XAI techniques; and (3) integration with MedAlpaca to generate structured, time-aware clinical explanations that translate model insights into actionable bedside guidance. We evaluate our framework on MIMIC-IV data, demonstrating that our approach outperforms the existing explainability frameworks Occlusion and LIME on comprehensiveness score, sufficiency score, and processing time. The proposed method also better captures risk-factor inflection points in patients' timelines compared with conventional concept bottleneck methods, including dense-layer and attention-mechanism variants. This work represents the first comprehensive solution for interpretable temporal deep learning in critical care that addresses both the what and the when of clinical risk factors. By making AKI predictions transparent and temporally contextualized, our framework bridges the gap between model accuracy and clinical utility, offering a path toward trustworthy AI deployment in time-sensitive healthcare settings.
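A minimal PyTorch sketch of the two architectural ideas, a Conv1D concept bottleneck and gradient-based temporal attribution in the spirit of TCT, is shown below. Layer sizes, the concept count, and the random input are invented; this is not the authors' model or their MIMIC-IV pipeline.

```python
# Minimal sketch (PyTorch) of a Conv1D concept bottleneck with gradient-based
# temporal attribution. Architecture and data are invented for illustration.

import torch
import torch.nn as nn

class ConceptBottleneckAKI(nn.Module):
    def __init__(self, n_features=8, n_concepts=4):
        super().__init__()
        # Conv1D captures localized temporal patterns (e.g., sudden changes).
        self.encoder = nn.Sequential(
            nn.Conv1d(n_features, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.to_concepts = nn.Linear(16, n_concepts)  # latent concept layer
        self.head = nn.Linear(n_concepts, 1)          # AKI risk from concepts

    def forward(self, x):                 # x: (batch, features, time)
        h = self.encoder(x).squeeze(-1)   # (batch, 16)
        concepts = torch.sigmoid(self.to_concepts(h))
        return self.head(concepts), concepts

model = ConceptBottleneckAKI()
x = torch.randn(1, 8, 48, requires_grad=True)  # 48 hourly ICU measurements

risk, concepts = model(x)
risk.sum().backward()

# Gradient magnitude per timestep: when the inputs mattered for the risk score.
saliency = x.grad.abs().sum(dim=1).squeeze(0)   # shape: (time,)
critical_hour = int(saliency.argmax())
print(f"risk logit={risk.item():.2f}, most influential hour={critical_hour}")
```

Here the input-gradient magnitude per timestep serves as a simple proxy for "when" a factor mattered; the paper's TCT traces gradients through the learned concept layer rather than raw inputs.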
Abstract. Medical devices play a crucial role in modern healthcare, providing innovative solutions for diagnosing, preventing, monitoring, and treating ailments. Artificial intelligence is transforming the field of medical devices, offering unprecedented opportunities through diagnostic accuracy, personalized treatment plans, and enhanced patient outcomes. This review outlines the applications of artificial-intelligence-based medical devices across healthcare specialties, especially dentistry, medical imaging, ophthalmology, mental health, autism spectrum disorder diagnosis, oncology, and general medicine. Specifically, the review highlights advancements such as improved diagnostic accuracy, tailored treatment planning, and enhanced clinical outcomes in the above-mentioned applications. Regulatory approval remains a key issue: medical devices must be approved or cleared by the United States Food and Drug Administration to establish their safety and efficacy. The regulatory guidance pathway for artificial-intelligence-based medical devices is presented, and the critical technical, ethical, and implementation challenges that must be addressed for large-scale adoption are discussed. The review concludes that the intersection of artificial intelligence with the medical device domain and internet-enabled or -enhanced technologies, such as biotechnology, nanotechnology, and personalized therapeutics, presents an enormous opportunity to accelerate customized, patient-centered care. By evaluating these advancements and challenges, the study aims to offer insights into the future trajectory of smart medical technologies and their role in advancing personalized, patient-centered care.