Title: ProKnow: Process knowledge for safety constrained and explainable question generation for mental health diagnostic assistance
Virtual Mental Health Assistants (VMHAs) are utilized in health care to provide patient services such as counseling and suggestive care. They are not used for patient diagnostic assistance because they cannot adhere to safety constraints and the specialized clinical process knowledge (ProKnow) used to obtain clinical diagnoses. In this work, we define ProKnow as an ordered set of information that maps to evidence-based guidelines or categories of conceptual understanding recognized by experts in a domain. We also introduce a new dataset of diagnostic conversations guided by safety constraints and the ProKnow that healthcare professionals use (ProKnow-data). We develop a method for natural language question generation (NLG) that collects diagnostic information from the patient interactively (ProKnow-algo). We demonstrate the limitations of using state-of-the-art large-scale language models (LMs) on this dataset. ProKnow-algo incorporates process knowledge by explicitly modeling safety, knowledge capture, and explainability. As computational metrics for evaluation do not directly translate to clinical settings, we involve expert clinicians in designing evaluation metrics that test four properties: safety, logical coherence, knowledge capture, and explainability, while minimizing the standard cross-entropy loss to preserve distribution-semantics-based similarity to the ground truth. LMs with ProKnow-algo generated 89% safer questions in the depression and anxiety domain (tested property: safety). Further, without ProKnow-algo, the generated questions did not adhere to the clinical process knowledge in ProKnow-data (tested property: knowledge capture). In comparison, ProKnow-algo-based generations yield a 96% reduction in our metrics that measure knowledge capture. The explainability of the generated questions is assessed by computing their similarity with concepts in depression and anxiety knowledge bases. Overall, irrespective of the type of LM, ProKnow-algo achieved an average 82% improvement over simple pre-trained LMs on safety, explainability, and process-guided question generation. For reproducibility, we will make ProKnow-data and the code repository of ProKnow-algo publicly available upon acceptance.
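As an illustration of the kind of process-guided, safety-aware generation the abstract describes, the sketch below re-ranks candidate questions against an ordered process-knowledge step using placeholder safety and concept-overlap scorers. This is a minimal, hypothetical Python example, not the authors' ProKnow-algo implementation; the data structures, the 0.5/0.5 weighting, and the toy scorers are assumptions for illustration only.

```python
# Illustrative sketch only: a simplified re-ranking loop in the spirit of
# process-knowledge-guided question generation. The scoring functions and the
# candidate pool below are hypothetical placeholders, not ProKnow-algo itself.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ProcessStep:
    concept: str     # e.g., a clinical concept such as "sleep disturbance"
    guideline: str   # evidence-based guideline text the question should follow

def select_question(
    candidates: List[str],
    step: ProcessStep,
    safety_score: Callable[[str], float],          # higher = safer (placeholder)
    concept_overlap: Callable[[str, str], float],  # similarity to the step concept
) -> str:
    """Pick the candidate that best balances safety and knowledge capture."""
    def score(q: str) -> float:
        return 0.5 * safety_score(q) + 0.5 * concept_overlap(q, step.concept)
    return max(candidates, key=score)

if __name__ == "__main__":
    step = ProcessStep("sleep disturbance",
                       "Ask about sleep quality over the past two weeks.")
    candidates = [
        "How have you been sleeping over the past two weeks?",
        "Why can't you just sleep like everyone else?",
    ]
    # Toy scorers for the demo: penalize accusatory phrasing, reward concept words.
    safety = lambda q: 0.0 if "why can't you" in q.lower() else 1.0
    overlap = lambda q, c: float(any(w in q.lower() for w in c.split()))
    print(select_question(candidates, step, safety, overlap))
```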
Award ID(s):
2133842 1761931
PAR ID:
10429255
Author(s) / Creator(s):
Date Published:
Journal Name:
Frontiers in Big Data
Volume:
5
ISSN:
2624-909X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Language models have the potential to assess mental health using social media data. By analyzing online posts and conversations, these models can detect patterns indicating mental health conditions like depression, anxiety, or suicidal thoughts. They examine keywords, language markers, and sentiment to gain insights into an individual's mental well-being. This information is crucial for early detection, intervention, and support, improving mental health care and prevention strategies. However, using language models for mental health assessments from social media has two limitations: (1) they do not compare posts against clinicians' diagnostic processes, and (2) it is challenging to explain language model outputs using concepts that the clinician can understand, i.e., clinician-friendly explanations. In this study, we introduce Process Knowledge-infused Learning (PK-iL), a new learning paradigm that layers clinical process knowledge structures on language model outputs, enabling clinician-friendly explanations of the underlying language model predictions. We rigorously test our methods on existing benchmark datasets, augmented with such clinical process knowledge, and release a new dataset for assessing suicidality. PK-iL performs competitively, achieving a 70% agreement with users, while other XAI methods only achieve 47% agreement (average inter-rater agreement of 0.72). Our evaluations demonstrate that PK-iL effectively explains model predictions to clinicians.
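The sketch below illustrates the general idea of layering questionnaire-style process knowledge over model scores so that the explanation is phrased in clinician-familiar items. It is a hypothetical simplification, not the published PK-iL algorithm; the item texts, the 0.5 threshold, and the aggregation rule are assumptions for illustration.

```python
# Minimal sketch, not the published PK-iL implementation: it only illustrates
# routing a prediction through questionnaire items so the explanation uses
# concepts a clinician already knows. Items and thresholds are placeholders.
from typing import Dict, List, Tuple

QUESTIONNAIRE: List[str] = [
    "Little interest or pleasure in doing things",
    "Feeling down, depressed, or hopeless",
    "Trouble falling or staying asleep",
]

def explainable_prediction(item_scores: Dict[str, float],
                           threshold: float = 0.5) -> Tuple[str, List[str]]:
    """Aggregate per-item scores (e.g., from an LM classifier) into a label
    plus an explanation listing the clinical items that drove the decision."""
    triggered = [item for item, s in item_scores.items() if s >= threshold]
    label = "at-risk" if len(triggered) >= 2 else "not-at-risk"
    return label, triggered

label, why = explainable_prediction({
    QUESTIONNAIRE[0]: 0.81,
    QUESTIONNAIRE[1]: 0.64,
    QUESTIONNAIRE[2]: 0.22,
})
print(label, why)  # e.g., "at-risk" with the two triggered items as the explanation
```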
  2. Conversational Agents (CAs) powered by deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, CAs have been used to provide informational or therapeutic services (e.g., cognitive behavioral therapy) to patients. However, the utility of CAs for assisting in mental health triaging has not been explored in existing work, as it requires the controlled generation of follow-up questions (FQs), which are often initiated and guided by mental health professionals (MHPs) in clinical settings. In the context of 'depression', our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest-common-subsequence matches to questions in the PHQ-9 dataset, respectively, when compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9-based dataset, PRIMATE, in collaboration with MHPs. PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user's initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user's post and which ones would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs towards controlled FQ generation (with minimal hallucination) suitable for aiding triaging. The dataset created as part of this research can be obtained from https://github.com/primate-mh/Primate2022
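A minimal sketch of the evaluation setting described above, assuming a multi-label formulation in which each PHQ-9 item is flagged as already answered (1) or not (0) in a post, with per-item MCC summarizing classifier quality. The labels and predictions below are made up; this is not the PRIMATE training or evaluation code.

```python
# Hedged sketch of a per-question MCC evaluation over PHQ-9 "already answered"
# labels. All data here is synthetic and for illustration only.
import numpy as np
from sklearn.metrics import matthews_corrcoef

N_QUESTIONS = 9  # PHQ-9 items

# y_true / y_pred: shape (num_posts, 9); 1 = "already answered in the post".
y_true = np.array([[1, 0, 0, 1, 0, 0, 1, 0, 0],
                   [0, 1, 0, 0, 0, 1, 0, 0, 1]])
y_pred = np.array([[1, 0, 0, 0, 0, 0, 1, 0, 0],
                   [0, 1, 1, 0, 0, 1, 0, 0, 1]])

per_question_mcc = [
    matthews_corrcoef(y_true[:, q], y_pred[:, q]) for q in range(N_QUESTIONS)
]
print(np.round(per_question_mcc, 2))
```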
  3. Abstract Objective: The use of electronic health records (EHRs) for clinical risk prediction is on the rise. However, in many practical settings, the limited availability of task-specific EHR data can restrict the application of standard machine learning pipelines. In this study, we investigate the potential of leveraging language models (LMs) as a means to incorporate supplementary domain knowledge for improving the performance of various EHR-based risk prediction tasks. Methods: We propose two novel LM-based methods, namely "LLaMA2-EHR" and "Sent-e-Med." Our focus is on utilizing the textual descriptions within structured EHRs to make risk predictions about future diagnoses. We conduct a comprehensive comparison with previous approaches across various data types and sizes. Results: Experiments across 6 different methods and 3 separate risk prediction tasks reveal that employing LMs to represent structured EHRs, such as diagnostic histories, results in significant performance improvements when evaluated using standard metrics such as area under the receiver operating characteristic (ROC) curve and precision-recall (PR) curve. Additionally, they offer benefits such as few-shot learning, the ability to handle previously unseen medical concepts, and adaptability to various medical vocabularies. However, it is noteworthy that outcomes may exhibit sensitivity to a specific prompt. Conclusion: LMs encompass extensive embedded knowledge, making them valuable for the analysis of EHRs in the context of risk prediction. Nevertheless, it is important to exercise caution in their application, as ongoing safety concerns related to LMs persist and require continuous consideration.
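The sketch below shows one plausible way to serialize a structured diagnostic history into text for an LM-based risk model and to score predictions with AUROC, as the abstract describes. The code-to-description mapping, the risk scores, and the labels are hypothetical placeholders; this is not the LLaMA2-EHR or Sent-e-Med implementation.

```python
# Illustrative sketch only: turning structured diagnosis codes into natural
# language for an LM-based risk model, then evaluating with AUROC. All values
# are synthetic placeholders.
from typing import Dict, List
from sklearn.metrics import roc_auc_score

ICD_DESCRIPTIONS: Dict[str, str] = {  # placeholder vocabulary
    "E11.9": "type 2 diabetes mellitus without complications",
    "I10":   "essential (primary) hypertension",
}

def serialize_history(codes: List[str]) -> str:
    """Render a visit's diagnosis codes as natural language text for an LM."""
    phrases = [ICD_DESCRIPTIONS.get(c, c) for c in codes]
    return "Patient history: " + "; ".join(phrases) + "."

texts = [serialize_history(["E11.9", "I10"]), serialize_history(["I10"])]

# In a real pipeline an LM would score these texts; here the scores are faked.
risk_scores = [0.82, 0.31]
labels = [1, 0]  # future-diagnosis outcome
print(texts[0])
print("AUROC:", roc_auc_score(labels, risk_scores))
```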
  4. Abstract Medical devices play a crucial role in modern healthcare, providing innovative solutions for diagnosing, preventing, monitoring, and treating ailments. Artificial intelligence is transforming the field of medical devices, offering unprecedented opportunities through improved diagnostic accuracy, personalized treatment plans, and enhanced patient outcomes. This review outlines the applications of artificial intelligence-based medical devices across healthcare specialties, especially dentistry, medical imaging, ophthalmology, mental health, autism spectrum disorder diagnosis, oncology, and general medicine. Specifically, the review highlights advancements such as improved diagnostic accuracy, tailored treatment planning, and enhanced clinical outcomes in the above-mentioned applications. Regulatory approval remains a key issue: medical devices must be approved or cleared by the United States Food and Drug Administration to establish their safety and efficacy. The regulatory guidance pathway for artificial intelligence-based medical devices is presented, and the critical technical, ethical, and implementation challenges that must be addressed for large-scale adoption are discussed. The review concludes that the intersection of artificial intelligence with the medical device domain and internet-enabled or enhanced technologies, such as biotechnology, nanotechnology, and personalized therapeutics, presents an enormous opportunity to accelerate customized, patient-centered care. By evaluating these advancements and challenges, the study aims to offer insights into the future trajectory of smart medical technologies and their role in advancing personalized, patient-centered care.
  5. Abstract Although digital health solutions are increasingly popular in clinical psychiatry, one application that has not been fully explored is the utilization of survey technology to monitor patients outside of the clinic. Supplementing routine care with digital information collected in the “clinical whitespace” between visits could improve care for patients with severe mental illness. This study evaluated the feasibility and validity of using online self-report questionnaires to supplement in-person clinical evaluations in persons with and without psychiatric diagnoses. We performed a rigorous in-person clinical diagnostic and assessment battery in 54 participants with schizophrenia (N = 23), depressive disorder (N = 14), and healthy controls (N = 17) using standard assessments for depressive and psychotic symptomatology. Participants were then asked to complete brief online assessments of depressive (Quick Inventory of Depressive Symptomatology) and psychotic (Community Assessment of Psychic Experiences) symptoms outside of the clinic for comparison with the ground-truth in-person assessments. We found that online self-report ratings of severity were significantly correlated with the clinical assessments for depression (two assessments used: R = 0.63, p < 0.001; R = 0.73, p < 0.001) and psychosis (R = 0.62, p < 0.001). Our results demonstrate the feasibility and validity of collecting psychiatric symptom ratings through online surveys. Surveillance of this kind may be especially useful in detecting acute mental health crises between patient visits and can generally contribute to more comprehensive psychiatric treatment. 
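For reference, the correlation analysis reported above amounts to a Pearson correlation between remote self-report severity scores and in-clinic ratings. A minimal sketch with synthetic numbers (not study data) follows; the variable names and values are assumptions for illustration.

```python
# Minimal sketch: Pearson correlation between online self-report scores and
# clinician-rated severity. All numbers below are synthetic placeholders.
from scipy.stats import pearsonr

clinic_scores = [4, 9, 13, 18, 22, 7, 15, 11]   # e.g., clinician-rated severity
online_scores = [5, 8, 14, 17, 20, 9, 13, 12]   # e.g., self-report completed at home

r, p = pearsonr(clinic_scores, online_scores)
print(f"R = {r:.2f}, p = {p:.3g}")
```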