Title: The Impact of Pause and Filler Word Encoding on Dementia Detection with Contrastive Learning
Dementia is primarily caused by neurodegenerative diseases like Alzheimer’s disease (AD). It affects millions worldwide, making detection and monitoring crucial. This study focuses on the detection of dementia from speech transcripts of control and dementia groups. We propose encoding in-text pauses and filler words (e.g., “uh” and “um”) in text-based language models and thoroughly evaluate their impact on performance (e.g., accuracy). Additionally, we suggest using contrastive learning to improve performance in a multi-task framework. Our results demonstrate the effectiveness of our approaches in enhancing the model’s performance, achieving 87% accuracy and an 86% F1-score. Compared to the state of the art, our approach has similar performance despite having significantly fewer parameters. This highlights the importance of pause and filler word encoding for the detection of dementia.
Award ID(s):
2037328 1915599 1160483
PAR ID:
10560979
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Applied Sciences
Volume:
14
Issue:
19
ISSN:
2076-3417
Page Range / eLocation ID:
8879
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: Mild cognitive impairment (MCI) can be an early sign of Alzheimer’s disease and other types of dementia detectable through gait analysis. Curve walking, which demands greater cognitive and motor skills, may be more sensitive in MCI detection than straight walking. However, few studies have compared gait performance in older adults with and without MCI in these conditions. Objective: To compare the capability of curve and straight walking tests for the detection of MCI among older adults. Methods: We employed a Kinect v.2 camera to record the gait of 55 older adults (30 healthy controls, 25 with MCI) during single-task straight and curve walking tests. We examined 50 gait markers and conducted statistical analyses to compare groups and conditions. The trial was approved with protocol No. IR.SEMUMS.REC.1398.237 by the ethics committee of Semnan University of Medical Sciences in Iran. Results: Older adults with MCI exhibited more compromised gait performance, particularly during curve walking. Curve walking outperformed straight walking in MCI detection, with several gait markers showing significant differences between healthy controls and MCI patients. These markers encompass average velocity, cadence, temporal markers (e.g., gait cycle subphase durations), spatial markers (e.g., foot position changes during gait subphases), and spatiotemporal markers (e.g., step and stride velocities). Conclusions: Our study suggests curve walking as a more informative and challenging test for MCI detection among older adults, facilitating early diagnosis using non-invasive, cost-effective tools like the Kinect v.2 camera, complementing cognitive assessments in early diagnosis, and tracking MCI progression to dementia.
  2. Recent reports of bias in multimedia algorithms (e.g., lower accuracy of face detection for women and persons of color) have underscored the urgent need to devise approaches that work equally well for different demographic groups. Hence, we posit that ensuring fairness in multimodal cyberbullying detectors (e.g., equal performance irrespective of the gender of the victim) is an important research challenge. We propose a fairness-aware fusion framework that ensures that both fairness and accuracy remain important considerations when combining data coming from multiple modalities. In this Bayesian fusion framework, the inputs coming from different modalities are combined in a way that is cognizant of the different confidence levels associated with each feature and the interdependencies between features. Specifically, this framework assigns weights to different modalities based not just on their accuracy but also on their fairness. Results of applying the framework to a multimodal (visual + text) cyberbullying detection problem demonstrate the value of the proposed framework in ensuring both accuracy and fairness.
  3. Large Language Models have excelled at encoding and leveraging language patterns in large text-based corpora for various tasks, including spatiotemporal event-based question answering (QA). However, because they encode only a text-based projection of the world, they have also been shown to lack a full-bodied understanding of such events, e.g., a sense of intuitive physics and of cause-and-effect relationships among events. In this work, we propose using causal event graphs (CEGs) to enhance language understanding of spatiotemporal events in language models, using a novel approach that also provides proofs for the model’s capture of the CEGs. A CEG consists of nodes that denote events and edges that denote cause-and-effect relationships among the events. We evaluate our approach on benchmark spatiotemporal QA tasks and show effective performance, both quantitative and qualitative, over state-of-the-art baseline methods.
  4. Saif, Mehrdad (Ed.)
    This study explores cutting-edge computational technologies and intelligent methods to create realistic synthetic data, focusing on dementia-centric Magnetic Resonance Imaging (MRI) scans related to Alzheimer’s and Parkinson’s diseases. The research delves into Generative Adversarial Networks (GANs), Variational Autoencoders, and Diffusion Models, comparing their efficacy in generating synthetic MRI scans. Using datasets from Alzheimer’s and Parkinson’s patients, the study reveals intriguing findings. In the Alzheimer’s dataset, diffusion models produced non-dementia images with the lowest Fréchet Inception Distance (FID) score at 92.46, while data-efficient GANs excelled in generating dementia images with an FID score of 178.53. In the Parkinson’s dataset, data-efficient GANs achieved remarkable FID scores of 102.71 for dementia images and 129.77 for non-dementia images. The study also introduces a novel aspect by incorporating a classification study, validating the generative metrics. DenseNets, a deep learning architecture, exhibited superior performance in disease detection compared to ResNets. Training both models on images generated by diffusion models further improved results, with DenseNet achieving accuracies of 80.84% and 92.42% in Alzheimer’s and Parkinson’s disease detection, respectively. The research not only presents innovative generative architectures but also emphasizes the importance of classification metrics, providing valuable insights into the synthesis and detection of neurodegenerative diseases through advanced computational techniques.
  5. Large Language Models (LLMs) have shown unprecedented performance in various real-world applications. However, they are known to generate factually inaccurate outputs, a.k.a. the hallucination problem. In recent years, incorporating external knowledge extracted from Knowledge Graphs (KGs) has become a promising strategy to improve the factual accuracy of LLM-generated outputs. Nevertheless, most existing explorations rely on LLMs themselves to perform KG knowledge extraction, which is highly inflexible, as LLMs can only provide a binary judgment on whether a given piece of knowledge (e.g., a knowledge path in a KG) should be used. In addition, LLMs tend to pick only knowledge with a direct semantic relationship to the input text, while potentially useful knowledge with indirect semantics can be ignored. In this work, we propose a principled framework, KELP, with three stages to handle the above problems. Specifically, KELP achieves finer-grained, more flexible knowledge extraction by generating scores for knowledge paths against input texts via latent semantic matching. Meanwhile, knowledge paths with indirect semantic relationships to the input text can also be considered via a trained encoding between the selected paths in the KG and the input text. Experiments on real-world datasets validate the effectiveness of KELP.