Abstract Introduction: Automated computational assessment of neuropsychological tests would enable widespread, cost‐effective screening for dementia. Methods: A novel natural language processing approach is developed and validated to identify different stages of dementia based on automated transcription of digital voice recordings of subjects’ neuropsychological tests conducted by the Framingham Heart Study (n = 1084). Transcribed sentences from the test were encoded into quantitative data and several models were trained and tested using these data and the participants’ demographic characteristics. Results: Average area under the curve (AUC) on the held‐out test data reached 92.6%, 88.0%, and 74.4% for differentiating Normal cognition from Dementia, Normal or Mild Cognitive Impairment (MCI) from Dementia, and Normal from MCI, respectively. Discussion: The proposed approach offers a fully automated identification of MCI and dementia based on a recorded neuropsychological test, providing an opportunity to develop a remote screening tool that could be adapted easily to any language.
                    This content will become publicly available on July 17, 2026
                            
                            Developing an accessible dementia assessment tool: Leveraging a residual network, the trail making test, and demographic data
                        
                    
    
            Background: The global burden of Alzheimer's disease and related dementias is rapidly increasing, particularly in low- and middle-income countries where access to specialized healthcare is limited. Neuropsychological tests are essential diagnostic tools, but their administration requires trained professionals, creating screening barriers. Automated computational assessment presents a cost-effective solution for global dementia screening. Objective: To develop and validate an artificial intelligence-based screening tool using the Trail Making Test (TMT), demographic information, completion times, and drawing analysis for enhanced dementia detection. Methods: We developed: (1) non-image models using demographics and TMT completion times, (2) image-only models, and (3) fusion models. Models were trained and validated on data from the Framingham Heart Study (FHS) (N = 1252), the Long Life Family Study (LLFS) (N = 1613), and the combined cohort (N = 2865). Results: Our models, integrating TMT drawings, demographics, and completion times, excelled in distinguishing dementia from normal cognition. In the LLFS cohort, we achieved an Area Under the Receiver Operating Characteristic Curve (AUC) of 98.62%, with sensitivity/specificity of 87.69%/98.26%. In the FHS cohort, we obtained an AUC of 96.51%, with sensitivity/specificity of 85.00%/96.75%. Conclusions: Our method demonstrated superior performance compared to traditional approaches using age and TMT completion time. Adding images captures subtler nuances from the TMT drawing that traditional methods miss. Integrating the TMT drawing into cognitive assessments enables effective dementia screening. Future studies could aim to expand data collection to include more diverse cohorts, particularly from less-resourced regions.
- PAR ID: 10618452
- Publisher / Repository: Sage Journals
- Date Published:
- Journal Name: Journal of Alzheimer’s Disease
- ISSN: 1387-2877
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Abstract INTRODUCTION: Identification of individuals with mild cognitive impairment (MCI) who are at risk of developing Alzheimer's disease (AD) is crucial for early intervention and selection of clinical trials. METHODS: We applied natural language processing techniques along with machine learning methods to develop a method for automated prediction of progression to AD within 6 years using speech. The study design was evaluated on the neuropsychological test interviews of n = 166 participants from the Framingham Heart Study, comprising 90 progressive MCI and 76 stable MCI cases. RESULTS: Our best models, which used features generated from speech data, as well as age, sex, and education level, achieved an accuracy of 78.5% and a sensitivity of 81.1% to predict MCI‐to‐AD progression within 6 years. DISCUSSION: The proposed method offers a fully automated procedure, providing an opportunity to develop an inexpensive, broadly accessible, and easy‐to‐administer screening tool for MCI‐to‐AD progression prediction, facilitating development of remote assessment. Highlights: Voice recordings from neuropsychological exams coupled with basic demographics can lead to strong predictive models of progression to dementia from mild cognitive impairment. The study leveraged AI methods for speech recognition and processed the resulting text using language models. The developed AI‐powered pipeline can lead to fully automated assessment that could enable remote and cost‐effective screening and prognosis for Alzheimer's disease.
- 
            Bondi, Mark (Ed.) Background: The advantages of digital clock drawing metrics for dementia subtype classification need examination. Objective: To assess how well kinematic, time-based, and visuospatial features extracted from the digital Clock Drawing Test (dCDT) can classify a combined group of Alzheimer’s disease/Vascular Dementia patients versus healthy controls (HC), and classify dementia patients with Alzheimer’s disease (AD) versus vascular dementia (VaD). Methods: Healthy, community-dwelling control participants (n = 175), patients diagnosed clinically with Alzheimer’s disease (n = 29), and vascular dementia (n = 27) completed the dCDT to command and copy clock drawing conditions. Thirty-seven dCDT command and 37 copy dCDT features were extracted and used with Random Forest classification models. Results: When HC participants were compared to participants with dementia, optimal area under the curve was achieved using models that combined both command and copy dCDT features (AUC = 91.52%). Similarly, when AD versus VaD participants were compared, optimal area under the curve was achieved with models that combined both command and copy features (AUC = 76.94%). Subsequent follow-up analyses of a corpus of 10 variables of interest determined using a Gini Index found that groups could be dissociated based on kinematic, time-based, and visuospatial features. Conclusion: The dCDT is able to operationally define graphomotor output that cannot be measured using traditional paper and pencil test administration in older healthy controls and participants with dementia. These data suggest that kinematic, time-based, and visuospatial behavior obtained using the dCDT may provide additional neurocognitive biomarkers that may be able to identify and track dementia syndromes.
- 
            OBJECTIVE: To determine biomarkers other than CA 125 that could be used in identifying early-stage ovarian cancer. DATA SOURCES: Ovid MEDLINE ALL, EMBASE, Web of Science Core Collection, ScienceDirect, Clinicaltrials.gov, and CAB Direct were searched for English-language studies between January 2008 and April 2023 for the concepts of high-grade serous ovarian cancer, testing, and prevention or early diagnosis. METHODS OF STUDY SELECTION: The 5,523 related articles were uploaded to Covidence. Screening by two independent reviewers of the article abstracts led to the identification of 245 peer-reviewed primary research articles for full-text review. Full-text review by those reviewers led to the identification of 131 peer-reviewed primary research articles used for this review. TABULATION, INTEGRATION, AND RESULTS: Of 131 studies, only 55 reported sensitivity, specificity, or area under the curve (AUC), with 36 of the studies reporting at least one biomarker with a specificity of 80% or greater or an AUC of 0.9 or greater. CONCLUSION: These findings suggest that although many types of biomarkers are being tested in ovarian cancer, most have similar or worse detection rates compared with CA 125 and have the same limitations of poor detection rates in early-stage disease. However, 27.5% of articles (36/131) reported biomarkers with better sensitivity and an AUC greater than 0.9 compared with CA 125 alone and deserve further exploration.
- 
            Background: Risk-based screening for lung cancer is currently being considered in several countries; however, the optimal approach to determine eligibility remains unclear. Ensemble machine learning could support the development of highly parsimonious prediction models that maintain the performance of more complex models while maximising simplicity and generalisability, supporting the widespread adoption of personalised screening. In this work, we aimed to develop and validate ensemble machine learning models to determine eligibility for risk-based lung cancer screening. Methods and findings: For model development, we used data from 216,714 ever-smokers recruited between 2006 and 2010 to the UK Biobank prospective cohort and 26,616 high-risk ever-smokers recruited between 2002 and 2004 to the control arm of the US National Lung Screening (NLST) randomised controlled trial. The NLST trial randomised high-risk smokers from 33 US centres with at least a 30 pack-year smoking history and fewer than 15 quit-years to annual CT or chest radiography screening for lung cancer. We externally validated our models among 49,593 participants in the chest radiography arm and all 80,659 ever-smoking participants in the US Prostate, Lung, Colorectal and Ovarian (PLCO) Screening Trial. The PLCO trial, recruiting from 1993 to 2001, analysed the impact of chest radiography or no chest radiography for lung cancer screening. We primarily validated in the PLCO chest radiography arm such that we could benchmark against comparator models developed within the PLCO control arm. Models were developed to predict the risk of 2 outcomes within 5 years from baseline: diagnosis of lung cancer and death from lung cancer. We assessed model discrimination (area under the receiver operating characteristic curve, AUC), calibration (calibration curves and expected/observed ratio), overall performance (Brier scores), and net benefit with decision curve analysis. Models predicting lung cancer death (UCL-D) and incidence (UCL-I) using 3 variables (age, smoking duration, and pack-years) achieved or exceeded parity in discrimination, overall performance, and net benefit with comparators currently in use, despite requiring only one-quarter of the predictors. In external validation in the PLCO trial, UCL-D had an AUC of 0.803 (95% CI: 0.783, 0.824) and was well calibrated with an expected/observed (E/O) ratio of 1.05 (95% CI: 0.95, 1.19). UCL-I had an AUC of 0.787 (95% CI: 0.771, 0.802) and an E/O ratio of 1.0 (95% CI: 0.92, 1.07). The sensitivity of UCL-D was 85.5% and UCL-I was 83.9%, at 5-year risk thresholds of 0.68% and 1.17%, respectively, 7.9% and 6.2% higher than the USPSTF-2021 criteria at the same specificity. The main limitation of this study is that the models have not been validated outside of UK and US cohorts. Conclusions: We present parsimonious ensemble machine learning models to predict the risk of lung cancer in ever-smokers, demonstrating a novel approach that could simplify the implementation of risk-based lung cancer screening in multiple settings.
 An official website of the United States government