Title: Differentiating ischemic stroke patients from healthy subjects using a large-scale, retrospective EEG database and machine learning methods
Objectives: We set out to develop a machine learning model capable of distinguishing patients presenting with ischemic stroke from a healthy cohort of subjects. The model relies on a 3-min resting electroencephalogram (EEG) recording from which features can be computed. Materials and methods: Using a large-scale, retrospective database of EEG recordings and matching clinical reports, we were able to construct a dataset of 1385 healthy subjects and 374 stroke patients. With subjects often producing more than one recording per session, the final dataset consisted of 2401 EEG recordings (63% healthy, 37% stroke). Results: Using a rich set of features encompassing both the spectral and temporal domains, our model yielded an AUC of 0.95, with a sensitivity and specificity of 93% and 86%, respectively. Allowing for multiple recordings per subject in the training set boosted sensitivity by 7%, attributable to a more balanced dataset. Conclusions: Our work demonstrates strong potential for the use of EEG in conjunction with machine learning methods to distinguish stroke patients from healthy subjects. Our approach provides a solution that is not only timely (3-minute recording time) but also highly precise and accurate (AUC: 0.95). Keywords: Electroencephalogram (EEG); Feature engineering; Ischemic stroke; Large vessel occlusion; Machine learning; Prehospital stroke scale.
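As a rough illustration of the three metrics reported in the abstract above (AUC, sensitivity, specificity), the following sketch computes them for a binary stroke-versus-healthy classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for the example.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity); label 1 = stroke, 0 = healthy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # stroke, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # stroke, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and classifier scores (NOT data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
# Hypothetical decision threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a study may report all three.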
Award ID(s):
2213156
PAR ID:
10540688
Publisher / Repository:
J Stroke Cerebrovasc Dis
Journal Name:
Journal of Stroke and Cerebrovascular Diseases
Volume:
33
Issue:
6
ISSN:
1052-3057
Page Range / eLocation ID:
107714
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: Atrial fibrillation (AF) is often asymptomatic and thus under-observed. Given the high risks of stroke and heart failure among patients with AF, early prediction and effective management are crucial. Importantly, obstructive sleep apnea is highly prevalent among AF patients (60–90%); therefore, electrocardiogram (ECG) analysis from polysomnography (PSG), a standard diagnostic tool for subjects with suspected sleep apnea, presents a unique opportunity for the early prediction of AF. Our goal is to identify individuals at high risk of developing AF in the future from a single-lead ECG recorded during standard PSGs. Methods: We analyzed 18,782 single-lead ECG recordings from 13,609 subjects at Massachusetts General Hospital, identifying AF presence using ICD-9/10 codes in medical records. Our dataset comprises 15,913 recordings without a medical record for AF and 2,056 recordings from patients who were first diagnosed with AF between 1 day and 15 years after the PSG recording. The PSG data were partitioned into training, validation, and test cohorts. In the first phase, a signal quality index (SQI) was calculated in 30-second windows, and windows with SQI < 0.95 were removed. From each remaining window, 150 hand-crafted features were extracted from the time, frequency, and time-frequency domains and from phase-space reconstructions of the ECG. Twelve statistical features summarized these window-specific features per recording, resulting in 1,800 features. We then used transfer learning to update a pre-trained deep neural network from the PhysioNet Challenge 2021 to discriminate between recordings with and without AF, using the same Challenge data. The model was applied to the PSG ECGs in 16-second windows to generate the probability of AF for each window. From the resultant probability sequence, 13 statistical features were extracted. Subsequently, we trained a shallow neural network to predict future AF using the extracted ECG and probability features. Results: On the test set, our model demonstrated a sensitivity of 0.67, specificity of 0.81, and precision of 0.3 for predicting AF. Further, survival analysis for AF outcomes, using the log-rank test, revealed a hazard ratio of 8.36 (p-value of 1.93 × 10⁻⁵²). Conclusions: Our proposed ECG analysis method, utilizing overnight PSG data, shows promise in AF prediction despite a modest precision indicating the presence of false positive cases. This approach could potentially enable low-cost screening and proactive treatment for high-risk patients. Ongoing refinement, such as integrating additional physiological parameters, could significantly reduce false positives, enhancing its clinical utility and accuracy.
  2. As our population ages, neurological impairments and degeneration of the musculoskeletal system yield gait abnormalities, which can significantly reduce quality of life. Gait rehabilitative therapy has been widely adopted to help patients maximize community participation and living independence. To further improve the precision and efficiency of rehabilitative therapy, more objective methods need to be developed based on sensory data. In this paper, an algorithmic framework is proposed to classify gait disorders caused by two common neurological diseases, stroke and Parkinson's Disease (PD), from ground contact force (GCF) data. An advanced machine learning method, multi-task feature learning (MTFL), is used to jointly train classification models of a subject's gait in three classes: post-stroke, PD, and healthy gait. Gait parameters related to mobility, balance, strength, and rhythm are used as features for the classification. Out of all the features used, the MTFL models capture the more important ones per disease, which will help provide better objective assessment and therapy progress tracking. To evaluate the proposed methodology, we use data from a human participant study, which includes five PD patients, three post-stroke patients, and three healthy subjects. Despite the diversity of abnormalities, the evaluation shows that the proposed approach can successfully distinguish post-stroke and PD gait from healthy gait, as well as post-stroke from PD gait, with an Area Under the Curve (AUC) score of at least 0.96. Moreover, the methodology helps select important gait features to better understand the key characteristics that distinguish abnormal gaits and to design personalized treatment.
  3. Objective: Sudden unexpected death in epilepsy (SUDEP) is the leading cause of epilepsy-related mortality. Although much effort has been devoted to identifying clinical risk factors for SUDEP in the literature, there are few validated methods to predict individual SUDEP risk. Prolonged postictal EEG suppression (PGES) is a potential SUDEP biomarker, but its occurrence is infrequent and requires epilepsy monitoring unit admission. We use machine learning methods to examine SUDEP risk using interictal EEG and ECG recordings from SUDEP cases and matched living epilepsy controls. Methods: This multicenter, retrospective cohort study examined interictal EEG and ECG recordings from 30 SUDEP cases and 58 age-matched living epilepsy patient controls. We trained machine learning models with interictal EEG and ECG features to predict the retrospective SUDEP risk for each patient. We assessed cross-validated classification accuracy and the area under the receiver operating characteristic curve (AUC). Results: The logistic regression (LR) classifier produced the best overall performance, outperforming the support vector machine (SVM), random forest (RF), and convolutional neural network (CNN). Among the 30 patients with SUDEP [14 females; mean age (SD), 31 (8.47) years] and 58 living epilepsy controls [26 females (43%); mean age (SD), 31 (8.5) years], the LR model achieved a median AUC of 0.77 [interquartile range (IQR), 0.73–0.80] in five-fold cross-validation using the interictal alpha and low gamma power ratio of the EEG and heart rate variability (HRV) features extracted from the ECG. The LR model achieved a mean AUC of 0.79 in leave-one-center-out prediction. Conclusions: Our results support the view that machine learning-driven models may quantify SUDEP risk for epilepsy patients. Future refinements in our model may help predict individualized SUDEP risk and help clinicians correlate predictive scores with the clinical data. Low-cost and noninvasive interictal biomarkers of SUDEP risk may help clinicians to identify high-risk patients and initiate preventive strategies.
  4. Sleep is known to promote recovery post-stroke. However, there is a paucity of data profiling sleep oscillations in the post-stroke human brain. Recent rodent work showed that resurgence of physiologic spindles coupled to sleep slow oscillations (SOs) and concomitant decrease in pathological delta (δ) waves is associated with sustained motor performance gains during stroke recovery. The goal of this study was to evaluate bilaterality of non-rapid eye movement (NREM) sleep oscillations (namely SOs, δ-waves, spindles, and their nesting) in post-stroke patients vs. healthy control subjects. We analyzed NREM-marked electroencephalography (EEG) data in hospitalized stroke patients (n = 5) and healthy subjects (n = 3). We used a laterality index to evaluate symmetry of NREM oscillations across hemispheres. We found that stroke subjects had pronounced asymmetry in the oscillations, with a predominance of SOs, δ-waves, spindles, and nested spindles in the affected hemisphere, when compared to the healthy subjects. Recent preclinical work classified SO-nested spindles as restorative post-stroke and δ-wave-nested spindles as pathological. We found that the ratio of the SO-nested spindles laterality index to the δ-wave-nested spindles laterality index was lower in stroke subjects. Using linear mixed models (which included random effects of concurrent pharmacologic drugs), we found large and medium effect sizes for δ-wave-nested spindles and SO-nested spindles, respectively. Our results in this pilot study indicate that considering the laterality index of NREM oscillations might be a useful metric for assessing recovery post-stroke and that factoring in pharmacologic drugs may be important when targeting sleep modulation for neurorehabilitation post-stroke.
  5. This paper investigates scalp electroencephalogram (EEG) data from 14 subjects with unilateral prefrontal cortex (pFC) lesions and 20 healthy controls during lateral visuospatial working memory (WM) tasks. The goal is to differentiate the brain networks involved in WM processing between these groups. The EEG recordings are transformed into graph signals, with proximity-weighted brain connectivity measures as edges and centrality measures as nodal features. Graph convolutional network (GCN) layers are used for feature representation, followed by a fully connected layer for classification. The GCN-based model effectively handles nine classification tasks, showing that graph-based network representation is versatile for describing brain interactions. The sparse graph model based on the mutual information-guided Granger causality index (MI-GCI) effectively captures the functional segregation of distinct WM tasks. The classifier using MI-GCI with the top 20% of edges matched prior classification performance with 67% fewer parameters and 80% less graph density, identifying the correct class of all 34 subjects in group identification using leave-one-out cross-validation and two-thirds majority voting.
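The window-level pipeline described in item 1 above (discard windows with SQI < 0.95, then summarize each per-window feature with a fixed set of statistics) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name its 12 statistics, so the five used here, the function names, and the toy numbers are all hypothetical. The point is the arithmetic: F per-window features summarized by S statistics yield F × S recording-level features, which is how 150 × 12 gives the 1,800 features mentioned in the abstract.

```python
import statistics
from typing import Callable, Dict, List

# Hypothetical set of summary statistics (the source uses 12, unnamed).
SUMMARY_STATS: Dict[str, Callable[[List[float]], float]] = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "pstdev": statistics.pstdev,
    "min": min,
    "max": max,
}


def keep_good_windows(
    windows: List[List[float]], sqi: List[float], threshold: float = 0.95
) -> List[List[float]]:
    """Drop windows whose signal quality index is below the threshold."""
    return [w for w, q in zip(windows, sqi) if q >= threshold]


def summarize_recording(windows: List[List[float]]) -> Dict[str, float]:
    """Collapse per-window feature vectors into recording-level features.

    windows[i][j] is feature j computed on window i. The result has
    n_features * len(SUMMARY_STATS) entries.
    """
    n_features = len(windows[0])
    summary: Dict[str, float] = {}
    for j in range(n_features):
        column = [w[j] for w in windows]  # feature j across all windows
        for name, stat in SUMMARY_STATS.items():
            summary[f"f{j}_{name}"] = stat(column)
    return summary


# Toy, made-up data: 4 windows x 2 features, with one low-quality window.
windows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [100.0, 0.0]]
sqi = [0.99, 0.97, 0.95, 0.50]

good = keep_good_windows(windows, sqi)
summary = summarize_recording(good)
```

Summarizing per recording gives every recording a feature vector of the same length regardless of how many windows survive the quality filter, which is what a downstream classifier with a fixed input size needs.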