Title: Can machine learning predict late seizures after intracerebral hemorrhages? Evidence from real-world data
Introduction: Intracerebral hemorrhage accounts for 15% of all strokes and is associated with a high risk of post-stroke epilepsy. However, there are no reliable methods to identify the patients at higher risk of developing seizures, despite the importance of such prediction for planning treatments, allocating resources, and advancing post-stroke seizure research. Existing risk models have limitations and have not taken advantage of readily available real-world data and artificial intelligence. This study aims to evaluate the performance of machine-learning-based models in predicting post-stroke seizures at 1 year and 5 years after an intracerebral hemorrhage in unselected patients across multiple healthcare organizations.
Design/methods: We identified patients with intracerebral hemorrhage (ICH) without a prior diagnosis of seizures from 2015 until inception (11/01/22) in the TriNetX Diamond Network, using the International Classification of Diseases, Tenth Revision (ICD-10) code I61 (I61.0, I61.1, I61.2, I61.3, I61.4, I61.5, I61.6, I61.8, and I61.9). The outcome of interest was any ICD-10 diagnosis of seizures (G40/G41) at 1 year and 5 years following the first recorded diagnosis of intracerebral hemorrhage. We applied a conventional logistic regression and a Light Gradient Boosted Machine (LGBM) algorithm, and model performance was assessed using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), the F1 statistic, accuracy, balanced accuracy, precision, and recall, with and without seizure medication use in the models.
Results: A total of 85,679 patients had an ICD-10 code of intracerebral hemorrhage and no prior diagnosis of seizures, constituting our study cohort. Seizures were present in 4.57% and 6.27% of patients within 1 and 5 years after ICH, respectively.
At 1 year, the AUROC, AUPRC, F1 statistic, accuracy, balanced accuracy, precision, and recall were respectively 0.7051 (standard error: 0.0132), 0.1143 (0.0068), 0.1479 (0.0055), 0.6708 (0.0076), 0.6491 (0.0114), 0.0839 (0.0032), and 0.6253 (0.0216). Corresponding metrics at 5 years were 0.694 (0.009), 0.1431 (0.0039), 0.1859 (0.0064), 0.6603 (0.0059), 0.6408 (0.0119), 0.1094 (0.0037) and 0.6186 (0.0264). These values indicate moderate discrimination (AUROC of about 0.70 at both horizons) but low precision: only about 8% of patients flagged as high risk at 1 year, and about 11% at 5 years, had a recorded seizure diagnosis.
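The reported metrics are linked by simple identities, so they can be cross-checked against each other. The sketch below is a reader's sanity check, not part of the study's analysis. It assumes that the cohort-level seizure rates (4.57% and 6.27%) approximate the outcome prevalence in the evaluation data, and that identities between averaged metrics hold approximately; the function and variable names are illustrative.

```python
# Cross-check of the reported metrics using standard identities.
# Assumption: cohort-level seizure rates stand in for the prevalence
# in the evaluation data, which the abstract does not state directly.

def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def specificity_from_balanced_accuracy(balanced_accuracy, recall):
    """Balanced accuracy = (recall + specificity) / 2, solved for specificity."""
    return 2 * balanced_accuracy - recall

def accuracy_from_rates(prevalence, recall, specificity):
    """Accuracy is the prevalence-weighted mix of recall and specificity."""
    return prevalence * recall + (1 - prevalence) * specificity

# Values copied from the abstract.
reported = {
    "1-year": dict(prevalence=0.0457, precision=0.0839, recall=0.6253,
                   balanced_accuracy=0.6491, f1=0.1479, accuracy=0.6708,
                   auprc=0.1143),
    "5-year": dict(prevalence=0.0627, precision=0.1094, recall=0.6186,
                   balanced_accuracy=0.6408, f1=0.1859, accuracy=0.6603,
                   auprc=0.1431),
}

derived = {}
for horizon, m in reported.items():
    spec = specificity_from_balanced_accuracy(m["balanced_accuracy"], m["recall"])
    derived[horizon] = {
        "f1": f1_from_precision_recall(m["precision"], m["recall"]),
        "specificity": spec,
        "accuracy": accuracy_from_rates(m["prevalence"], m["recall"], spec),
        # A classifier with no skill has an AUPRC close to the prevalence,
        # so this ratio expresses the gain over that baseline.
        "auprc_over_prevalence": m["auprc"] / m["prevalence"],
    }
    print(horizon, {k: round(v, 4) for k, v in derived[horizon].items()})
```

Under these assumptions the derived F1 and accuracy agree with the reported values to about three decimal places, and the AUPRC is roughly 2.3 to 2.5 times the outcome prevalence. Because accuracy is dominated by the majority class when the outcome is rare, the AUPRC and precision are the more informative figures here.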
Award ID(s):
2226025
PAR ID:
10638667
Author(s) / Creator(s):
; ; ; ; ; ; ;
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
Epilepsy & Behavior
ISSN:
1525-5069
Page Range / eLocation ID:
https://www.sciencedirect.com/science/article/pii/S1525505024002166
Subject(s) / Keyword(s):
Electronic Health Records Machine Learning epilepsy
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. To test the hypothesis that accuracy, discrimination, and precision in predicting postoperative complications improve when using both preoperative and intraoperative data input features versus preoperative data alone. Models that predict postoperative complications often ignore important intraoperative physiological changes. Incorporation of intraoperative physiological data may improve model performance. This retrospective cohort analysis included 52,529 inpatient surgeries at a single institution during a 5 year period. Random forest machine learning models in the validated MySurgeryRisk platform made patient-level predictions for three postoperative complications and mortality during hospital admission using electronic health record data and patient neighborhood characteristics. For each outcome, one model trained with preoperative data alone and one model trained with both preoperative and intraoperative data. Models were compared by accuracy, discrimination (expressed as AUROC), precision (expressed as AUPRC), and reclassification indices (NRI). Machine learning models incorporating both preoperative and intraoperative data had greater accuracy, discrimination, and precision than models using preoperative data alone for predicting all three postoperative complications (intensive care unit length of stay >48 hours, mechanical ventilation >48 hours, and neurological complications including delirium) and in-hospital mortality (accuracy: 88% vs. 77%, AUROC: 0.93 vs. 0.87, AUPRC: 0.21 vs. 0.15). Overall reclassification improvement was 2.9-10.0% for complications and 11.2% for in-hospital mortality. Incorporating both preoperative and intraoperative data significantly increased accuracy, discrimination, and precision for machine learning models predicting postoperative complications. 
  2. Abstract. Purpose. To investigate the relationship between spatial parotid dose and the risk of xerostomia in patients undergoing head-and-neck cancer radiotherapy, using machine learning (ML) methods. Methods. Prior to conducting voxel-based ML analysis of the spatial dose, two steps were taken: (1) The parotid dose was standardized through deformable image registration to a reference patient; (2) Bilateral parotid doses were regrouped into contralateral and ipsilateral portions depending on their proximity to the gross tumor target. Individual dose voxels were input into six commonly used ML models, which were tuned with ten-fold cross validation: random forest (RF), ridge regression (RR), support vector machine (SVM), extra trees (ET), k-nearest neighbor (kNN), and naïve Bayes (NB). Binary endpoints from 240 patients were used for model training and validation: 0 (N = 119) for xerostomia grades 0 or 1, and 1 (N = 121) for grades 2 or higher. Model performance was evaluated using multiple metrics, including accuracy, F1 score, area under the receiver operating characteristic curve (auROC), and area under the precision–recall curve (auPRC). Dose voxel importance was assessed to identify local dose patterns associated with xerostomia risk. Results. Four models, including RF, SVM, ET, and NB, yielded average auROCs and auPRCs greater than 0.60 from ten-fold cross-validation on the training data, except for a lower auROC from NB. The first three models, along with kNN, demonstrated higher accuracy and F1 scores. A bootstrapping analysis confirmed test uncertainty. Voxel importance analysis from kNN indicated that the posterior portion of the ipsilateral gland was more predictive of xerostomia, but no clear patterns were identified from the other models. Conclusion. Voxel doses as predictors of xerostomia were confirmed with some ML classifiers, but no clear regional patterns could be established among these classifiers, except kNN. 
Further research with a larger patient dataset is needed to identify conclusive patterns. 
  3. Ischemic stroke is the most common type of stroke and thrombolytic therapy is the only approved treatment. However, current thrombolytic therapy with tissue plasminogen activator (tPA) is often hampered by the increased risk of hemorrhage. Plasmin, a direct fibrinolytic, has a significantly superior hemostatic safety profile; however, if injected intravenously it becomes rapidly inactivated by anti-plasmin. Nanoformulations have been shown to increase drug stability and half-life and hence could be applied to increase the plasmin therapeutic efficacy. Here in this paper, we report a novel heparin and arginine-based plasmin nanoformulation that exhibits increased plasmin stability and efficacy. In vitro studies revealed significant plasmin stability in the presence of anti-plasmin and efficient fibrinolytic activity. In addition, these particles showed no significant toxicity or oxidative stress effects in human brain microvascular endothelial cells, and no significant blood brain barrier permeability. Further, in a mouse photothrombotic stroke model, plasmin nanoparticles exhibited significant efficacy in reducing stroke volume without overt intracerebral hemorrhage (ICH) compared to free plasmin treatment. The study shows the potential of a plasmin nanoformulation in ischemic stroke therapy. 
  4. Epileptic seizures are dangerous. They render patients unconscious and can lead to death within seconds of onset. There is, therefore, the need for a very fast and accurate seizure detection mechanism. Kriging methods have been used extensively in geostatistics for spatial prediction and are known for very high accuracy. By modeling the brain as a spatial map, we demonstrate the effectiveness of Kriging Methods for efficient seizure detection in an edge computing paradigm. We explore three different types of Kriging - Simple Kriging, Ordinary Kriging and Universal Kriging. Results from various experiments with electroencephalogram (EEG) signals of both healthy and diseased patients show that all three Kriging methods have good performance in terms of accuracy, sensitivity and detection latency. However, Simple Kriging emerged as the slight favorite for seizure detection with a mean detection latency of 0.81 sec, an accuracy of 97.50%, a sensitivity of 94.74% and a perfect specificity. Simple Kriging is at least 5% better than Ordinary Kriging and Universal Kriging when evaluated at 68.2% confidence interval. The results obtained in this paper compare favorably with other seizure detection models in the literature. 
  5.
    Objective: To demonstrate that combining automatic processing of EEG data using high performance machine learning algorithms with manual review by expert annotators can quickly identify subjects with prolonged seizures. Background: Prolonged seizures are markers of seizure severity, risk of transformation into status epilepticus, and medical morbidity. Early recognition of prolonged seizures permits intervention and reduces morbidity. Design/Methods: We triaged the TUH EEG Corpus, an open source database of EEGs, by running a state-of-the-art hybrid LSTM-based deep learning system. Then, we postprocessed the output to identify high confidence hypotheses for seizures that were greater than three minutes in duration. Results: The triaging method selected 25 subjects for further review. 17 subjects had seizures; only 5 met criteria for seizures greater than 3 minutes. 11 subjects did not have a prior diagnosis of epilepsy. Among these, 63% had acute respiratory failure and 36% had cardiac arrest leading to seizures secondary to anoxic brain injury. 18 (72%) EEGs were obtained in long-term monitoring (LTM), 1 (4%) in the epilepsy monitoring unit (EMU), and 6 (24%) as a routine EEG (rEEG). 72.2% of seizures in LTM were identified correctly versus 66.7% in rEEGs. Of the 9 subjects who were deceased, 7 (78%) had been on LTM. The seizure detection algorithm misidentified seizures in 7 subjects (28%). A total of 22 (88%) subjects had some ictal pattern. Patterns mistaken for seizure activity included muscle artifact, generalized periodic discharges, generalized spike-and-wave, triphasic waves, and interestingly, an EEG recording captured during CPR. Conclusions: This hybrid approach, which combines state-of-the-art machine learning seizure detection software with human annotation, successfully identified prolonged seizures in 72% of subjects; 88% had ictal patterns. 
Prolonged seizures were more common in LTM subjects than in EMU subjects and were associated with acute cardiac or pulmonary insult. 