

Title: Machine learning identifies abnormal Ca2+ transients in human induced pluripotent stem cell-derived cardiomyocytes
Abstract

Human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) provide an excellent platform for potential clinical and research applications. Identifying abnormal Ca2+ transients is crucial for evaluating cardiomyocyte function, but it requires labor-intensive manual effort. Therefore, we develop an analytical pipeline for automatic assessment of Ca2+ transient abnormality by employing advanced machine learning methods together with an Analytical Algorithm. First, we adapt an existing Analytical Algorithm to identify Ca2+ transient peaks and determine peak abnormality based on quantified peak characteristics. Second, we train a peak-level Support Vector Machine (SVM) classifier, using human-expert assessment of peak abnormality as the outcome and profiled peak variables as predictive features. Third, we train a separate cell-level SVM classifier, using human-expert assessment of cell abnormality as the outcome and quantified cell-level variables as predictive features. This cell-level SVM classifier can be used to assess additional Ca2+ transient signals. Applying this pipeline to our Ca2+ transient data, we trained a cell-level SVM classifier using 200 cells as training data, then tested its accuracy on an independent dataset of 54 cells. We obtained 88% training accuracy and 87% test accuracy. Further, we provide a free R package that implements our pipeline for high-throughput CM Ca2+ analysis.
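The data flow described in the abstract (detect peaks, quantify each peak, then summarize peaks into cell-level variables) can be illustrated with a minimal Python sketch. It is not the authors' Analytical Algorithm or their R package: the function names, the local-maximum peak rule, the `min_height` parameter, and the chosen features are all assumptions made for illustration, and the SVM classifiers themselves are left out. The sketch only shows how peak-level measurements might be rolled up into the kind of cell-level feature vector a classifier would consume.

```python
from statistics import mean, pstdev

def find_peaks(trace, min_height):
    """Return indices of local maxima at or above min_height (hypothetical rule)."""
    return [
        i for i in range(1, len(trace) - 1)
        if trace[i] >= min_height and trace[i - 1] < trace[i] >= trace[i + 1]
    ]

def peak_features(trace, peaks):
    """Per-peak variables: amplitude above the trace minimum, interval to previous peak."""
    baseline = min(trace)
    feats = []
    for k, i in enumerate(peaks):
        interval = i - peaks[k - 1] if k > 0 else None  # first peak has no predecessor
        feats.append({"index": i, "amplitude": trace[i] - baseline, "interval": interval})
    return feats

def cell_features(feats):
    """Aggregate peak-level variables into one cell-level feature dictionary."""
    amps = [f["amplitude"] for f in feats]
    gaps = [f["interval"] for f in feats if f["interval"] is not None]
    return {
        "n_peaks": len(feats),
        "mean_amplitude": mean(amps) if amps else 0.0,
        "amplitude_sd": pstdev(amps) if amps else 0.0,
        "interval_sd": pstdev(gaps) if gaps else 0.0,
    }

# Toy trace: two full-height transients followed by one weak transient.
trace = [0, 1, 5, 1, 0, 1, 5, 1, 0, 1, 2, 1, 0]
peaks = find_peaks(trace, min_height=1.5)
summary = cell_features(peak_features(trace, peaks))
```

In this toy trace the weak third transient raises `amplitude_sd` while the regular spacing keeps `interval_sd` at zero, which is the sort of contrast a cell-level classifier could learn from.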

 
Award ID(s):
1926387
NSF-PAR ID:
10308455
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
10
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Introduction: Alzheimer’s disease (AD) causes progressive irreversible cognitive decline and is the leading cause of dementia. Therefore, a timely diagnosis is imperative to maximize neurological preservation. However, current treatments are either too costly or limited in availability. In this project, we explored using retinal vasculature as a potential biomarker for early AD diagnosis. This project focuses on stage 3 of a three-stage modular machine learning pipeline, which consisted of image quality selection, vessel map generation, and classification [1]. The previous model used only a support vector machine (SVM) to classify AD labels, which limited its accuracy to 82%. In this project, random forest and gradient boosting were added and, along with SVM, combined into an ensemble classifier, raising the classification accuracy to 89%. Materials and Methods: Subjects classified as AD were those diagnosed with dementia in “Dementia Outcome: Alzheimer’s disease” from the UK Biobank Electronic Health Records. Five control groups were chosen with a 5:1 ratio of control to AD patients, where the control patients had the same age, gender, and eye side image as the AD patient. In total, 122 vessel images from each group (AD and control) were used. The vessel maps were segmented from fundus images through U-net. Feature selection by t-test, with a p-value threshold of 0.01, was first performed on the training folds, and the selected features were fed into the classifiers. Next, 20 repetitions of 5-fold cross validation were performed, with the hyperparameters tuned solely on the training data. An ensemble classifier consisting of SVM, gradient boosting tree, and random forests was built; the final prediction was made through majority voting and evaluated on the test set.
    Results and Discussion: Through ensemble classification, accuracy increased by 4-12% relative to the individual classifiers, precision by 9-15%, sensitivity by 2-9%, specificity by at least 9-16%, and F1 score by 7-12%. Conclusions: Overall, a relatively high classification accuracy was achieved using machine learning ensemble classification with SVM, random forest, and gradient boosting. Although the results are very promising, a limitation of this study is that the need for images of sufficient quality reduced the number of control parameters that could be implemented. However, through retinal vasculature analysis, this project shows machine learning’s high potential to be an efficient, more cost-effective alternative for diagnosing Alzheimer’s disease. Clinical Application: Using machine learning for AD diagnosis through retinal images will make screening available to a broader population by being more accessible and cost-efficient. Mobile device based screening can also be enabled at primary screening in resource-deprived regions. It can provide a pathway for future understanding of the association between biomarkers in the eye and brain.
  2. Background

    CD8+ T cells in pancreatic ductal adenocarcinoma (PDAC) are closely related to the prognosis and treatment response of patients. Accurate preoperative assessment of CD8+ T‐cell expression can better identify the population benefitting from immunotherapy.

    Purpose

    To develop and validate a machine learning classifier based on noncontrast magnetic resonance imaging (MRI) for the preoperative prediction of CD8+ T‐cell expression in patients with PDAC.

    Study Type

    Retrospective cohort study.

    Population

    Overall, 114 patients with PDAC undergoing MR scan and surgical resection; 97 and 47 patients in the training and validation cohorts.

    Field Strength/Sequence

    3 T. Breath‐hold single‐shot fast‐spin echo T2‐weighted sequence and noncontrast T1‐weighted fat‐suppressed sequences.

    Assessment

    CD8+ T‐cell expression was quantified using immunohistochemistry. For each patient, 2232 radiomics features were extracted from noncontrast T1‐ and T2‐weighted images and reduced using the Wilcoxon rank‐sum test and the least absolute shrinkage and selection operator method. Linear discriminant analysis was used to construct radiomics and mixed models. Model performance was determined by its discriminative ability, calibration, and clinical utility.

    Statistical Tests

    Kaplan–Meier estimates, Student's t‐test, the Kruskal–Wallis H test, the chi‐square test, receiver operating characteristic curve analysis, and decision curve analysis.

    Results

    A log‐rank test showed that the survival duration in the CD8‐high group (25.51 months) was significantly longer than that in the CD8‐low group (22.92 months). The mixed model included all MRI characteristics and 13 selected radiomics features, and the area under the curve (AUC) was 0.89 (95% confidence interval [CI], 0.77–0.92) and 0.69 (95% CI, 0.53–0.82) in the training and validation cohorts. The radiomics model included 13 radiomics features, which showed good discrimination in the training cohort (AUC, 0.85; 95% CI, 0.77–0.92) and the validation cohort (AUC, 0.76; 95% CI, 0.61–0.87).

    Data Conclusions

    This study developed a noncontrast MRI‐based radiomics model that can preoperatively determine CD8+ T‐cell expression in patients with PDAC and may potentially aid immunotherapy planning.

    Evidence Level

    5

    Technical Efficacy

    Stage 2

     
  3. Abstract

    Cellular automata (CA) are important tools that provide insight into urbanization dynamics and possible future patterns. The calibration process is the core theme of these models. This study compares the performance of two common machine‐learning classifiers, random forest (RF) and support vector machines (SVM), in calibrating CA. It focuses on the sensitivity analysis of the sample size and the number of input variables for each classifier. We applied the models to the Wallonia region (Belgium) as a case study to demonstrate the performance of each classifier. The results highlight that RF produces a land‐use pattern that simulates the observed pattern more precisely than SVM, especially with a low sample size, which is important for study areas with low levels of land‐use change. Although zoning information notably enhances the accuracy of SVM‐based probability maps, zoning only marginally influences the RF‐derived probability maps. In the case of the SVM, the CA model did not significantly improve with increased sample size: the 5,000 sample size performed better than the 15,000 sample size. The RF‐driven CA performed best with a large sample size and with zoning information excluded.

     
  4. Abstract

    Systems neuroscience is still mainly a neuronal field, despite the plethora of evidence supporting the fact that astrocytes modulate local neural circuits, networks, and complex behaviors. In this article, we sought to identify which types of studies are necessary to establish whether astrocytes, beyond their well‐documented homeostatic and metabolic functions, perform computations implementing mathematical algorithms that sub‐serve coding and higher‐brain functions. First, we reviewed Systems‐like studies that include astrocytes in order to identify computational operations that these cells may perform, using Ca2+ transients as their encoding language. The analysis suggests that astrocytes may carry out canonical computations on a time scale of subseconds to seconds in sensory processing, neuromodulation, brain state, memory formation, fear, and complex homeostatic reflexes. Next, we propose a list of actions to gain insight into the outstanding question of which variables are encoded by such computations. The application of statistical analyses based on machine learning, such as dimensionality reduction and decoding in the context of complex behaviors, and their combination with connectomics of astrocyte–neuronal circuits are, in our view, fundamental undertakings. We also discuss technical and analytical approaches to study neuronal and astrocytic populations simultaneously, and the inclusion of astrocytes in advanced modeling of neural circuits, as well as in theories currently under exploration such as predictive coding and energy‐efficient coding. Clarifying the relationship between astrocytic Ca2+ and brain coding may represent a leap forward toward novel approaches in the study of astrocytes in health and disease.

     
  5. Traumatic brain injury (TBI) is a massive public health problem worldwide. Accurate and fast automatic brain hematoma segmentation is important for TBI diagnosis, treatment, and outcome prediction. In this study, we developed a fully automated system to detect and segment hematoma regions in head Computed Tomography (CT) images of patients with acute TBI. We first over-segmented brain images into superpixels and then extracted statistical and textural features to capture the characteristics of the superpixels. To overcome the shortage of annotated data, an uncertainty-based active learning strategy was designed to adaptively and iteratively select the most informative unlabeled data to be annotated for training a Support Vector Machine (SVM) classifier. Finally, the coarse segmentation from the SVM classifier was incorporated into an active contour model to improve the accuracy of the segmentation. In our experiments, the proposed active learning strategy achieved a comparable result with 5 times fewer labeled data than regular machine learning. Our proposed automatic hematoma segmentation system achieved an average Dice coefficient of 0.60 on our dataset, in which patients come from multiple health centers and have multiple levels of injury. Our results show that the proposed method can effectively overcome the challenge of a limited and highly varied dataset.
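Two quantities mentioned in the hematoma segmentation entry above, the Dice coefficient and uncertainty-based sample selection, can be made concrete with a short Python sketch. The Dice formula, 2|A∩B| / (|A| + |B|), is standard. The selection rule is an assumption: the abstract says only that the strategy is "uncertainty-based", so the sketch uses one common choice, picking the samples whose classifier decision score lies closest to zero. The function names and toy values are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A∩B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total

def most_uncertain(scores, k):
    """Indices of the k samples whose decision score is closest to 0.

    Assumption: 'uncertainty' is measured as distance to the decision
    boundary; the source does not specify its measure.
    """
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]))[:k]

# Toy masks: 3 predicted pixels, 3 true pixels, 2 in common.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
dice = dice_coefficient(pred, truth)

# Toy decision scores for five unlabeled superpixels.
picked = most_uncertain([2.1, -0.2, 0.05, -1.7, 0.6], k=2)
```

In an active learning loop, the indices returned by `most_uncertain` would be sent for annotation, the classifier retrained, and the selection repeated.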