
This content will become publicly available on October 6, 2022

Title: Ensemble Machine Learning for Alzheimer’s disease Classification from Retinal Vasculature
Introduction: Alzheimer’s disease (AD) causes progressive, irreversible cognitive decline and is the leading cause of dementia. Therefore, a timely diagnosis is imperative to maximize neurological preservation. However, current treatments are either too costly or limited in availability. In this project, we explored using retinal vasculature as a potential biomarker for early AD diagnosis. This project focuses on stage 3 of a three-stage modular machine learning pipeline that consists of image quality selection, vessel map generation, and classification [1]. The previous model used only a support vector machine (SVM) to classify AD labels, which limited its accuracy to 82%. In this project, random forest and gradient boosting were added and, along with SVM, combined into an ensemble classifier, raising the classification accuracy to 89%.
Materials and Methods: Subjects classified as AD were those who were diagnosed with dementia in “Dementia Outcome: Alzheimer’s disease” from the UK Biobank Electronic Health Records. Five control groups were chosen with a 5:1 ratio of control to AD patients, where the control patients had the same age, gender, and eye side image as the AD patient. In total, 122 vessel images from each group (AD and control) were used. The vessel maps were then segmented from fundus images through U-Net. Feature selection by t-test, with a p-value threshold of 0.01, was first performed on the training folds, and the selected features were fed into the classifiers. Next, 20 repetitions of 5-fold cross-validation were performed, with the hyperparameters tuned solely on the training data. An ensemble classifier consisting of SVM, gradient boosting tree, and random forest was built, and the final prediction was made through majority voting and evaluated on the test set.
Results and Discussion: Through ensemble classification, accuracy increased by 4-12% relative to the individual classifiers, precision by 9-15%, sensitivity by 2-9%, specificity by at least 9-16%, and F1 score by 7-12%.
Conclusions: Overall, a relatively high classification accuracy was achieved using machine learning ensemble classification with SVM, random forest, and gradient boosting. Although the results are very promising, a limitation of this study is that requiring images of sufficient quality reduced the number of control parameters that could be implemented. However, through retinal vasculature analysis, this project shows machine learning’s high potential to be an efficient, more cost-effective alternative for diagnosing Alzheimer’s disease.
Clinical Application: Using machine learning for AD diagnosis through retinal images will make screening available to a broader population by being more accessible and cost-efficient. Mobile-device-based screening can also be enabled at primary screening in resource-deprived regions. It can provide a pathway for future understanding of the association between biomarkers in the eye and brain.
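The majority-voting step and the metrics named in the abstract (accuracy, precision, sensitivity, specificity, F1 score) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `majority_vote` and `binary_metrics` are hypothetical, and the label vectors are made-up toy data rather than results from the study.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-sample labels from several classifiers by majority vote.

    With three voters and binary labels a tie cannot occur.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Toy data (hypothetical): 1 = AD, 0 = control.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
svm_pred = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]  # stand-in for SVM output
rf_pred = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]   # stand-in for random forest output
gb_pred = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0]   # stand-in for gradient boosting output

ensemble_pred = majority_vote(svm_pred, rf_pred, gb_pred)
ensemble_scores = binary_metrics(y_true, ensemble_pred)
svm_scores = binary_metrics(y_true, svm_pred)
```

In this toy case each individual classifier makes three errors in different places, so the vote corrects most of them. That is the usual motivation for combining classifiers whose mistakes are not strongly correlated.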
Journal Name:
Biomedical Engineering Society Annual Meeting
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: Alzheimer's disease is the leading cause of dementia. The long progression period in Alzheimer's disease provides a possibility for patients to get early treatment by having routine screenings. However, current clinical diagnostic imaging tools do not meet the specific requirements for screening procedures due to high cost and limited availability. In this work, we took the initiative to evaluate the retina, especially the retinal vasculature, as an alternative for screening for dementia caused by Alzheimer's disease. Highly modular machine learning techniques were employed throughout the whole pipeline. Utilizing data from the UK Biobank, the pipeline achieved an average classification accuracy of 82.44%. Besides the high classification accuracy, we also added a saliency analysis to strengthen this pipeline's interpretability. The saliency analysis indicated that within retinal images, small vessels carry more information for diagnosing Alzheimer's disease, which aligns with related studies.
  2. Purpose: Parkinson’s Disease (PD) is the second most common form of neural degeneration and is defined by the decay of dopaminergic cells in the substantia nigra. Under the current standard, PD is diagnosed only once 80% of dopaminergic cells have decayed. The degradation of these cells has been shown to cause thinning of the retina walls and retina microvasculature. This work serves to find machine learning techniques to provide PD diagnosis using non-invasive fundus eye images. Materials and Methods: Two age- and gender-matched datasets were constructed using data from the UK Biobank (UKB) and data collected at the University of Florida (UF). The first dataset consists of 476 fundus eye images, 238 CN and 238 PD, sourced entirely from the UKB database. The second dataset, UF-UKB, consists of 100 images, 28 CN and 72 PD, collected at UF and 44 CN images from UKB. A second set of datasets, UKB-Green and UF-UKB-Green, was created using the green color channels to improve vessel segmentation. Vessel segmentation was performed using a U-Net segmentation network. The vessel maps served as inputs to SVM classifying networks. Saliency maps were created to assess areas of interest for the networks. Results: The top-performing SVM networks for the UKB and UKB-Green datasets were the sigmoid SVM networks, which achieved accuracies of .698 and .719, respectively. Meanwhile, the top-performing networks for the UF-UKB and UF-UKB-Green datasets were the linear SVM networks, which achieved accuracies of .821 and .857, respectively. The saliency maps indicate that the different networks focused on different vessel structures, with the most successful networks focusing more on smaller vessels. Conclusion: The results indicate that the machine learning networks can classify PD based on retina vasculature, with the key features being smaller blood vessels. The proposed methods further support the idea that changes in brain physiology can be observed in the eye. Machine learning networks can be applied to clinically available data and still provide accurate predictions. Clinical Relevance: The work illustrates the feasibility of utilizing eye images as a potential method for diagnosing PD, as opposed to the current method of using motor symptoms.
  3. Bondi, Mark (Ed.)
    Background: Advantages of digital clock drawing metrics for dementia subtype classification need examination. Objective: To assess how well kinematic, time-based, and visuospatial features extracted from the digital Clock Drawing Test (dCDT) can classify a combined group of Alzheimer’s disease/Vascular Dementia patients versus healthy controls (HC), and classify dementia patients with Alzheimer’s disease (AD) versus vascular dementia (VaD). Methods: Healthy, community-dwelling control participants (n = 175), patients diagnosed clinically with Alzheimer’s disease (n = 29), and vascular dementia (n = 27) completed the dCDT to command and copy clock drawing conditions. Thirty-seven dCDT command and 37 copy dCDT features were extracted and used with Random Forest classification models. Results: When HC participants were compared to participants with dementia, optimal area under the curve was achieved using models that combined both command and copy dCDT features (AUC = 91.52%). Similarly, when AD versus VaD participants were compared, optimal area under the curve was achieved with models that combined both command and copy features (AUC = 76.94%). Subsequent follow-up analyses of a corpus of 10 variables of interest determined using a Gini Index found that groups could be dissociated based on kinematic, time-based, and visuospatial features. Conclusion: The dCDT is able to operationally define graphomotor output that cannot be measured using traditional paper and pencil test administration in older healthy controls and participants with dementia. These data suggest that kinematic, time-based, and visuospatial behavior obtained using the dCDT may provide additional neurocognitive biomarkers that may be able to identify and track dementia syndromes.
  4. Obeid, I. (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing the Temple University Digital Pathology Corpus (TUDP), an open source database of high-resolution images from scanned pathology samples [1], as part of its National Science Foundation-funded Major Research Instrumentation grant titled “MRI: High Performance Digital Pathology Using Big Data and Machine Learning” [2]. The long-term goal of this project is to release one million images. We have currently scanned over 100,000 images and are in the process of annotating breast tissue data for our first official corpus release, v1.0.0. This release contains 3,505 annotated images of breast tissue including 74 patients with cancerous diagnoses (out of a total of 296 patients). In this poster, we will present an analysis of this corpus and discuss the challenges we have faced in efficiently producing high quality annotations of breast tissue. It is well known that state-of-the-art algorithms in machine learning require vast amounts of data. Fields such as speech recognition [3], image recognition [4] and text processing [5] are able to deliver impressive performance with complex deep learning models because they have developed large corpora to support training of extremely high-dimensional models (e.g., billions of parameters). Other fields that do not have access to such data resources must rely on techniques in which existing models can be adapted to new datasets [6]. A preliminary version of this breast corpus release was tested in a pilot study using a baseline machine learning system, ResNet18 [7], that leverages several open-source Python tools. The pilot corpus was divided into three sets: train, development, and evaluation. Portions of these slides were manually annotated [1] using the nine labels in Table 1 [8] to identify five to ten examples of pathological features on each slide. 
Not every pathological feature is annotated, meaning excluded areas can include focuses particular to these labels that are not used for training. A summary of the number of patches within each label is given in Table 2. To maintain a balanced training set, 1,000 patches of each label were used to train the machine learning model. Throughout all sets, only annotated patches were involved in model development. The performance of this model in identifying all the patches in the evaluation set can be seen in the confusion matrix of classification accuracy in Table 3. The highest performing labels were background, 97% correct identification, and artifact, 76% correct identification. A correlation exists between labels with more than 6,000 development patches and accurate performance on the evaluation set. Additionally, these results indicated a need to further refine the annotation of invasive ductal carcinoma (“indc”), inflammation (“infl”), nonneoplastic features (“nneo”), normal (“norm”) and suspicious (“susp”). This pilot experiment motivated changes to the corpus that will be discussed in detail in this poster presentation. To increase the accuracy of the machine learning model, we modified how we addressed underperforming labels. One common source of error arose with how non-background labels were converted into patches. Large areas of background within other labels were isolated within a patch resulting in connective tissue misrepresenting a non-background label. In response, the annotation overlay margins were revised to exclude benign connective tissue in non-background labels. Corresponding patient reports and supporting immunohistochemical stains further guided annotation reviews. The microscopic diagnoses given by the primary pathologist in these reports detail the pathological findings within each tissue site, but not within each specific slide. 
The microscopic diagnoses informed revisions specifically targeting annotated regions classified as cancerous, ensuring that the labels “indc” and “dcis” were used only in situations where a micropathologist diagnosed it as such. Further differentiation of cancerous and precancerous labels, as well as the location of their focus on a slide, could be accomplished with supplemental immunohistochemically (IHC) stained slides. When distinguishing whether a focus is a nonneoplastic feature versus a cancerous growth, pathologists employ antigen targeting stains to the tissue in question to confirm the diagnosis. For example, a nonneoplastic feature of usual ductal hyperplasia will display diffuse staining for cytokeratin 5 (CK5) and no diffuse staining for estrogen receptor (ER), while a cancerous growth of ductal carcinoma in situ will have negative or focally positive staining for CK5 and diffuse staining for ER [9]. Many tissue samples contain cancerous and non-cancerous features with morphological overlaps that cause variability between annotators. The informative fields IHC slides provide could play an integral role in machine model pathology diagnostics. Following the revisions made on all the annotations, a second experiment was run using ResNet18. Compared to the pilot study, an increase of model prediction accuracy was seen for the labels indc, infl, nneo, norm, and null. This increase is correlated with an increase in annotated area and annotation accuracy. Model performance in identifying the suspicious label decreased by 25% due to the decrease of 57% in the total annotated area described by this label. A summary of the model performance is given in Table 4, which shows the new prediction accuracy and the absolute change in error rate compared to Table 3. The breast tissue subset we are developing includes 3,505 annotated breast pathology slides from 296 patients. The average size of a scanned SVS file is 363 MB. The annotations are stored in an XML format. 
A CSV version of the annotation file is also available, which provides a flat, or simple, annotation that is easy for machine learning researchers to access and interface to their systems. Each patient is identified by an anonymized medical reference number. Within each patient’s directory, one or more sessions are identified, also anonymized to the first of the month in which the sample was taken. These sessions are broken into groupings of tissue taken on that date (in this case, breast tissue). A deidentified patient report stored as a flat text file is also available. Within these slides there are a total of 16,971 annotated regions, with an average of 4.84 annotations per slide. Among those annotations, 8,035 are non-cancerous (normal, background, null, and artifact), 6,222 are carcinogenic signs (inflammation, nonneoplastic, and suspicious), and 2,714 are cancerous labels (ductal carcinoma in situ and invasive ductal carcinoma). The individual patients are split up into three sets: train, development, and evaluation. Of the 74 cancerous patients, 20 were allotted to each of the development and evaluation sets, while the remaining 34 were allotted to train. The remaining 222 patients were split up to preserve the overall distribution of labels within the corpus. This was done in the hope of creating control sets for comparable studies. Overall, the development and evaluation sets each have 80 patients, while the training set has 136 patients. In a related component of this project, slides from the Fox Chase Cancer Center (FCCC) Biosample Repository ( -facility) are being digitized in addition to slides provided by Temple University Hospital. This data includes 18 different types of tissue, including approximately 38.5% urinary tissue and 16.5% gynecological tissue. These slides and the metadata provided with them are already anonymized and include diagnoses in a spreadsheet with sample and patient ID. 
We plan to release over 13,000 unannotated slides from the FCCC Corpus simultaneously with v1.0.0 of TUDP. Details of this release will also be discussed in this poster. Few digitally annotated databases of pathology samples like TUDP exist due to the extensive data collection and processing required. The breast corpus subset should be released by November 2021. By December 2021 we should also release the unannotated FCCC data. We are currently annotating urinary tract data as well. We expect to release about 5,600 processed TUH slides in this subset. We have an additional 53,000 unprocessed TUH slides digitized. Corpora of this size will stimulate the development of a new generation of deep learning technology. In clinical settings where resources are limited, an assistive diagnosis model could support pathologists’ workload and even help prioritize suspected cancerous cases.
ACKNOWLEDGMENTS This material is supported by the National Science Foundation under grants nos. CNS-1726188 and 1925494. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
REFERENCES
[1] N. Shawki et al., “The Temple University Digital Pathology Corpus,” in Signal Processing in Medicine and Biology: Emerging Trends in Research and Applications, 1st ed., I. Obeid, I. Selesnick, and J. Picone, Eds. New York City, New York, USA: Springer, 2020, pp. 67–104.
[2] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning.” Major Research Instrumentation (MRI), Division of Computer and Network Systems, Award No. 1726188, January 1, 2018 – December 31, 2021. https://www.
[3] A. Gulati et al., “Conformer: Convolution-augmented Transformer for Speech Recognition,” in Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), 2020, pp. 5036–5040.
[4] C.-J. Wu et al., “Machine Learning at Facebook: Understanding Inference at the Edge,” in Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2019, pp. 331–344.
[5] I. Caswell and B. Liang, “Recent Advances in Google Translate,” Google AI Blog: The latest from Google Research, 2020. [Online]. Available: [Accessed: 01-Aug-2021].
[6] V. Khalkhali, N. Shawki, V. Shah, M. Golmohammadi, I. Obeid, and J. Picone, “Low Latency Real-Time Seizure Detection Using Transfer Deep Learning,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2021, pp. 1–7. https://www.isip.
[7] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning,” Philadelphia, Pennsylvania, USA, 2020.
[8] I. Hunt, S. Husain, J. Simons, I. Obeid, and J. Picone, “Recent Advances in the Temple University Digital Pathology Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2019, pp. 1–4.
[9] A. P. Martinez, C. Cohen, K. Z. Hanley, and X. (Bill) Li, “Estrogen Receptor and Cytokeratin 5 Are Reliable Markers to Separate Usual Ductal Hyperplasia From Atypical Ductal Hyperplasia and Low-Grade Ductal Carcinoma In Situ,” Arch. Pathol. Lab. Med., vol. 140, no. 7, pp. 686–689, Apr. 2016.
  5. Flooding is one of the leading natural-disaster threats to human life and property, especially in densely populated urban areas. Rapid and precise extraction of the flooded areas is key to supporting emergency-response planning and providing damage assessment in both spatial and temporal measurements. Unmanned Aerial Vehicle (UAV) technology has recently been recognized as an efficient photogrammetry data acquisition platform to quickly deliver high-resolution imagery because of its cost-effectiveness, ability to fly at lower altitudes, and ability to enter a hazardous area. Different image classification methods, including SVM (Support Vector Machine), have been used for flood extent mapping. In recent years, there has been a significant improvement in remote sensing image classification using Convolutional Neural Networks (CNNs). CNNs have demonstrated excellent performance on various tasks including image classification, feature extraction, and segmentation. CNNs can learn features automatically from large datasets through the organization of multi-layers of neurons and have the ability to implement nonlinear decision functions. This study investigates the potential of CNN approaches to extract flooded areas from UAV imagery. A VGG-based fully convolutional network (FCN-16s) was used in this research. The model was fine-tuned, and k-fold cross-validation was applied to estimate the performance of the model on the new UAV imagery dataset. This approach allowed FCN-16s to be trained on datasets that contained only one hundred training samples, and resulted in a highly accurate classification. A confusion matrix was calculated to estimate the accuracy of the proposed method. The image segmentation results obtained from FCN-16s were compared with the results obtained from FCN-8s, FCN-32s, and SVMs. Experimental results showed that the FCNs could extract flooded areas from UAV images more precisely than traditional classifiers such as SVMs. The classification accuracy achieved by FCN-16s, FCN-8s, FCN-32s, and SVM for the water class was 97.52%, 97.8%, 94.20%, and 89%, respectively.
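The k-fold cross-validation and confusion-matrix evaluation mentioned in item 5 can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the value of k, so k = 5 is assumed here, the helper names `k_fold_indices` and `per_class_accuracy` are hypothetical, and the confusion matrix is made-up toy data rather than a result from the study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def per_class_accuracy(confusion):
    """Per-class accuracy, where confusion[i][j] counts samples of
    true class i that were predicted as class j."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]


# One hundred samples, as in the abstract; k = 5 is an assumption.
splits = list(k_fold_indices(100, 5))

# Toy two-class confusion matrix (hypothetical): rows = true class,
# columns = predicted class, ordered [water, non-water].
toy_confusion = [[90, 10],
                 [5, 95]]
toy_accuracy = per_class_accuracy(toy_confusion)
```

Each of the five splits holds out 20 samples for validation and trains on the other 80, so every sample is used for validation exactly once. With so few samples, this is one way to get a performance estimate without setting aside a fixed test set.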