<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>GIRUS-net: A Multimodal Deep Learning Model Identifying Imaging and Genetic Biomarkers Linked to Alzheimer’s Disease Severity</title></titleStmt>
			<publicationStmt>
				<publisher>IEEE</publisher>
				<date>07/24/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10492646</idno>
					<idno type="doi">10.1109/EMBC40787.2023.10341000</idno>
					<title level='j'>Proceedings of the annual Conference on Engineering in Medicine and Biology</title>
<idno>0589-1019</idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Sarah Wu</author><author>Archana Venkataraman</author><author>Sayan Ghosal</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[We introduce an explainable deep neural architecture that combines brain structure with genetic influence to improve disease severity prediction in Alzheimer's disease. Our framework consists of an encoder, a decoder, and a rank-consistent ordinal regression module. The encoder projects neural imaging and genetics data into a low-dimensional latent space regularized by the decoder. The ordinal regression module guides the feature embedding process to find discriminative patterns representative of disease severity. We also add a learnable dropout layer that learns feature importance and extracts explainable biomarkers from the data. We evaluate our model using structural MRI (sMRI) and Single Nucleotide Polymorphism (SNP) data provided by the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. In 2-class severity classification, our model has a median F-score of 0.86 (baseline median F-score range: 0.57-0.81). In 3-class classification, our model's median F-score is 0.50 (baseline range: 0.17-0.41). In 4-class classification, our model's median F-score is 0.40 (baseline range: 0.14-0.39). We demonstrate that our model provides improved disease diagnosis alongside sparse and clinically relevant biomarkers. Clinical relevance: This study provides a deep-learning model that can predict Alzheimer's disease severity levels while identifying consistent and clinically relevant biomarkers.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Alzheimer's disease (AD) is a neurodegenerative disorder common in the elderly population <ref type="bibr">[1]</ref>. Patients develop mild cognitive impairment, which progresses to dementia. AD is characterized by gradual loss of brain cells, also known as brain atrophy, which can be detected through structural magnetic resonance imaging (sMRI) <ref type="bibr">[2]</ref>. Genetic factors also play a significant role in disease development <ref type="bibr">[3]</ref> and progression <ref type="bibr">[4]</ref>. Genetic risk factors, such as single nucleotide polymorphisms (SNPs), help pinpoint mutations in the DNA that influence the pathophysiology <ref type="bibr">[5]</ref> of AD. Most research seeks to disentangle AD mechanisms by studying the neural and genetic factors separately. However, separating the data modalities may provide an incomplete picture of the underlying biological process <ref type="bibr">[6]</ref>.</p><note place="foot">*Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI provided data but did not participate in analysis or writing of this report.</note><p>Imaging-genetics studies integrate neuroimaging and genetic data to improve disease prediction <ref type="bibr">[7]</ref>. Imaging features are often derived from structural and functional MRI (s/fMRI), and genetic variants are typically captured by SNPs. Data-driven imaging-genetics methods can be grouped into four main categories: simple regression, nonlinear methods, correlation methods, and deep learning approaches. The first category uses linear models like SVM <ref type="bibr">[8]</ref>, <ref type="bibr">[9]</ref> and Logistic Regression <ref type="bibr">[10]</ref> for AD classification. 
However, these methods typically train on a single data modality and fail to discover interactions between modalities. The second category leverages gradient boosting <ref type="bibr">[11]</ref>, <ref type="bibr">[12]</ref> and decision trees <ref type="bibr">[13]</ref> for multi-class classification. These models can encode nonlinear interactions between the features, but they fail to disentangle the neuroimaging and genetic pathways linked to AD. The third category uses correlation analysis to identify associations between genetic variations and quantitative traits <ref type="bibr">[14]</ref>, <ref type="bibr">[15]</ref>, <ref type="bibr">[16]</ref>. However, these models do not incorporate clinical diagnosis directly. Thus, the biomarkers obtained through such analysis may not align with the predictive group differences. The last category relies on deep learning architectures to combine high-dimensional, structured data for imaging-genetic analysis <ref type="bibr">[17]</ref>. Deep learning frameworks are highly complex and often lack model explainability. Recent work such as the Genetic and Multimodal Imaging data using Neural-network Designs (G-MIND) framework can identify predictive biomarkers of a disease from imaging and genetic modalities <ref type="bibr">[18]</ref>. However, G-MIND performs classification and cannot accommodate the progression of AD severity.</p><p>We introduce a novel framework to combine Genetic and Imaging data using Rank-consistent mUltimodal multiclaSs network (GIRUS-net) that identifies neuroimaging and genetics biomarkers for AD diagnosis <ref type="bibr">[18]</ref>. This work extends the G-MIND model and uses a rank-consistent ordinal regression module <ref type="bibr">[19]</ref> to track disease severity from imaging and genetics data. Thus, GIRUS-net can identify biomarkers that are associated with the progressive representation of disease severity. GIRUS-net consists of an autoencoder coupled with an ordinal regression module. 
The encoders combine the imaging and genetics features into a latent space and pass the resulting embedding through the ordinal regression module for disease severity prediction. We introduce a binary mask with a binary concrete prior <ref type="bibr">[20]</ref> as a feature selection layer for biomarker detection. On a population study of AD, GIRUS-net yields sparser and more consistent biomarkers than baseline methods, while maintaining competitive classification performance. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. METHODS</head><p>Figure <ref type="figure">1</ref> illustrates our GIRUS-net framework. The inputs are the genotype data $g_n \in \mathbb{R}^{G \times 1}$ for each subject $n$ and the corresponding imaging features $i_n \in \mathbb{R}^{I \times 1}$. The class labels $y_n \in \{1, 2, 3, 4\}$ correspond to different severity levels of AD: cognitive normal (CN), early mild cognitive impairment (EMCI), late mild cognitive impairment (LMCI), and mild Alzheimer's disease (AD). The diagnosis (phenotype) $y_n$ is known during training but not during testing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Encoder-Decoder Framework</head><p>We jointly model the imaging and genetic data using an autoencoder, coupled with an ordinal regression module.</p><p>a) Bayesian Feature Selection: The first layer of the encoder incorporates Bayesian feature selection using a learnable dropout layer. Unlike Bernoulli dropout, where the underlying probability is fixed a priori, here we parameterize the dropout layer using the Gumbel-Softmax distribution. The reparameterization trick relaxes the standard binary dropout to a continuous representation, which allows us to learn the posterior probability of the binary vectors. Mathematically, the subject-specific dropout masks $z^m_n$ are generated as:</p><p>(1) where $m$ indexes the data modality, $p^m$ represents the underlying importance map, $t$ captures the extent of relaxation from Bernoulli dropout, and $u^i_n$ is a random vector sampled from $\mathrm{Uniform}(0, 1)$ for stochasticity. During each forward pass and for every subject $n$, the network randomly samples $z^i_n$ for imaging and $z^g_n$ for genetics, respectively. The continuous representation of the dropout masks allows us to learn the underlying importance maps during training. We also incorporate a KL divergence loss $\mathrm{KL}(\mathrm{Ber}(q) \,\|\, \mathrm{Ber}(p^m))$ to enforce sparsity in $p^m$. As seen in Eq. ( <ref type="formula">1</ref>), higher values in $p^m$ correspond to the most frequently selected features, which can be identified as potential biomarkers.</p></div>
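<div xmlns="http://www.tei-c.org/ns/1.0"><p>The learnable dropout layer described above can be illustrated with a minimal sketch. It assumes the standard binary concrete relaxation, in which a mask entry is the sigmoid of the importance logit plus logistic noise, divided by the temperature $t$; this form is an assumption made for illustration and is not copied from Eq. (1). The function names are hypothetical.</p>
<p><![CDATA[
```python
import math
import random


def sample_binary_concrete_mask(p, t, rng):
    """Draw one relaxed dropout mask from importance probabilities p.

    Assumed form (standard binary concrete relaxation, not taken from the paper):
        z = sigmoid((log p - log(1 - p) + log u - log(1 - u)) / t),  u ~ Uniform(0, 1)
    """
    eps = 1e-8  # keeps every logarithm finite
    mask = []
    for p_j in p:
        u = min(max(rng.random(), eps), 1.0 - eps)
        p_j = min(max(p_j, eps), 1.0 - eps)
        logit = math.log(p_j) - math.log(1.0 - p_j) + math.log(u) - math.log(1.0 - u)
        x = logit / t
        # Numerically stable sigmoid.
        if x >= 0:
            z = 1.0 / (1.0 + math.exp(-x))
        else:
            e = math.exp(x)
            z = e / (1.0 + e)
        mask.append(z)
    return mask


def bernoulli_kl(q, p):
    """KL(Ber(q) || Ber(p)) for scalar probabilities strictly between 0 and 1."""
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))


rng = random.Random(0)
mask = sample_binary_concrete_mask([0.9, 0.5, 0.01], t=0.1, rng=rng)
sparsity_penalty = sum(bernoulli_kl(0.0001, p_j) for p_j in [0.9, 0.5, 0.01])
```
]]></p>
<p>With a small temperature such as $t = 0.1$, most sampled entries sit close to 0 or 1, so the mask behaves almost like Bernoulli dropout while remaining differentiable with respect to the importance probabilities. The KL term grows as the importance probabilities move away from the small prior $q$, which is what pushes the importance map toward sparsity.</p></div>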
<div xmlns="http://www.tei-c.org/ns/1.0"><head>b) Multimodal Latent Fusion and Decoder:</head><p>The imaging and genetic features are passed through the learnable dropout layers and then through two separate encoders to obtain latent embeddings. These embeddings are then fused to leverage the common structure shared between both modalities. Mathematically, the fusion operation is</p><p>where $E_i(\cdot)$ and $E_g(\cdot)$ denote the encoding operations for imaging and genetics. After fusion, the latent vectors $\ell_n$ are passed through the decoders $D_i(\cdot)$ and $D_g(\cdot)$ to reconstruct the imaging and genetic data, ensuring that information is preserved during encoding. The reconstruction loss is an $L_2$ loss between the inputs and the reconstructed outputs:</p><p>(3)</p><p>where $B$ is the batch size, and $\lambda_1$, $\lambda_2$ capture the relative contributions of the loss terms.</p></div>
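<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of the reconstruction loss described above, the sketch below computes a weighted squared $L_2$ error for both modalities, averaged over the batch. The exact form of Eq. (3) is not reproduced here: the averaging over $B$ and the pairing of $\lambda_1$ with imaging and $\lambda_2$ with genetics are assumptions, and the function names are hypothetical.</p>
<p><![CDATA[
```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reconstruction_loss(imaging, imaging_hat, genetics, genetics_hat, lam1, lam2):
    """Weighted reconstruction error averaged over the batch.

    Assumption: lam1 weights the imaging term and lam2 the genetics term.
    Each argument is a list of per-subject feature vectors.
    """
    batch_size = len(imaging)
    total = 0.0
    for i_n, i_hat, g_n, g_hat in zip(imaging, imaging_hat, genetics, genetics_hat):
        total += lam1 * squared_l2(i_n, i_hat) + lam2 * squared_l2(g_n, g_hat)
    return total / batch_size


# Toy batch of two subjects with two imaging features and one genetic feature.
loss = reconstruction_loss(
    imaging=[[1.0, 2.0], [3.0, 4.0]],
    imaging_hat=[[1.0, 2.0], [3.0, 5.0]],
    genetics=[[0.0], [1.0]],
    genetics_hat=[[1.0], [1.0]],
    lam1=0.5,
    lam2=2.0,
)
```
]]></p></div>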
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Rank Consistent Ordinal Regression</head><p>Our autoencoder is coupled with an ordinal regression module to predict the level of disease severity. The regression module ensures that the latent embeddings and the dropout masks learn discriminative information from the data. a) Rank Consistent Prediction: Our regression module consists of a sequence of fully connected layers that predict the disease severity level from the latent embeddings. Mathematically, the prediction probabilities are calculated as</p><p>where $\hat{y}_n$ is the predicted class label, $Y(\cdot)$ denotes the fully connected layers parameterized by $W$, and $b_k$ are separate biases associated with each severity level. To ensure rank consistency among prediction probabilities (i.e., $P(\hat{y}_n = k) &gt; P(\hat{y}_n = k + 1)$), we need to ensure that $b_k &gt; b_{k+1}$. Previously, the work of <ref type="bibr">[19]</ref> has shown that rank consistency can be achieved using a multi-label cross entropy loss:</p><p>where $B$ is the batch size, y is a binary multi-class vector generated from $y_n$, and $\hat{y}_n$ is the predicted label.</p><p>Combining Eqs. (1-5), the GIRUS-net loss function is:</p><p>where $\lambda_1$, $\lambda_2$ capture the contribution of the reconstruction losses, $\lambda_3$ controls the contribution of the ordinal loss, and $\theta_m$ captures the relative contribution of the sparsity penalties.</p><p>b) Prediction on New Data: During testing, the imaging and genetic data are multiplied by the dropout probabilities and passed through the encoder. The latent encodings then pass through the ordinal regression module for disease severity prediction. Disease severity is predicted by</p></div>
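<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ordinal regression module described above can be sketched in a few lines. The sketch follows the general rank-consistent scheme of <ref type="bibr">[19]</ref> as commonly described: a label in $\{1, \dots, K\}$ is extended to $K-1$ binary targets, a single shared score is shifted by one bias per threshold, and the predicted level is one plus the number of thresholds passed. The indicator convention, the 0.5 decision threshold, and all function names are assumptions for illustration, not the paper's Eqs. (4) and (5).</p>
<p><![CDATA[
```python
import math


def extend_label(y, num_classes):
    """Map a severity label y in 1..K to K-1 binary targets: 1 if y exceeds level k."""
    return [1 if y > k else 0 for k in range(1, num_classes)]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def rank_probabilities(score, biases):
    """Shared scalar score plus one bias per threshold, squashed to probabilities."""
    return [sigmoid(score + b) for b in biases]


def multilabel_cross_entropy(probs, targets):
    """Sum of binary cross entropies over the K-1 thresholds."""
    eps = 1e-12  # guards against log(0)
    return -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1.0 - p, eps))
        for p, t in zip(probs, targets)
    )


def predict_severity(probs, threshold=0.5):
    """Predicted level is 1 plus the number of thresholds passed."""
    return 1 + sum(1 for p in probs if p > threshold)


# Decreasing biases give non-increasing threshold probabilities.
probs = rank_probabilities(score=0.3, biases=[2.0, 0.0, -2.0])
predicted = predict_severity(probs)
loss = multilabel_cross_entropy(probs, extend_label(3, num_classes=4))
```
]]></p>
<p>Because every threshold shares the same score and only the biases differ, ordered biases are enough to keep the threshold probabilities ordered, which is the property the text refers to as rank consistency.</p></div>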
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Implementation and Parameter Sweep</head><p>The model parameters, $\{\lambda_1, \lambda_2, \lambda_3, \lambda_4\}$, are selected so that individual loss terms lie within the same order of magnitude. This criterion is model agnostic and does not require us to optimize these weights. We weighted each sparsity penalty by $\theta_m$ to adjust for the difference in the number of features: $\theta_i = \frac{\text{number of genetic features}}{\text{number of imaging features}}$ and $\theta_g = 1$. The learning rate and batch size are fixed based on validation performance in a 10-fold cross validation setting. We perform a grid search over learning rate $\in [0.00001, 0.001]$ and batch size $\in [8, 128]$. For all experiments, we fixed the Bernoulli probability to $q = 0.0001$, temperature $t = 0.1$, and batch size = 32. Model parameters were set to $\lambda_1 = 0.0001$, $\lambda_2 = 0.001$, $\lambda_3 = 0.5$, and $\lambda_4 = 0.0001$, where $\lambda_1$, $\lambda_2$, $\lambda_3$ were scaled to be on the same scale as the sparsity penalty. The learning rate is 0.001 for the 2-class experiments and 0.0005 for the 3-class and 4-class experiments. Fig. <ref type="figure">1</ref> shows additional architecture details.</p></div>
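<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a small worked example of the sparsity weights defined above, the sketch below evaluates $\theta_i$ and $\theta_g$ for the feature counts reported in this paper (1165 SNPs and 68 cortical regions). The function name is hypothetical.</p>
<p><![CDATA[
```python
def sparsity_weights(num_genetic_features, num_imaging_features):
    """Weights that balance the sparsity penalties of the two modalities.

    theta_i is the ratio of genetic to imaging feature counts; theta_g is 1.
    """
    theta_i = num_genetic_features / num_imaging_features
    theta_g = 1.0
    return theta_i, theta_g


# Feature counts as reported in the data description: 1165 SNPs, 68 ROIs.
theta_i, theta_g = sparsity_weights(1165, 68)
print(round(theta_i, 2))  # prints 17.13
```
]]></p>
<p>The imaging penalty is therefore weighted about 17 times more heavily per feature, which compensates for the imaging branch having far fewer features than the genetics branch.</p></div>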
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Baseline Methods</head><p>We compare GIRUS-net with four standard baseline models that operate on the concatenated data modalities, i.e., $x = [i^T, g^T]$. Hyperparameters are fixed using a grid search approach in a 10-fold cross validation setting.</p><p>a) Random Forest Classifier (RF): Random Forest is an ensemble method that constructs decision trees and averages their outputs to provide a robust and accurate prediction. Feature importance is calculated as the mean decrease in information gain <ref type="bibr">[21]</ref> associated with each feature.</p><p>b) Support Vector Machine (SVM): Support vector machines create a hyperplane that maximally separates the data belonging to two different classes. Here, we extend the linear SVM by building $\frac{K(K-1)}{2}$ separate binary "one vs. one" classifiers to perform multi-class prediction <ref type="bibr">[22]</ref>. We construct the feature importance map by taking the mean of the absolute values of the weights of the linear kernels.</p></div>
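<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the one-vs-one count concrete, the short sketch below enumerates the class pairs for the four severity levels used in this paper. It only illustrates the pairing; it does not train any classifier, and the function name is hypothetical.</p>
<p><![CDATA[
```python
from itertools import combinations


def one_vs_one_pairs(class_labels):
    """All unordered class pairs; one binary classifier is trained per pair."""
    return list(combinations(class_labels, 2))


pairs = one_vs_one_pairs(["CN", "EMCI", "LMCI", "AD"])
num_classes = 4
expected_count = num_classes * (num_classes - 1) // 2
print(len(pairs), expected_count)  # prints 6 6
```
]]></p></div>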
<div xmlns="http://www.tei-c.org/ns/1.0"><head>c) Artificial Neural Network (ANN):</head><p>We train an ANN to perform classification based on the input $x$. ANNs can model complex patterns from the data, but usually lack feature explainability, particularly for deeper networks. Thus, the feature importance maps are calculated post hoc using Shapley Additive Explanations <ref type="bibr">[23]</ref>.</p><p>d) Ordered Logit Model (O-Logit): O-Logit is a generalized linear model that performs ordinal regression <ref type="bibr">[24]</ref>. Here, we learn a linear layer of weights $w$ and biases $b_1, \dots, b_{K-1}$. Similar to the implementation of GIRUS-net, we extend each label $y_n$ to $K-1$ binary labels.</p><p>The predicted probabilities for the severity levels are given by</p><p>We train O-Logit with the multi-label cross entropy loss in Eq. ( <ref type="formula">5</ref>). We use the learned weights $w$ as feature importances. We evaluate all the models for classification performance and biomarker explainability. The biomarkers are identified by assessing the feature importance maps. For GIRUS-net, the imaging and genetic feature importance maps are calculated from $p^i$ and $p^g$, and the classification predictions are the output of the ordinal regression branch.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. Evaluation Strategy</head><p>a) Model Prediction Evaluation: We perform 10 repeats of stratified 10-fold cross validation. We evaluate the classification performance based on accuracy, F1-score (F1), Cohen's kappa (Kappa) <ref type="bibr">[25]</ref>, recall, and precision.</p><p>b) Feature Importance Evaluation: The feature importance maps are evaluated based on sparsity and consistency across classification tasks. We rescale the feature importance maps to [0, 1]. For all baseline methods, we scale the imaging and genetic feature importance maps collectively because the imaging and genetic data are concatenated during training. For GIRUS-net, we scale the two feature maps separately, as they come from separate branches of the model.</p></div>
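<div xmlns="http://www.tei-c.org/ns/1.0"><p>The difference between collective and separate rescaling of the importance maps can be illustrated as follows. The sketch assumes min-max scaling, since the text states only that the maps are rescaled to [0, 1]; the function names are hypothetical.</p>
<p><![CDATA[
```python
def min_max_rescale(values):
    """Linearly map values to [0, 1]; a constant map is sent to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rescale_jointly(imaging_importance, genetic_importance):
    """Baseline convention: one scale shared by the concatenated modalities."""
    scaled = min_max_rescale(list(imaging_importance) + list(genetic_importance))
    split = len(imaging_importance)
    return scaled[:split], scaled[split:]


def rescale_separately(imaging_importance, genetic_importance):
    """GIRUS-net convention: each modality has its own scale."""
    return min_max_rescale(imaging_importance), min_max_rescale(genetic_importance)


joint = rescale_jointly([2.0, 4.0], [0.0, 1.0])
separate = rescale_separately([2.0, 4.0], [0.0, 1.0])
```
]]></p>
<p>Under joint scaling, a modality whose raw importances are uniformly small stays compressed near zero, whereas separate scaling stretches each modality to the full [0, 1] range. This should be kept in mind when comparing importance magnitudes between the baselines and GIRUS-net.</p></div>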
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. RESULTS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Data Credit</head><p>Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database <ref type="bibr">[26]</ref>. The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment and early Alzheimer's disease.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Data Preprocessing</head><p>The subjects in this study are drawn from the ADNI2/GO database. The subjects are classified as cognitive normal (CN), early mild cognitive impairment (EMCI), late mild cognitive impairment (LMCI), or mild Alzheimer's disease (AD) based on the ADNI2 protocol. Table <ref type="table">1</ref> summarizes the demographics of the 934 subjects who have both MRI and genetic data. The data are pre-processed and provided as a part of the TADPOLE challenge <ref type="bibr">[27]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>a) MRI Imaging Data:</head><p>The T1-weighted MRI imaging data are collected using a 3T scanner. The data are processed with gradient non-linearity correction, B1 non-uniformity correction, and peak sharpening <ref type="bibr">[28]</ref>. This study uses cross-sectional cortical thickness as imaging features, which are extracted via Freesurfer <ref type="bibr">[29]</ref>. The imaging features consist of 68 brain regions of interest (ROIs), with 34 features from the right hemisphere and 34 from the left, based on the Desikan-Killiany atlas <ref type="bibr">[30]</ref>. As an additional preprocessing step, we normalize all testing, validation, and training imaging data with the mean and standard deviation of the training data.</p><p>b) Genetics Data: In parallel, genotyping was done with GenomeStudio v2009.1 (Illumina). Quality control was performed using PLINK, resulting in 141912 Linkage Disequilibrium (LD) independent SNPs. We subselect 1165 SNPs by thresholding the p-value at $p \leq 0.001$ based on auxiliary genome-wide association study (GWAS) data <ref type="bibr">[31]</ref>.</p></div>
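<div xmlns="http://www.tei-c.org/ns/1.0"><p>The imaging normalization step described above, in which all splits are standardized with training-set statistics, can be sketched as follows. The use of the population standard deviation and the handling of constant features are assumptions made for illustration, and the function names are hypothetical.</p>
<p><![CDATA[
```python
import statistics


def fit_standardizer(train_rows):
    """Per-feature mean and standard deviation computed from training data only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: population standard deviation; the paper does not specify.
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def apply_standardizer(rows, means, stds):
    """Standardize any split with the training statistics; constant features map to 0."""
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


train = [[1.0, 10.0], [3.0, 30.0]]
test = [[5.0, 20.0]]
means, stds = fit_standardizer(train)
train_z = apply_standardizer(train, means, stds)
test_z = apply_standardizer(test, means, stds)
```
]]></p>
<p>Fitting the statistics on the training split alone keeps information from the validation and test subjects out of the preprocessing, so the cross-validated scores are not inflated by leakage.</p></div>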
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Classification Performance</head><p>Fig. <ref type="figure">2</ref> quantifies the classification performance of all the methods across the three experimental setups. Compared to the baselines, GIRUS-net consistently shows better or comparable performance across all the performance metrics. The confusion matrices in Fig. <ref type="figure">3</ref> further show that the baseline models fail to handle class imbalance. Especially in the 3-class and 4-class scenarios, the baselines tend to predict all subjects as the majority class(es). In comparison, GIRUS-net can successfully distribute its predictions across class labels. The improved performance suggests that GIRUS-net can extract discriminative patterns of disease severity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Feature Importance Sparsity and Consistency</head><p>The imaging and genetics biomarkers are identified from the mean feature importance maps across 10 repeats of 10-fold cross validation. As shown in Fig. <ref type="figure">4</ref>, the baseline models rely mainly on one modality for diagnosis: imaging for RF, SVM, and ANN; genetics for Ordered Logit. In comparison, GIRUS-net puts equal importance on both data modalities and extracts a sparse set of biomarkers.</p><p>Additionally, the low importance scores demonstrate that these baseline methods fail to jointly extract discriminative information from both data modalities. The superior performance of GIRUS-net suggests that the learnable dropout layers can selectively find brain and genetics biomarkers that are crucial for the downstream severity prediction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. Analysis of Imaging Biomarkers</head><p>We average the imaging feature importance maps learned by GIRUS-net across the three classification experiments. In Fig. <ref type="figure">5</ref>, we plot the top 10 regions onto the brain, with colors corresponding to the value of feature importance. Our model identifies brain regions including the lateral ventricle, medial temporal lobe, inferior temporal lobe, and parahippocampal gyrus. Correspondingly, AD is characterized by enlarged ventricles <ref type="bibr">[32]</ref> and loss of tissue in the inferior parietal gyrus <ref type="bibr">[33]</ref> and parahippocampal gyrus <ref type="bibr">[34]</ref>. Functionally, the hippocampus is crucial for episodic and spatial memory <ref type="bibr">[35]</ref>, which is affected by AD. Overall, GIRUS-net identifies brain regions with a high association to AD in the literature.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>F. Analysis of Genetics Biomarkers</head><p>Fig. <ref type="figure">6</ref> shows the mean of the genetic feature selection maps identified by the learnable dropout layer across the three classification experiments. A higher value indicates a genetic variant that contains discriminative information for all the classification tasks and is a potential AD risk locus. We annotate the top 10 SNPs and their overlapping or affected genes as listed in the Ensembl Variant Effect Predictor and the GWAS Catalog <ref type="bibr">[36]</ref>, <ref type="bibr">[37]</ref>. Our model identifies well-established Alzheimer's risk factors such as TOMM40 <ref type="bibr">[38]</ref> and APOE <ref type="bibr">[39]</ref> with high feature importance. Using the GTEx database, we identify the set of brain tissues where these genes show high expression levels. Aside from APOE and TOMM40, we identify NDUFA4, which plays a regulatory role in the expression of synaptophysin in the hippocampus; mutations in this gene could potentially lead to AD <ref type="bibr">[40]</ref>. Aside from already established genes, several intergenic and non-coding SNPs are also selected with high feature importance by GIRUS-net. Their association with AD is unknown. The explainability study demonstrates that the genetic biomarkers identified by GIRUS-net align with research findings and may assist in future genetic analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. CONCLUSIONS</head><p>In this paper, we introduce GIRUS-net, a deep learning framework for multi-modal fusion, biomarker extraction, and severity prediction for AD. Compared to prior work, we introduce a rank-consistent ordinal regression module that extracts discriminative features that have a progressive effect on disease severity. In a population study of AD, GIRUS-net successfully integrates imaging-genetic data for disease severity prediction. Compared to standard baselines, GIRUS-net extracts consistent, distinctive, and clinically relevant information from imaging and genetic features across various complex diagnosis tasks. In addition, GIRUS-net is not tied to any specific data modality; the flexible design allows the user to combine diverse data modalities and provide a comprehensive view of various diseases. </p></div>
		</body>
		</text>
</TEI>
