Integration of Multimodal Data from Disparate Sources for Identifying Disease Subtypes

Zhou, Kaiyue; Kottoori, Bhagya Shree; Munj, Seeya Awadhut; Zhang, Zhewei; Draghici, Sorin; Arslanturk, Suzan

doi:10.3390/biology11030360

Citation Details

Studies over the past decade have generated a wealth of molecular data that can be leveraged to better understand cancer risk, progression, and outcomes. However, because the disease is heterogeneous, analyzing data from a single modality is not sufficient to understand progression risk or to differentiate long- and short-term survivors. Using a scientifically developed and tested deep-learning approach that leverages aggregate information collected from multiple repositories with multiple modalities (e.g., mRNA, DNA methylation, miRNA) could lead to more accurate and robust prediction of disease progression. Here, we propose an autoencoder-based multimodal data fusion system in which a fusion encoder flexibly integrates the collective information available through multiple studies with partially coupled data. Our results on a fully controlled simulation-based study show that inferring the missing data through the proposed data fusion pipeline yields a predictor that is superior to other baseline predictors with missing modalities. The results further show that short- and long-term survivors of glioblastoma multiforme, acute myeloid leukemia, and pancreatic adenocarcinoma can be successfully differentiated with AUCs of 0.94, 0.75, and 0.96, respectively.
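For readers unfamiliar with the AUC metric quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed for a binary short- versus long-term survivor classifier. It is not the authors' evaluation code. The function name `auc`, the toy scores, and the label convention (1 = long-term survivor, 0 = short-term survivor) are assumptions made only for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case;
    tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted scores for six patients (not data from the paper).
# Assumed convention: label 1 = long-term survivor, 0 = short-term survivor.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))
```

In this toy example, 8 of the 9 positive/negative pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.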

Award ID(s): 1948338

PAR ID: 10480570

Author(s) / Creator(s): Zhou, Kaiyue; Kottoori, Bhagya Shree; Munj, Seeya Awadhut; Zhang, Zhewei; Draghici, Sorin; Arslanturk, Suzan

Publisher / Repository: MDPI

Date Published: 2022-03-01

Journal Name: Biology

Volume: 11

Issue: 3

ISSN: 2079-7737

Page Range / eLocation ID: 360

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.3390/biology11030360
