Measuring Disease Similarity Based on Multiple Heterogeneous Disease Information Networks

Ling Tian, Jianliang Gao

Quantifying the similarity between diseases plays an important role in biology and medicine, because it provides reliable reference information for finding similar diseases. Most previous methods for computing disease similarity either use a single data source or do not fully exploit multi-source data. In this study, we propose an approach that measures disease similarity using multiple heterogeneous disease information networks. First, multiple disease-related data sources are formulated as heterogeneous disease information networks, which include various types of objects such as diseases, pathways, and chemicals. Then, the corresponding subgraphs of these networks are obtained by filtering vertices. Topological scores and semantic scores are calculated in these heterogeneous subgraphs using the Dynamic Time Warping (DTW) algorithm and the meta-path method, respectively. In this way, we transform multiple heterogeneous disease networks into a single homogeneous disease network with weighted edges. Finally, the disease nodes are embedded according to these weights, and the similarity between diseases is calculated from the resulting n-dimensional vectors. Experiments on a benchmark set demonstrate the effectiveness of our method in measuring disease similarity from multi-source data.
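As a rough illustration of two building blocks named in the abstract, the sketch below computes a DTW distance between two numeric sequences and a cosine similarity between two embedding vectors. It is not the authors' implementation: the abstract does not say which sequences are compared or which similarity function is applied to the embeddings, so the inputs (for example, a sorted degree sequence per disease node), the choice of cosine similarity, and the function names `dtw_distance` and `cosine_similarity` are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance.

    Assumption: each disease node is summarized by a numeric sequence
    (e.g. a sorted degree sequence); the abstract does not specify this.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (hypothetical choice)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for zero vectors
    return dot / (norm_u * norm_v)
```

A smaller DTW distance would indicate more similar topological profiles, so in a pipeline like the one described it would need to be converted into a similarity score (for instance by a decreasing function) before being used as an edge weight; the exact conversion is not given in the abstract.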
