Accurate and efficient gene function prediction using a multi-bacterial network

Law, Jeffrey N; Kale, Shiv D; Murali, T M

doi:10.1093/bioinformatics/btaa885

Abstract

Motivation: Nearly 40% of the genes in sequenced genomes have no experimentally or computationally derived functional annotations. To fill this gap, we seek to develop methods for network-based gene function prediction that can integrate heterogeneous data for multiple species with experimentally based functional annotations and systematically transfer them to newly sequenced organisms on a genome-wide scale. However, the large sizes of such networks pose a challenge for the scalability of current methods.

Results: We develop a label propagation algorithm called FastSinkSource. By formally bounding its rate of progress, we decrease the running time by a factor of 100 without sacrificing accuracy. We systematically evaluate many approaches to construct multi-species bacterial networks and apply FastSinkSource and other state-of-the-art methods to these networks. We find that the most accurate and efficient approach is to pre-compute annotation scores for species with experimental annotations, and then to transfer them to other organisms. In this manner, FastSinkSource runs in under 3 min for 200 bacterial species.

Availability and implementation: An implementation of our framework and all data used in this research are available at https://github.com/Murali-group/multi-species-GOA-prediction.

Supplementary information: Supplementary data are available at Bioinformatics online.
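
The abstract describes label propagation over a gene network but gives no algorithmic detail. As a rough illustration only, the following is a minimal sketch of a generic sink-source style propagation, not the authors' FastSinkSource implementation (which is available at the repository linked above). The function name `propagate_labels`, the damping parameter `alpha`, the stopping tolerance, and the toy gene names are all assumptions made for this example.

```python
def propagate_labels(edges, positives, negatives, alpha=0.9,
                     max_iters=1000, tol=1e-6):
    """Generic sink-source style label propagation (illustrative sketch).

    edges: iterable of (u, v, weight) for an undirected weighted network.
    positives / negatives: sets of nodes whose scores stay fixed at 1 / 0.
    alpha: hypothetical damping factor applied to the neighbor average.
    """
    # Build a symmetric weighted adjacency map.
    neighbors = {}
    for u, v, w in edges:
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w

    # Labeled nodes act as fixed sources (1.0) and sinks (0.0).
    scores = {n: 0.0 for n in neighbors}
    for n in negatives:
        scores[n] = 0.0
    for n in positives:
        scores[n] = 1.0
    unlabeled = [n for n in neighbors
                 if n not in positives and n not in negatives]

    for _ in range(max_iters):
        new_scores = dict(scores)
        max_change = 0.0
        for n in unlabeled:
            total = sum(neighbors[n].values())
            # Damped weighted average of the neighbors' previous scores.
            avg = sum(w * scores[m] for m, w in neighbors[n].items()) / total
            new_scores[n] = alpha * avg
            max_change = max(max_change, abs(new_scores[n] - scores[n]))
        scores = new_scores
        # Stop once no unlabeled score moved by more than the tolerance.
        if max_change < tol:
            break
    return scores


# Toy path network: geneA (annotated) - geneB - geneC - geneD (negative).
toy_edges = [("geneA", "geneB", 1.0),
             ("geneB", "geneC", 1.0),
             ("geneC", "geneD", 1.0)]
toy_scores = propagate_labels(toy_edges, {"geneA"}, {"geneD"})
print(sorted(toy_scores.items()))
```

In this sketch, unlabeled genes closer to an annotated gene end up with higher scores than those closer to a negative example. The fixed tolerance is a simple stand-in for a stopping rule; the abstract's formal bound on the rate of progress is not reproduced here.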

Award ID(s): 1759858; 1817736; 1617678

PAR ID: 10279602

Author(s) / Creator(s): Law, Jeffrey N; Kale, Shiv D; Murali, T M

Editor(s): Cowen, Lenore

Date Published: 2020-10-16

Journal Name: Bioinformatics

Volume: 37

Issue: 6

ISSN: 1367-4803

Page Range / eLocation ID: 800 to 806

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1093/bioinformatics/btaa885
