ImputeCC enhances integrative Hi-C-based metagenomic binning through constrained random-walk-based imputation

Du, Yuxuan; Zuo, Wenxuan; Sun, Fengzhu

Metagenomic Hi-C (metaHi-C) reveals which contigs lie in physical proximity within the same cell, facilitating the reconstruction of high-quality metagenome-assembled genomes (MAGs) from complex microbial communities. However, current Hi-C-based contig binning methods rely solely on Hi-C interactions between contigs to group them, ignoring valuable biological information such as the presence of single-copy marker genes. Here, we introduce ImputeCC, an integrative contig binning tool tailored for metaHi-C datasets. ImputeCC integrates Hi-C interactions with the inherent discriminative power of single-copy marker genes, initially clustering contigs into preliminary bins, and introduces a new constrained random walk with restart (CRWR) algorithm to improve Hi-C connectivity among these contigs. Extensive evaluations on mock and real metaHi-C datasets from diverse environments, including the human gut, wastewater, cow rumen, and sheep gut, demonstrate that ImputeCC consistently outperforms other Hi-C-based contig binning tools. ImputeCC's genus-level analysis of the sheep gut microbiota further demonstrates its potential to recover essential species from dominant genera such as Bacteroides, detect previously unrecognized genera, and shed light on the characteristics and functional roles of genera such as Alistipes within the sheep gut ecosystem. Availability: ImputeCC is implemented in Python and available at https://github.com/dyxstat/ImputeCC. The Supplementary Information is available at https://doi.org/10.5281/zenodo.10776604.
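The abstract does not spell out the CRWR update rule, so the following is only a minimal sketch of a generic random walk with restart used to impute a contact matrix. It assumes, purely for illustration, that "constrained" means the walk is confined to a chosen subset of contigs; the function name `rwr_impute`, the `allowed` mask, and the `restart` value are hypothetical and are not taken from ImputeCC.

```python
import numpy as np


def rwr_impute(contacts, allowed, restart=0.5, tol=1e-8, max_iter=1000):
    """Generic random walk with restart, confined to the `allowed` contigs.

    Illustrative only: this is NOT the ImputeCC implementation.
    contacts: symmetric (n, n) array of Hi-C contact counts.
    allowed:  boolean mask of length n selecting contigs the walk may visit.
    """
    idx = np.flatnonzero(allowed)
    sub = contacts[np.ix_(idx, idx)].astype(float)

    # Row-normalise into a transition matrix; isolated contigs keep a zero row.
    row_sums = sub.sum(axis=1, keepdims=True)
    trans = np.divide(sub, row_sums, out=np.zeros_like(sub), where=row_sums > 0)

    # Iterate P <- (1 - r) * P @ T + r * I until the update is below `tol`.
    eye = np.eye(len(idx))
    probs = eye.copy()
    for _ in range(max_iter):
        updated = (1 - restart) * probs @ trans + restart * eye
        converged = np.abs(updated - probs).max() < tol
        probs = updated
        if converged:
            break

    # Symmetrise and write back; contigs outside `allowed` stay at zero.
    imputed = np.zeros_like(contacts, dtype=float)
    imputed[np.ix_(idx, idx)] = (probs + probs.T) / 2
    return imputed


# Toy example: contigs 0 and 2 share no direct contact but both touch contig 1.
contacts = np.array([
    [0, 5, 0, 0],
    [5, 0, 3, 2],
    [0, 3, 0, 4],
    [0, 2, 4, 0],
])
allowed = np.array([True, True, True, False])
imputed = rwr_impute(contacts, allowed)
```

In this toy matrix, contigs 0 and 2 have no observed contact, yet the imputed matrix assigns them a positive score because both connect to contig 1, while contig 3, which lies outside the mask, receives no imputed signal.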

Award ID(s): 2125142

PAR ID: 10558941

Author(s) / Creator(s): Du, Yuxuan; Zuo, Wenxuan; Sun, Fengzhu

Editor(s): Ma, Jian

Publisher / Repository: Springer Nature

Date Published: 2024-05-17

ISBN: 978-1071639887

Page Range / eLocation ID: 99-114

Subject(s) / Keyword(s): Metagenomic Hi-C · Integrative Contig Binning · MetaHi-C Contact Map Imputation · Constrained Random Walk With Restart

Sponsoring Org: National Science Foundation
