De novo diploid genome assembly using long noisy reads

Nie, Fan; Ni, Peng (ORCID:0000000208017574); Huang, Neng; Zhang, Jun; Wang, Zhenyu; Xiao, Chuanle (ORCID:0000000246800682); Luo, Feng (ORCID:0000000248132403); Wang, Jianxin (ORCID:0000000315160480)

doi:10.1038/s41467-024-47349-7

Abstract The high sequencing error rate has impeded the application of long noisy reads for diploid genome assembly. Most existing assemblers failed to generate high-quality phased assemblies using long noisy reads. Here, we present PECAT, a Phased Error Correction and Assembly Tool, for reconstructing diploid genomes from long noisy reads. We design a haplotype-aware error correction method that can retain heterozygote alleles while correcting sequencing errors. We combine a corrected read SNP caller and a raw read SNP caller to further improve the identification of inconsistent overlaps in the string graph. We use a grouping method to assign reads to different haplotype groups. PECAT efficiently assembles diploid genomes using Nanopore R9, PacBio CLR or Nanopore R10 reads only. PECAT generates more contiguous haplotype-specific contigs compared to other assemblers. Especially, PECAT achieves nearly haplotype-resolved assembly on B. taurus (Bison×Simmental) using Nanopore R9 reads and phase block NG50 with 59.4/58.0 Mb for HG002 using Nanopore R10 reads.

Award ID(s): 1759856; 2025541

PAR ID: 10499126

Author(s) / Creator(s): Nie, Fan; Ni, Peng; Huang, Neng; Zhang, Jun; Wang, Zhenyu; Xiao, Chuanle; Luo, Feng; Wang, Jianxin

Publisher / Repository: Nature Publishing Group

Date Published: 2024-04-05

Journal Name: Nature Communications

Volume: 15

Issue: 1

ISSN: 2041-1723

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1038/s41467-024-47349-7
