

Title: nPoRe: n-polymer realigner for improved pileup-based variant calling
Abstract

Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow cells. We show that read phasing and realignment can recover a significant portion of false negative INDELs. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated $$n$$-polymer sequences such as homopolymers ($$n=1$$) and tandem repeats ($$2 \le n \le 6$$). At the same precision, haplotype phasing improves INDEL recall from 63.76% to 70.66%, and nPoRe realignment improves it further to 73.04%.
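
To make the realignment idea concrete, below is a minimal Python sketch of affine-gap Needleman-Wunsch (Gotoh) scoring in which the gap-open penalty is discounted inside detected n-polymer repeats. It illustrates the general technique only: the penalty constants, the `in_repeat` heuristic, and all names are our assumptions, not nPoRe's learned penalties or API.

    # Minimal affine-gap Needleman-Wunsch (Gotoh) sketch with repeat-aware
    # gap penalties. Penalty values are hypothetical constants, not nPoRe's
    # learned scores; a cheaper gap-open inside repeats mimics the paper's idea.

    def in_repeat(seq, i, max_n=6, min_copies=3):
        """True if seq[i] starts an n-polymer run (unit length 1..max_n)
        of at least min_copies consecutive copies. Simple heuristic."""
        for n in range(1, max_n + 1):
            unit = seq[i:i + n]
            if len(unit) < n:
                break
            copies, j = 1, i + n
            while seq[j:j + n] == unit:
                copies, j = copies + 1, j + n
            if copies >= min_copies:
                return True
        return False

    def gap_open(ref, i, repeat_cost=2, default_cost=6):
        """Opening a gap inside a repeat is cheaper: INDELs are likelier there."""
        return repeat_cost if in_repeat(ref, i) else default_cost

    def align_cost(ref, read, mismatch=4, extend=1):
        """Minimum total alignment penalty of read vs ref (0 = perfect match)."""
        INF = float("inf")
        R, Q = len(ref), len(read)
        M = [[INF] * (Q + 1) for _ in range(R + 1)]  # match/mismatch state
        X = [[INF] * (Q + 1) for _ in range(R + 1)]  # gap in read (deletion)
        Y = [[INF] * (Q + 1) for _ in range(R + 1)]  # gap in ref (insertion)
        M[0][0] = 0
        for i in range(1, R + 1):
            X[i][0] = gap_open(ref, 0) + (i - 1) * extend
        for j in range(1, Q + 1):
            Y[0][j] = gap_open(ref, 0) + (j - 1) * extend
        for i in range(1, R + 1):
            go = gap_open(ref, i - 1)
            for j in range(1, Q + 1):
                sub = 0 if ref[i - 1] == read[j - 1] else mismatch
                M[i][j] = sub + min(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
                X[i][j] = min(M[i - 1][j] + go, X[i - 1][j] + extend)
                Y[i][j] = min(M[i][j - 1] + go, Y[i][j - 1] + extend)
        return min(M[R][Q], X[R][Q], Y[R][Q])

    # One fewer A in the read: the discounted in-repeat gap-open lets the DP
    # place the deletion inside the homopolymer run instead of forcing mismatches.
    print(align_cost("GATTAAAAAC", "GATTAAAAC"))  # penalty 2 (repeat gap-open)

Discounting gap opens inside repeats biases the dynamic program toward placing INDELs where sequencing errors and true variants both concentrate, which is the intuition behind the paper's n-polymer gap penalties.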

 
NSF-PAR ID: 10403258
Author(s) / Creator(s):
Publisher / Repository: Springer Science + Business Media
Date Published:
Journal Name: BMC Bioinformatics
Volume: 24
Issue: 1
ISSN: 1471-2105
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    A classical parking function of length $$n$$ is a list of positive integers $$(a_1, a_2, \ldots, a_n)$$ whose nondecreasing rearrangement $$b_1 \le b_2 \le \cdots \le b_n$$ satisfies $$b_i \le i$$. The convex hull of all parking functions of length $$n$$ is an $$n$$-dimensional polytope in $${\mathbb{R}}^n$$, which we refer to as the classical parking function polytope. Its geometric properties have been explored in Amanbayeva and Wang (Enumer Combin Appl 2(2):Paper No. S2R10, 10, 2022) in response to a question posed by Stanley (Amer Math Mon 127(6):563–571, 2020). We generalize this family of polytopes by studying the geometric properties of the convex hull of $${\textbf{x}}$$-parking functions for $${\textbf{x}} = (a, b, \dots, b)$$, which we refer to as $${\textbf{x}}$$-parking function polytopes. We explore connections between these $${\textbf{x}}$$-parking function polytopes, the Pitman–Stanley polytope, and the partial permutahedra of Heuer and Striker (SIAM J Discrete Math 36(4):2863–2888, 2022). In particular, we establish a closed-form expression for the volume of $${\textbf{x}}$$-parking function polytopes. This allows us to answer a conjecture of Behrend et al. (2022) and also obtain a new closed-form expression for the volume of the convex hull of classical parking functions as a corollary.
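
    The defining condition is easy to check directly; a minimal Python sketch (the function name is ours):

        def is_parking_function(a):
            """Sort the list and require b_i <= i (1-indexed) for the
            nondecreasing rearrangement (b_1, ..., b_n)."""
            return all(b <= i for i, b in enumerate(sorted(a), start=1))

        print(is_parking_function([2, 1, 1]))  # True: sorted (1, 1, 2) meets b_i <= i
        print(is_parking_function([3, 1, 3]))  # False: sorted (1, 3, 3) has b_2 = 3 > 2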

     
  2. Abstract

    A well-known open problem of Meir and Moser asks if the squares of sidelength $$1/n$$ for $$n \ge 2$$ can be packed perfectly into a rectangle of area $$\sum_{n=2}^\infty n^{-2} = \pi^2/6 - 1$$. In this paper we show that for any $$1/2 < t < 1$$, and any $$n_0$$ that is sufficiently large depending on $$t$$, the squares of sidelength $$n^{-t}$$ for $$n \ge n_0$$ can be packed perfectly into a square of area $$\sum_{n=n_0}^\infty n^{-2t}$$. This was previously known (if one packs a rectangle instead of a square) for $$1/2 < t \le 2/3$$ (in which case one can take $$n_0 = 1$$).
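
    The stated areas can be sanity-checked numerically; a small Python sketch with arbitrarily chosen truncation points and parameter values:

        import math

        # Total area of the squares of sidelength 1/n for n >= 2: pi^2/6 - 1.
        print(sum(n ** -2 for n in range(2, 10 ** 6)), math.pi ** 2 / 6 - 1)

        # Area of the target square in the theorem's regime, e.g. t = 0.75, n0 = 100.
        t, n0 = 0.75, 100
        print(sum(n ** (-2 * t) for n in range(n0, 10 ** 6)))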

     
  3. Abstract

    We prove that $${{\,\textrm{poly}\,}}(t) \cdot n^{1/D}$$-depth local random quantum circuits with two-qudit nearest-neighbor gates on a $$D$$-dimensional lattice with $$n$$ qudits are approximate $$t$$-designs in various measures. These include the “monomial” measure, meaning that the monomials of a random circuit from this family have expectation close to the value that would result from the Haar measure. Previously, the best bound was $${{\,\textrm{poly}\,}}(t) \cdot n$$ due to Brandão–Harrow–Horodecki (Commun Math Phys 346(2):397–434, 2016) for $$D=1$$. We also improve the “scrambling” and “decoupling” bounds for spatially local random circuits due to Brown and Fawzi (Scrambling speed of random quantum circuits, 2012). One consequence of our result is that assuming the polynomial hierarchy ($${{\,\mathrm{\textsf{PH}}\,}}$$) is infinite and that certain counting problems are $$\#{\textsf{P}}$$-hard “on average”, sampling within total variation distance from these circuits is hard for classical computers. Previously, exact sampling from the outputs of even constant-depth quantum circuits was known to be hard for classical computers under these assumptions. However, the standard strategy for extending this hardness result to approximate sampling requires the quantum circuits to have a property called “anti-concentration”, meaning roughly that the output has near-maximal entropy. Unitary 2-designs have the desired anti-concentration property. Our result improves the required depth for this level of anti-concentration from linear depth to a sub-linear value, depending on the geometry of the interactions. This is relevant to a recent experiment by the Google Quantum AI group to perform such a sampling task with 53 qubits on a two-dimensional lattice (Arute et al. in Nature 574(7779):505–510, 2019; Boixo et al. in Nat Phys 14(6):595–600, 2018) (and related experiments by USTC), and confirms their conjecture that $$O(\sqrt{n})$$ depth suffices for anti-concentration. The proof is based on a previous construction of $$t$$-designs by Brandão et al. (2016), an analysis of how approximate designs behave under composition, and an extension of the quasi-orthogonality of permutation operators developed by Brandão et al. (2016). Different versions of the approximate design condition correspond to different norms, and part of our contribution is to introduce the norm corresponding to anti-concentration and to establish equivalence between these various norms for low-depth circuits. For random circuits with long-range gates, we use different methods to show that anti-concentration happens at circuit size $$O(n \ln^2 n)$$, corresponding to depth $$O(\ln^3 n)$$. We also show a lower bound of $$\Omega(n \ln n)$$ for the size of such circuits in this case. We also prove that anti-concentration is possible in depth $$O(\ln n \ln \ln n)$$ (size $$O(n \ln n \ln \ln n)$$) using a different model.

     
  4. Abstract

    We consider the problem of covering multiple submodular constraints. Given a finite ground set $$N$$, a weight function $$w: N \rightarrow \mathbb{R}_+$$, $$r$$ monotone submodular functions $$f_1, f_2, \ldots, f_r$$ over $$N$$ and requirements $$k_1, k_2, \ldots, k_r$$, the goal is to find a minimum weight subset $$S \subseteq N$$ such that $$f_i(S) \ge k_i$$ for $$1 \le i \le r$$. We refer to this problem as Multi-Submod-Cover; it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR, arXiv:1808.03260, 2018), who were motivated by an application in geometry. Even with $$r=1$$, Multi-Submod-Cover generalizes the well-known Submodular Set Cover problem (Submod-SC), and it can also be easily reduced to Submod-SC. A simple greedy algorithm gives an $$O(\log(kr))$$ approximation where $$k = \sum_i k_i$$, and this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm for Multi-Submod-Cover that covers each constraint to within a factor of $$(1-1/e-\varepsilon)$$ while incurring an approximation of $$O(\frac{1}{\varepsilon}\log r)$$ in the cost. Second, we consider the special case when each $$f_i$$ is obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover (Partial-SC), covering integer programs (CIPs) and multiple vertex cover constraints (Bera et al., Theoret Comput Sci 555:2–8, 2014). Both these algorithms are based on mathematical programming relaxations that avoid the limitations of the greedy algorithm. We demonstrate the implications of our algorithms and related ideas to several applications ranging from geometric covering problems to clustering with outliers. Our work highlights the utility of the high-level model and the lens of submodularity in addressing this class of covering problems.
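
    The greedy baseline that the abstract improves upon can be sketched for the truncated-coverage special case. This is an illustrative reconstruction under assumed data structures, not the authors' algorithm or code:

        # Greedy sketch for Multi-Submod-Cover with truncated coverage functions
        # f_i(S) = min(k_i, #{e in S covering constraint i}): repeatedly pick the
        # element with the best residual-coverage-per-weight ratio.

        def greedy_multi_cover(weights, covers, k):
            """weights: {element: weight}; covers: {element: set of constraints};
            k: {constraint: required coverage}. Returns the chosen elements."""
            need = dict(k)                     # residual demand per constraint
            chosen, remaining = [], set(weights)
            while any(r > 0 for r in need.values()):
                def gain(e):
                    return sum(1 for c in covers[e] if need.get(c, 0) > 0)
                best = max(remaining, key=lambda e: gain(e) / weights[e])
                if gain(best) == 0:
                    raise ValueError("demands cannot be met by remaining elements")
                for c in covers[best]:
                    if need.get(c, 0) > 0:
                        need[c] -= 1
                chosen.append(best)
                remaining.remove(best)
            return chosen

        # Two unit-weight elements each covering one constraint beat one heavy
        # element covering both: greedy selects "u" and "v" (in either order).
        print(greedy_multi_cover(
            weights={"u": 1.0, "v": 1.0, "w": 3.0},
            covers={"u": {1}, "v": {2}, "w": {1, 2}},
            k={1: 1, 2: 1},
        ))

    The paper's two algorithms replace this combinatorial greedy rule with mathematical programming relaxations, which is what allows the improved bicriteria and truncated-coverage guarantees described above.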

     
  5. Abstract

    A long-standing problem in mathematical physics is the rigorous derivation of the incompressible Euler equation from Newtonian mechanics. Recently, Han-Kwan and Iacobelli (Proc Am Math Soc 149:3045–3061, 2021) showed that in the monokinetic regime, one can directly obtain the Euler equation from a system of $$N$$ particles interacting in $${\mathbb{T}}^d$$, $$d \ge 2$$, via Newton's second law through a supercritical mean-field limit. Namely, the coupling constant $$\lambda$$ in front of the pair potential, which is Coulombic, scales like $$N^{-\theta}$$ for some $$\theta \in (0,1)$$, in contrast to the usual mean-field scaling $$\lambda \sim N^{-1}$$. Assuming $$\theta \in (1-\frac{2}{d(d+1)}, 1)$$, they showed that the empirical measure of the system is effectively described by the solution to the Euler equation as $$N \rightarrow \infty$$. Han-Kwan and Iacobelli asked if their range for $$\theta$$ was optimal. We answer this question in the negative by showing the validity of the incompressible Euler equation in the limit $$N \rightarrow \infty$$ for $$\theta \in (1-\frac{2}{d}, 1)$$. Our proof is based on Serfaty's modulated-energy method, but compared to that of Han-Kwan and Iacobelli, crucially uses an improved “renormalized commutator” estimate to obtain the larger range for $$\theta$$. Additionally, we show that for $$\theta \le 1-\frac{2}{d}$$, one cannot, in general, expect convergence in the modulated energy notion of distance.

     