Models and information-theoretic bounds for nanopore sequencing

Mao, Wei; Diggavi, Suhas; Kannan, Sreeram

doi:10.1109/ISIT.2017.8006971

Nanopore sequencing is an emerging technology for sequencing DNA that can read long fragments (∼50,000 bases), unlike most current sequencers, which can read only hundreds of bases. While nanopore sequencers can acquire long reads, their high error rates (≈ 30%) pose a technical challenge. In a nanopore sequencer, a DNA strand is migrated through a nanopore and the resulting current variations are measured. The DNA sequence is inferred from this observed current pattern using an algorithm called a base-caller. In this paper, we propose a mathematical model for the “channel” from the input DNA sequence to the observed current, and calculate bounds on the information extraction capacity of the nanopore sequencer. This model incorporates impairments such as inter-symbol interference, deletions, and random response. The practical application of such information bounds is two-fold: (1) benchmarking present base-calling algorithms, and (2) offering an optimization objective for designing better nanopore sequencers.
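
As a rough illustration of the three impairments named in the abstract, the following Python sketch simulates a toy channel from a DNA string to a list of current samples. It is not the model from the paper: the k-mer window of length `k`, the `kmer_level` lookup, the independent deletion probability `p_del`, and the Gaussian noise with standard deviation `noise_std` are all assumptions chosen for simplicity.

```python
import random

BASES = "ACGT"


def kmer_level(kmer):
    """Hypothetical mean current for a k-mer: its base-4 value scaled to [0, 1).

    This lookup is an assumption made for illustration only; a real pore
    would have an empirically measured level for each k-mer.
    """
    value = 0
    for base in kmer:
        value = value * 4 + BASES.index(base)
    return value / 4 ** len(kmer)


def simulate_channel(seq, k=4, p_del=0.1, noise_std=0.05, seed=0):
    """Toy channel from a DNA string to a list of noisy current samples.

    - Inter-symbol interference: each sample depends on k adjacent bases.
    - Deletions: each k-mer event is dropped independently with prob. p_del.
    - Random response: Gaussian noise is added to the mean level.
    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    samples = []
    for i in range(len(seq) - k + 1):
        if rng.random() < p_del:
            continue  # this k-mer event is deleted and leaves no sample
        mean = kmer_level(seq[i:i + k])
        samples.append(mean + rng.gauss(0.0, noise_std))
    return samples


if __name__ == "__main__":
    print(simulate_channel("ACGTACGTTTGA"))
```

With `p_del=0.0` and `noise_std=0.0` the output reduces to one noiseless level per k-mer, so a sequence of length n yields n - k + 1 samples. Raising either parameter makes the input sequence progressively harder to recover from the samples.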

Award ID(s): 1705077; 1703403

PAR ID: 10058580

Author(s) / Creator(s): Mao, Wei; Diggavi, Suhas; Kannan, Sreeram

Date Published: 2017-06-01

Journal Name: IEEE International Symposium on Information Theory

Page Range / eLocation ID: 2458 to 2462

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper: https://doi.org/10.1109/ISIT.2017.8006971