Improved transcriptome assembly using a hybrid of long and short reads with StringTie

Shumate A, Wong B

doi:10.1371/journal.pcbi.1009730

Citation Details

Short-read RNA sequencing and long-read RNA sequencing each have their strengths and weaknesses for transcriptome assembly. While short reads are highly accurate, they are rarely able to span multiple exons. Long-read technology can capture full-length transcripts, but its relatively high error rate often leads to mis-identified splice sites. Here we present a new release of StringTie that performs hybrid-read assembly. By taking advantage of the strengths of both long and short reads, hybrid-read assembly with StringTie is more accurate than long-read only or short-read only assembly, and on some datasets it can more than double the number of correctly assembled transcripts, while obtaining substantially higher precision than the long-read data assembly alone. Here we demonstrate the improved accuracy on simulated data and real data from Arabidopsis thaliana, Mus musculus, and human. We also show that hybrid-read assembly is more accurate than correcting long reads prior to assembly while also being substantially faster. StringTie is freely available as open source software at https://github.com/gpertea/stringtie.

Award ID(s): 1759518

PAR ID: 10332985

Author(s) / Creator(s): Shumate A, Wong B

Date Published: 2022-06-01

Journal Name: PLoS computational biology

ISSN: 1553-7358

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1371/journal.pcbi.1009730
