ViralMSA: massively scalable reference-guided multiple sequence alignment of viral genomes

Moshiri, Niema

doi:10.1093/bioinformatics/btaa743

Abstract

Motivation: In molecular epidemiology, the identification of clusters of transmissions typically requires the alignment of viral genomic sequence data. However, existing methods of multiple sequence alignment (MSA) scale poorly with respect to the number of sequences.

Results: ViralMSA is a user-friendly reference-guided MSA tool that leverages the algorithmic techniques of read mappers to enable the MSA of ultra-large viral genome datasets. It scales linearly with the number of sequences, and it is able to align tens of thousands of full viral genomes in seconds. However, alignments produced by ViralMSA omit insertions with respect to the reference genome.

Availability and implementation: ViralMSA is freely available at https://github.com/niemasd/ViralMSA as an open-source software project.

Contact: niema@ucsd.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
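The reference-guided approach described in the abstract can be illustrated with a small sketch. It is not taken from the ViralMSA source code. It assumes each genome has already been mapped to the reference independently by some read mapper that reports a 0-based start position and a SAM-style CIGAR string, and the function name `project_onto_reference` is hypothetical. Each sequence is handled on its own, which is why the total work grows linearly with the number of sequences, and every output row has exactly the reference length, which is why insertions relative to the reference are dropped.

```python
import re

# SAM-style CIGAR tokens, e.g. "3M2D3M" -> [("3", "M"), ("2", "D"), ("3", "M")]
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def project_onto_reference(query, cigar, ref_start, ref_len):
    """Project one mapped sequence onto reference coordinates.

    query     -- the sequence that was mapped (hypothetical input)
    cigar     -- SAM-style CIGAR string describing the mapping
    ref_start -- 0-based reference position of the first aligned base
    ref_len   -- length of the reference genome
    Returns a string of exactly ref_len characters (one MSA row).
    """
    row = ["-"] * ref_start  # gap columns before the mapping starts
    q = 0                    # cursor into the query sequence
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "M=X":      # aligned bases: copy them into the row
            row.extend(query[q:q + n])
            q += n
        elif op in "DN":     # deletion relative to reference: gap columns
            row.extend("-" * n)
        elif op in "IS":     # insertion or soft clip: skip query bases,
            q += n           # so they never appear in the output row
        # H and P consume neither query nor reference bases
    row.extend("-" * (ref_len - len(row)))  # gap columns after the mapping
    return "".join(row)


def reference_guided_msa(mappings, ref_len):
    """Build all rows; one independent pass per sequence (linear in count)."""
    return [project_onto_reference(q, c, s, ref_len) for q, c, s in mappings]


# Toy example against an 8-base reference "ACGTACGT" (made-up data).
rows = reference_guided_msa(
    [
        ("ACGTACGT", "8M", 0),    # identical to the reference
        ("CGAATA", "2M2I2M", 1),  # 2-base insertion, which is dropped
        ("ACGCGT", "3M2D3M", 0),  # 2-base deletion, shown as gaps
    ],
    ref_len=8,
)
```

In this toy input the inserted bases `AA` of the second sequence do not appear in its row, which mirrors the limitation stated in the abstract: columns exist only for reference positions.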

Award ID(s): 2028040

PAR ID: 10219871

Author(s) / Creator(s): Moshiri, Niema

Editor(s): Robinson, Peter

Date Published: 2020-08-19

Journal Name: Bioinformatics

ISSN: 1367-4803

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1093/bioinformatics/btaa743
