vAMPirus: A versatile amplicon processing and analysis program for studying viruses

Veglia, Alex J (ORCID: 0000-0003-3118-5127); Rivera‐Vicéns, Ramón E (ORCID: 0000-0002-6229-3537); Grupstra, Carsten G B (ORCID: 0000-0001-5083-4570); Howe‐Kerr, Lauren I (ORCID: 0000-0002-8086-5869); Correa, Adrienne M S (ORCID: 0000-0003-0137-5042)

doi:10.1111/1755-0998.13978

Abstract: Amplicon sequencing is an effective and increasingly applied method for studying viral communities in the environment. Here, we present vAMPirus, a user‐friendly, comprehensive, and versatile DNA and RNA virus amplicon sequence analysis program, designed to support investigators in exploring virus amplicon sequencing data and running informed, reproducible analyses. vAMPirus takes raw virus amplicon libraries as input and, by default, performs nucleotide‐ and amino acid‐based analyses to produce results such as sequence abundance information, taxonomic classifications, phylogenies and community diversity metrics. The vAMPirus analytical framework leverages 16 different open‐source tools and provides optional approaches that can increase the ratio of biological signal to noise and thereby reveal patterns that would otherwise have been masked. Here, we validate the vAMPirus analytical framework and illustrate its implementation as a general virus amplicon sequencing workflow by recapitulating findings from two previously published double‐stranded DNA virus datasets. As a case study, we also apply the program to explore the diversity and distribution of a coral reef‐associated RNA virus. vAMPirus is streamlined within Nextflow, offering straightforward scalability, standardization and communication of virus lineage‐specific analyses. The vAMPirus framework is designed to be adaptable; community‐driven analytical standards will continue to be incorporated as the field advances. vAMPirus supports researchers in revealing patterns of virus diversity and population dynamics in nature, while promoting study reproducibility and comparability.
