EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts

Lo, Chien-Chi (ORCID:0000000334701078); Shakya, Migun (ORCID:000000033876691X); Connor, Ryan; Davenport, Karen; Flynn, Mark; Gutiérrez, Adán Myers y. (ORCID:0000000234571632); Hu, Bin (ORCID:0000000202788466); Li, Po-E; Jackson, Elais Player; Xu, Yan; Chain, Patrick S. G.; Alkan, Can (ed.)

doi:10.1093/bioinformatics/btac176

Abstract

Summary

Genomics has become an essential technology for surveilling emerging infectious disease outbreaks. Laboratories worldwide use a range of technologies and strategies for pathogen genome enrichment and sequencing, together with different, and sometimes ad hoc, analytical procedures for generating genome sequences. A fully integrated analytical process that takes raw sequence data to a consensus genome, suited to outbreaks such as the ongoing COVID-19 pandemic, is critical to provide a solid genomic basis for epidemiological analyses and well-informed decision making. We have developed EDGE COVID-19 (EC-19), a web-based platform with integrated bioinformatic workflows that helps provide consistent, high-quality analysis of SARS-CoV-2 sequencing data generated on either Illumina or Oxford Nanopore Technologies (ONT) platforms. Through an intuitive web-based interface, the workflow automates data quality control, reference-based SARS-CoV-2 variant and consensus calling, and lineage determination, and it provides the ability to submit the consensus sequence and necessary metadata to GenBank, GISAID and INSDC raw data repositories. We tested workflow usability using real-world data, validated the accuracy of variant and lineage analysis using several test datasets, and performed detailed comparisons with results from the COVID-19 Galaxy Project workflow. Our analyses indicate that EC-19 workflows generate high-quality SARS-CoV-2 genomes. Finally, we share a perspective on the patterns observed with Illumina versus ONT data and their impact on where the workflows agree and differ.
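As a rough illustration of what reference-based variant and consensus calling involves, the sketch below derives a consensus sequence and a list of substitutions from per-position base counts. It is not the EC-19 implementation: the function name `call_consensus`, the input layout, and the thresholds `min_depth` and `min_fraction` are assumptions chosen for this example, and indels are ignored.

```python
def call_consensus(reference, pileup, min_depth=10, min_fraction=0.6):
    """Toy reference-based consensus caller (illustrative only).

    reference    : reference genome as a string
    pileup       : one dict per reference position, mapping base -> read count
    min_depth    : assumed minimum coverage needed to call a base
    min_fraction : assumed minimum share of reads supporting the called base
    """
    consensus = []
    variants = []
    for pos, (ref_base, counts) in enumerate(zip(reference, pileup), start=1):
        depth = sum(counts.values())
        if depth < min_depth:
            consensus.append("N")  # too little coverage: mask the position
            continue
        base, count = max(counts.items(), key=lambda kv: kv[1])
        if count / depth < min_fraction:
            consensus.append("N")  # no base is clearly dominant: mask
            continue
        consensus.append(base)
        if base != ref_base:
            variants.append((pos, ref_base, base))  # 1-based substitution
    return "".join(consensus), variants


reference = "ACGT"
pileup = [
    {"A": 20},          # matches the reference
    {"C": 2, "T": 18},  # well-supported C->T substitution
    {"G": 3},           # below the depth threshold
    {"T": 6, "A": 6},   # evenly split, so ambiguous
]
consensus, variants = call_consensus(reference, pileup)
print(consensus, variants)
```

Masking low-coverage or ambiguous positions with `N` keeps uncertain calls out of a genome intended for submission; production workflows apply additional filters (base quality, strand bias, indel handling) that this sketch omits.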

Availability and implementation

https://edge-covid19.edgebioinformatics.org, and https://github.com/LANL-Bioinformatics/EDGE/tree/SARS-CoV2.

Supplementary information

Supplementary data are available at Bioinformatics online.
