Title: The ViReflow pipeline enables user friendly large scale viral consensus genome reconstruction
Abstract: Throughout the COVID-19 pandemic, massive sequencing and data sharing efforts enabled the real-time surveillance of novel SARS-CoV-2 strains throughout the world, the results of which provided public health officials with actionable information to prevent the spread of the virus. However, with great sequencing comes great computation, and while cloud computing platforms bring high-performance computing directly into the hands of all who seek it, optimal design and configuration of a cloud compute cluster requires significant system administration expertise. We developed ViReflow, a user-friendly viral consensus sequence reconstruction pipeline enabling rapid analysis of viral sequence datasets leveraging Amazon Web Services (AWS) cloud compute resources and the Reflow system. ViReflow was developed specifically in response to the COVID-19 pandemic, but it is general to any viral pathogen. Importantly, when utilized with sufficient compute resources, ViReflow can trim, map, call variants, and call consensus sequences from amplicon sequence data from 1000 SARS-CoV-2 samples at 1000X depth in < 10 min, with no user intervention. ViReflow’s simplicity, flexibility, and scalability make it an ideal tool for viral molecular epidemiological efforts.
Award ID(s):
2028040 2038509
PAR ID:
10364337
Publisher / Repository:
Nature Publishing Group
Journal Name:
Scientific Reports
Volume:
12
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Pettigrew, Melinda M. (Ed.)
    Viral genome sequencing has guided our understanding of the spread and extent of genetic diversity of SARS-CoV-2 during the COVID-19 pandemic. SARS-CoV-2 viral genomes are usually sequenced from nasopharyngeal swabs of individual patients to track viral spread. Recently, RT-qPCR of municipal wastewater has been used to quantify the abundance of SARS-CoV-2 in several regions globally. However, metatranscriptomic sequencing of wastewater can be used to profile the viral genetic diversity across infected communities. Here, we sequenced RNA directly from sewage collected by municipal utility districts in the San Francisco Bay Area to generate complete and nearly complete SARS-CoV-2 genomes. The major consensus SARS-CoV-2 genotypes detected in the sewage were identical to clinical genomes from the region. Using a pipeline for single nucleotide variant calling in a metagenomic context, we characterized minor SARS-CoV-2 alleles in the wastewater and detected viral genotypes which were also found within clinical genomes throughout California. Observed wastewater variants were more similar to local California patient-derived genotypes than they were to those from other regions within the United States or globally. Additional variants detected in wastewater have only been identified in genomes from patients sampled outside California, indicating that wastewater sequencing can provide evidence for recent introductions of viral lineages before they are detected by local clinical sequencing. These results demonstrate that epidemiological surveillance through wastewater sequencing can aid in tracking exact viral strains in an epidemic context.
  2. The coronavirus disease 2019 (COVID-19) pandemic challenged the workings of human society, but in doing so, it advanced our understanding of the ecology and evolution of infectious diseases. Fluctuating transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) demonstrated the highly dynamic nature of human social behavior, often without government intervention. Evolution of SARS-CoV-2 in the first two years following spillover resulted primarily in increased transmissibility, while in the third year, the globally dominant virus variants had all evolved substantial immune evasion. The combination of viral evolution and the buildup of host immunity through vaccination and infection greatly decreased the realized virulence of SARS-CoV-2 due to the age dependence of disease severity. The COVID-19 pandemic was exacerbated by presymptomatic, asymptomatic, and highly heterogeneous transmission, as well as highly variable disease severity and the broad host range of SARS-CoV-2. Insights and tools developed during the COVID-19 pandemic could provide a stronger scientific basis for preventing, mitigating, and controlling future pandemics. 
  3. The sequencing of human virus genomes from wastewater samples is an efficient method for tracking viral transmission and evolution at the community level. However, this requires the recovery of viral nucleic acids of high quality. We developed a reusable tangential-flow filtration system to concentrate and purify viruses from wastewater for genome sequencing. A pilot study was conducted with 94 wastewater samples from four local sewersheds, from which viral nucleic acids were extracted, and the whole genome of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was sequenced using the ARTIC V4.0 primers. Our method yielded a high probability (0.9) of recovering complete or near-complete SARS-CoV-2 genomes (>90% coverage at 10× depth) from wastewater when the COVID-19 incidence rate exceeded 33 cases per 100 000 people. The relative abundances of sequenced SARS-CoV-2 variants followed the trends observed from patient-derived samples. We also identified SARS-CoV-2 lineages in wastewater that were underrepresented or not present in the clinical whole-genome sequencing data. The developed tangential-flow filtration system can be easily adopted for the sequencing of other viruses in wastewater, particularly those at low concentrations.
  4. The transmission and evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are of paramount importance in controlling and combating the coronavirus disease 2019 (COVID-19) pandemic. Currently, over 15,000 SARS-CoV-2 single mutations have been recorded, which have a great impact on the development of diagnostics, vaccines, antibody therapies, and drugs. However, little is known about SARS-CoV-2’s evolutionary characteristics and general trend. In this work, we present a comprehensive genotyping analysis of existing SARS-CoV-2 mutations. We reveal that host immune response via APOBEC and ADAR gene editing gives rise to near 65% of recorded mutations. Additionally, we show that children under age five and the elderly may be at high risk from COVID-19 because of their overreaction to the viral infection. Moreover, we uncover that populations of Oceania and Africa react significantly more intensively to SARS-CoV-2 infection than those of Europe and Asia, which may explain why African Americans were shown to be at increased risk of dying from COVID-19, in addition to their high risk of COVID-19 infection caused by systemic health and social inequities. Finally, our study indicates that for two viral genome sequences of the same origin, their evolution order may be determined from the ratio of mutation type, C > T over T > C.
  5. The COVID-19 pandemic has prompted an unprecedented global effort to understand and mitigate the spread of the SARS-CoV-2 virus. In this study, we present a comprehensive analysis of COVID-19 in Western New York (WNY), integrating individual patient-level genomic sequencing data with a spatially informed agent-based disease Susceptible-Exposed-Infectious-Recovered (SEIR) computational model. The integration of genomic and spatial data enables a multi-faceted exploration of the factors influencing the transmission patterns of COVID-19, including genetic variations in the viral genomes, population density, and movement dynamics in New York State (NYS). Our genomic analyses provide insights into the genetic heterogeneity of SARS-CoV-2 within a single lineage, at region-specific resolutions, while our population analyses provide models for SARS-CoV-2 lineage transmission. Together, our findings shed light on localized dynamics of the pandemic, revealing potential cross-county transmission networks. This interdisciplinary approach, bridging genomics and spatial modeling, contributes to a more comprehensive understanding of COVID-19 dynamics. The results of this study have implications for future public health strategies, including guiding targeted interventions and resource allocations to control the spread of similar viruses. 