
Title: Towards quantitative viromics for both double-stranded and single-stranded DNA viruses

Viruses strongly influence microbial population dynamics and ecosystem functions. However, our ability to quantitatively evaluate those viral impacts is limited to the few cultivated viruses and double-stranded DNA (dsDNA) viral genomes captured in quantitative viral metagenomes (viromes). This leaves the ecology of non-dsDNA viruses nearly unknown, including single-stranded DNA (ssDNA) viruses that have been frequently observed in viromes, but not quantified due to amplification biases in sequencing library preparations (Multiple Displacement Amplification, Linker Amplification or Tagmentation).


Here we designed mock viral communities including both ssDNA and dsDNA viruses to evaluate the capability of a sequencing library preparation approach including an Adaptase step prior to Linker Amplification for quantitative amplification of both dsDNA and ssDNA templates. We then surveyed aquatic samples to provide first estimates of the abundance of ssDNA viruses.


Mock community experiments confirmed the biased nature of existing library preparation methods for ssDNA templates (either largely enriched or selected against) and showed that the protocol using Adaptase plus Linker Amplification yielded viromes that were ±1.8-fold quantitative for ssDNA and dsDNA viruses. Application of this protocol to community virus DNA from three freshwater and three marine samples revealed that ssDNA viruses as a whole represent only a minor fraction (<5%) of DNA virus communities, though individual ssDNA genomes, both eukaryote-infecting Circular Rep-Encoding Single-Stranded DNA (CRESS-DNA) viruses and bacteriophages from the Microviridae family, can be among the most abundant viral genomes in a sample.


Together these findings provide empirical data for a new virome library preparation protocol, and a first estimate of ssDNA virus abundance in aquatic systems.

Sponsoring Org:
National Science Foundation
More Like this
  1. Sandri-Goldin, Rozanne M. (Ed.)
    ABSTRACT Most icosahedral viruses condense their genomes into volumetrically constrained capsids. However, concurrent genome biosynthesis and packaging are specific to single-stranded DNA (ssDNA) viruses. ssDNA genome packaging combines elements found in both double-stranded DNA (dsDNA) and ssRNA systems. Similar to dsDNA viruses, the genome is packaged into a preformed capsid. Like ssRNA viruses, there are numerous capsid-genome associations. In ssDNA microviruses, the DNA-binding protein J guides the genome between 60 icosahedrally ordered DNA binding pockets. It also partially neutralizes the DNA’s negative phosphate backbone. ϕX174-related microviruses, such as G4 and α3, have J proteins that differ in length and charge organization. This suggests that interchanging J proteins could alter the path used to guide DNA in the capsid. Previously, a ϕXG4J chimera, in which the ϕX174 J gene was replaced with the G4 gene, was characterized. It displayed lethal packaging defects, which resulted in procapsids being removed from productive assembly. Here, we report the characterization of another inviable chimera, ϕXα3J. Unlike ϕXG4J, ϕXα3J efficiently packaged DNA but produced noninfectious particles. These particles displayed a reduced ability to attach to host cells, suggesting that internal DNA organization could distort the capsid’s outer surface. Mutations that restored viability altered J-coat protein contact sites. These results provide evidence that the organization of ssDNA can affect both packaging and postpackaging phenomena. IMPORTANCE ssDNA viruses utilize icosahedrally ordered protein-nucleic acids interactions to guide and organize their genomes into preformed shells. As previously demonstrated, chaotic genome-capsid associations can inhibit ϕX174 packaging by destabilizing packaging complexes. However, the consequences of poorly organized genomes may extend beyond the packaging reaction. As demonstrated herein, it can lead to uninfectious packaged particles. Thus, ssDNA genomes should be considered an integral and structural virion component, affecting the properties of the entire particle, which includes the capsid’s outer surface.
  2. Phages (viruses that infect bacteria) play important roles in the gut ecosystem through infection of bacterial hosts, yet the gut virome remains poorly characterized. Mammalian gut viromes are dominated by double-stranded DNA (dsDNA) phages belonging to the order Caudovirales and single-stranded DNA (ssDNA) phages belonging to the family Microviridae. Since the relative proportion of each of these phage groups appears to correlate with age and health status in humans, it is critical to understand both ssDNA and dsDNA phages in the gut. Building upon prior research describing dsDNA viruses in the gut of Ciona robusta, a marine invertebrate model system used to study gut microbial interactions, this study investigated ssDNA phages found in the Ciona gut. We identified 258 Microviridae genomes, which were dominated by novel members of the Gokushovirinae subfamily, but also represented several proposed phylogenetic groups (Alpavirinae, Aravirinae, Group D, Parabacteroides prophages, and Pequeñovirus) and a novel group. Comparative analyses between Ciona specimens with full and cleared guts, as well as the surrounding water, indicated that Ciona retains a distinct and highly diverse community of ssDNA phages. This study significantly expands the known diversity within the Microviridae family and demonstrates the promise of Ciona as a model system for investigating their role in animal health.
  3. Abstract Microbial communities are critical to ecosystem dynamics and biogeochemical cycling in the open oceans. Viruses are essential elements of these communities, influencing the productivity, diversity, and evolution of cellular hosts. To further explore the natural history and ecology of open-ocean viruses, we surveyed the spatiotemporal dynamics of double-stranded DNA (dsDNA) viruses in both virioplankton and bacterioplankton size fractions in the North Pacific Subtropical Gyre, one of the largest biomes on the planet. Assembly and clustering of viral genomes revealed a peak in virioplankton diversity at the base of the euphotic zone, where virus populations and host species richness both reached their maxima. Simultaneous characterization of both extracellular and intracellular viruses suggested depth-specific reproductive strategies. In particular, analyses indicated elevated lytic interactions in the mixed layer, more temporally variable temperate phage interactions at the base of the euphotic zone, and increased lysogeny in the mesopelagic ocean. Furthermore, the depth variability of auxiliary metabolic genes suggested habitat-specific strategies for viral influence on light-energy, nitrogen, and phosphorus acquisition during host infection. Most virus populations were temporally persistent over several years in this environment at the 95% nucleic acid identity level. In total, our analyses revealed variable distributional patterns and diverse reproductive and metabolic strategies of virus populations in the open-ocean water column.
  4. Abstract BACKGROUND

    Despite widespread interest in next-generation sequencing (NGS), the adoption of personalized clinical genomics and mutation profiling of cancer specimens is lagging, in part because of technical limitations. Tumors are genetically heterogeneous and often contain normal/stromal cells, features that lead to low-abundance somatic mutations that generate ambiguous results or reside below NGS detection limits, thus hindering the clinical sensitivity/specificity standards of mutation calling. We applied COLD-PCR (coamplification at lower denaturation temperature PCR), a PCR methodology that selectively enriches variants, to improve the detection of unknown mutations before NGS-based amplicon resequencing.


    We used both COLD-PCR and conventional PCR (for comparison) to amplify serially diluted mutation-containing cell-line DNA diluted into wild-type DNA, as well as DNA from lung adenocarcinoma and colorectal cancer samples. After amplification of TP53 (tumor protein p53), KRAS (v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog), IDH1 [isocitrate dehydrogenase 1 (NADP+), soluble], and EGFR (epidermal growth factor receptor) gene regions, PCR products were pooled for library preparation, bar-coded, and sequenced on the Illumina HiSeq 2000.


    In agreement with recent findings, sequencing errors by conventional targeted-amplicon approaches dictated a mutation-detection limit of approximately 1%–2%. Conversely, COLD-PCR amplicons enriched mutations above the error-related noise, enabling reliable identification of mutation abundances of approximately 0.04%. Sequencing depth was not a large factor in the identification of COLD-PCR–enriched mutations. For the clinical samples, several missense mutations were not called with conventional amplicons, yet they were clearly detectable with COLD-PCR amplicons. Tumor heterogeneity for the TP53 gene was apparent.


    As cancer care shifts toward personalized intervention based on each patient's unique genetic abnormalities and tumor genome, we anticipate that COLD-PCR combined with NGS will elucidate the role of mutations in tumor progression, enabling NGS-based analysis of diverse clinical specimens within clinical practice.

  5. Background Viruses influence global patterns of microbial diversity and nutrient cycles. Though viral metagenomics (viromics), specifically targeting dsDNA viruses, has been critical for revealing viral roles across diverse ecosystems, its analyses differ in many ways from those used for microbes. To date, viromics benchmarking has covered read pre-processing, assembly, relative abundance, read mapping thresholds and diversity estimation, but other steps would benefit from benchmarking and standardization. Here we use in silico-generated datasets and an extensive literature survey to evaluate and highlight how dataset composition (i.e., viromes vs bulk metagenomes) and assembly fragmentation impact (i) viral contig identification tool, (ii) virus taxonomic classification, and (iii) identification and curation of auxiliary metabolic genes (AMGs). Results The in silico benchmarking of five commonly used virus identification tools show that gene-content-based tools consistently performed well for long (≥3 kbp) contigs, while k-mer- and blast-based tools were uniquely able to detect viruses from short (≤3 kbp) contigs. Notably, however, the performance increase of k-mer- and blast-based tools for short contigs was obtained at the cost of increased false positives (sometimes up to ∼5% for virome and ∼75% bulk samples), particularly when eukaryotic or mobile genetic element sequences were included in the test datasets. For viral classification, variously sized genome fragments were assessed using gene-sharing network analytics to quantify drop-offs in taxonomic assignments, which revealed correct assignations ranging from ∼95% (whole genomes) down to ∼80% (3 kbp sized genome fragments). A similar trend was also observed for other viral classification tools such as VPF-class, ViPTree and VIRIDIC, suggesting that caution is warranted when classifying short genome fragments and not full genomes. Finally, we highlight how fragmented assemblies can lead to erroneous identification of AMGs and outline a best-practices workflow to curate candidate AMGs in viral genomes assembled from metagenomes. Conclusion Together, these benchmarking experiments and annotation guidelines should aid researchers seeking to best detect, classify, and characterize the myriad viruses ‘hidden’ in diverse sequence datasets.