-
Abstract Soil microorganisms are pivotal in the global carbon cycle, but the viruses that affect them and their impact on ecosystems are less understood. In this study, we explored the diversity, dynamics, and ecology of soil viruses through 379 metagenomes collected annually from 2010 to 2017. These samples spanned the seasonally thawed active layer of a permafrost thaw gradient, which included palsa, bog, and fen habitats. We identified 5051 virus operational taxonomic units (vOTUs), doubling the known viruses for this site. These vOTUs were largely ephemeral within habitats, suggesting a turnover at the vOTU level from year to year. While the diversity varied by thaw stage and depth‐related patterns were specific to each habitat, the virus communities did not significantly change over time. The abundance ratios of virus to host at the phylum level did not show consistent trends across the thaw gradient, depth, or time. To assess potential ecosystem impacts, we predicted hosts in silico and found viruses linked to microbial lineages involved in the carbon cycle, such as methanotrophy and methanogenesis. This included the identification of viruses of Candidatus Methanoflorens, a significant global methane contributor. We also detected a variety of potential auxiliary metabolic genes, including 24 carbon‐degrading glycoside hydrolases, six of which are uniquely terrestrial. In conclusion, these long‐term observations enhance our understanding of soil viruses in the context of climate‐relevant processes and provide opportunities to explore their role in terrestrial carbon cycling.
-
Abstract Our knowledge of viral sequence space has exploded with advancing sequencing technologies and large-scale sampling and analytical efforts. Though archaea are important and abundant prokaryotes in many systems, our knowledge of archaeal viruses outside of extreme environments is limited. This largely stems from the lack of a robust, high-throughput, and systematic way to distinguish between bacterial and archaeal viruses in datasets of curated viruses. Here we upgrade our prior text-based tool (MArVD) via training and testing a random forest machine learning algorithm against a newly curated dataset of archaeal viruses. After optimization, MArVD2 presented a significant improvement over its predecessor in terms of scalability, usability, and flexibility, and will allow user-defined custom training datasets as archaeal virus discovery progresses. Benchmarking showed that a model trained with viral sequences from the hypersaline, marine, and hot spring environments correctly classified 85% of the archaeal viruses with a false detection rate below 2% using a random forest prediction threshold of 80% in a separate benchmarking dataset from the same habitats.
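As a rough illustration of how a prediction threshold trades off the two figures quoted above, the following Python sketch applies a cutoff to classifier probabilities and reports the fraction of true archaeal viruses recovered and the false detection rate. It is not MArVD2 code: the function name `evaluate_threshold`, the toy scores, and the reading of "false detection rate" as false positives divided by all positive calls are assumptions made for this example.

```python
def evaluate_threshold(scores, labels, threshold=0.8):
    """Return (recall, false_detection_rate) at a probability cutoff.

    scores: hypothetical classifier probabilities that a sequence is an archaeal virus.
    labels: True where the sequence really is an archaeal virus.
    Assumption: false detection rate = FP / (TP + FP).
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum((not p) and l for p, l in zip(predicted, labels))
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return recall, fdr


# Toy data, invented purely for illustration.
scores = [0.95, 0.85, 0.60, 0.90, 0.30, 0.10]
labels = [True, True, True, False, False, False]
recall, fdr = evaluate_threshold(scores, labels)
print(f"recall={recall:.2f}, false detection rate={fdr:.2f}")
```

Raising the threshold generally lowers the false detection rate at the cost of recall, which is the kind of trade-off a reported operating point such as 80% reflects.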
-
Abstract Background: Viruses are a significant player in many biosphere and human ecosystems, but most signals remain “hidden” in metagenomic/metatranscriptomic sequence datasets due to the lack of universal gene markers, database representatives, and insufficiently advanced identification tools. Results: Here, we introduce VirSorter2, a DNA and RNA virus identification tool that leverages genome-informed database advances across a collection of customized automatic classifiers to improve the accuracy and range of virus sequence detection. When benchmarked against genomes from both isolated and uncultivated viruses, VirSorter2 uniquely performed consistently with high accuracy (F1-score > 0.8) across viral diversity, while all other tools under-detected viruses outside of the group most represented in reference databases (i.e., those in the order Caudovirales). Among the tools evaluated, VirSorter2 was also uniquely able to minimize errors associated with atypical cellular sequences including eukaryotic genomes and plasmids. Finally, as the virosphere exploration unravels novel viral sequences, VirSorter2’s modular design makes it inherently able to expand to new types of viruses via the design of new classifiers to maintain maximal sensitivity and specificity. Conclusion: With multi-classifier and modular design, VirSorter2 demonstrates higher overall accuracy across major viral groups and will advance our knowledge of virus evolution, diversity, and virus-microbe interaction in various ecosystems. Source code of VirSorter2 is freely available (https://bitbucket.org/MAVERICLab/virsorter2), and VirSorter2 is also available both on bioconda and as an iVirus app on CyVerse (https://de.cyverse.org/de).
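For readers unfamiliar with the F1-score used as the accuracy measure above, here is a minimal Python sketch of the standard definition (the harmonic mean of precision and recall). The function name `f1_score` and the counts are hypothetical and are not taken from the VirSorter2 benchmark.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts.

    tp: viral sequences correctly called viral
    fp: non-viral sequences wrongly called viral
    fn: viral sequences that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented counts: 90 true positives, 10 false positives, 20 false negatives.
example_f1 = f1_score(90, 10, 20)
print(f"F1 = {example_f1:.3f}")
```

Because F1 penalizes both missed viruses and false calls, a tool that under-detects viruses from poorly represented groups scores low even when its precision is high.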
-
Abstract Microbes drive myriad ecosystem processes, but under strong influence from viruses. Because studying viruses in complex systems requires different tools than those for microbes, they remain underexplored. To combat this, we previously aggregated double-stranded DNA (dsDNA) virus analysis capabilities and resources into ‘iVirus’ on the CyVerse collaborative cyberinfrastructure. Here we substantially expand iVirus’s functionality and accessibility, to iVirus 2.0, as follows. First, core iVirus apps were integrated into the Department of Energy’s Systems Biology KnowledgeBase (KBase) to provide an additional analytical platform. Second, at CyVerse, 20 software tools (apps) were upgraded or added as new tools and capabilities. Third, nearly 20-fold more sequence reads were aggregated to capture new data and environments. Finally, documentation, as “live” protocols, was updated to maximize user interaction with and contribution to infrastructure development. Together, iVirus 2.0 serves as a uniquely central and accessible analytical platform for studying how viruses, particularly dsDNA viruses, impact diverse microbial ecosystems.
-
Background: Viruses influence global patterns of microbial diversity and nutrient cycles. Though viral metagenomics (viromics), specifically targeting dsDNA viruses, has been critical for revealing viral roles across diverse ecosystems, its analyses differ in many ways from those used for microbes. To date, viromics benchmarking has covered read pre-processing, assembly, relative abundance, read mapping thresholds and diversity estimation, but other steps would benefit from benchmarking and standardization. Here we use in silico-generated datasets and an extensive literature survey to evaluate and highlight how dataset composition (i.e., viromes vs bulk metagenomes) and assembly fragmentation impact (i) viral contig identification tools, (ii) virus taxonomic classification, and (iii) identification and curation of auxiliary metabolic genes (AMGs). Results: The in silico benchmarking of five commonly used virus identification tools shows that gene-content-based tools consistently performed well for long (≥3 kbp) contigs, while k-mer- and blast-based tools were uniquely able to detect viruses from short (≤3 kbp) contigs. Notably, however, the performance increase of k-mer- and blast-based tools for short contigs was obtained at the cost of increased false positives (sometimes up to ∼5% for virome and ∼75% for bulk samples), particularly when eukaryotic or mobile genetic element sequences were included in the test datasets. For viral classification, variously sized genome fragments were assessed using gene-sharing network analytics to quantify drop-offs in taxonomic assignments, which revealed correct assignations ranging from ∼95% (whole genomes) down to ∼80% (3 kbp sized genome fragments). A similar trend was also observed for other viral classification tools such as VPF-class, ViPTree and VIRIDIC, suggesting that caution is warranted when classifying short genome fragments and not full genomes. Finally, we highlight how fragmented assemblies can lead to erroneous identification of AMGs and outline a best-practices workflow to curate candidate AMGs in viral genomes assembled from metagenomes. Conclusion: Together, these benchmarking experiments and annotation guidelines should aid researchers seeking to best detect, classify, and characterize the myriad viruses ‘hidden’ in diverse sequence datasets.
-
Abstract Viruses play an important role in the ecology and biogeochemistry of marine ecosystems. Beyond mortality and gene transfer, viruses can reprogram microbial metabolism during infection by expressing auxiliary metabolic genes (AMGs) involved in photosynthesis, central carbon metabolism, and nutrient cycling. While previous studies have focused on AMG diversity in the sunlit and dark ocean, less is known about the role of viruses in shaping metabolic networks along redox gradients associated with marine oxygen minimum zones (OMZs). Here, we analyzed relatively quantitative viral metagenomic datasets that profiled the oxygen gradient across Eastern Tropical South Pacific (ETSP) OMZ waters, assessing whether OMZ viruses might impact nitrogen (N) cycling via AMGs. Identified viral genomes encoded six N-cycle AMGs associated with denitrification, nitrification, assimilatory nitrate reduction, and nitrite transport. The majority of these AMGs (80%) were identified in T4-like Myoviridae phages, predicted to infect Cyanobacteria and Proteobacteria, or in unclassified archaeal viruses predicted to infect Thaumarchaeota. Four AMGs were exclusive to anoxic waters and had distributions that paralleled homologous microbial genes. Together, these findings suggest viruses modulate N-cycling processes within the ETSP OMZ and may contribute to nitrogen loss throughout the global oceans, thus providing a baseline for their inclusion in ecosystem and geochemical models.
-
Abstract Microbial and viral communities transform the chemistry of Earth's ecosystems, yet the specific reactions catalyzed by these biological engines are hard to decode due to the absence of a scalable, metabolically resolved, annotation software. Here, we present DRAM (Distilled and Refined Annotation of Metabolism), a framework to translate the deluge of microbiome-based genomic information into a catalog of microbial traits. To demonstrate the applicability of DRAM across metabolically diverse genomes, we evaluated DRAM performance on a defined, in silico soil community and previously published human gut metagenomes. We show that DRAM accurately assigned microbial contributions to geochemical cycles and automated the partitioning of gut microbial carbohydrate metabolism at substrate levels. DRAM-v, the viral mode of DRAM, established rules to identify virally-encoded auxiliary metabolic genes (AMGs), resulting in the metabolic categorization of thousands of putative AMGs from soils and guts. Together DRAM and DRAM-v provide critical metabolic profiling capabilities that decipher mechanisms underpinning microbiome function.
-
Summary Oxygen minimum zones (OMZs) are critical to marine nitrogen cycling and global climate change. While OMZ microbial communities are relatively well‐studied, little is known about their viruses. Here, we assess the viral community ecology of 22 deeply sequenced viral metagenomes along a gradient of oxygenated to anoxic waters (<0.02 μmol/l O2) in the Eastern Tropical South Pacific (ETSP) OMZ. We identified 46 127 viral populations (≥5 kb), which augments the known viruses from ETSP by 10‐fold. Viral communities clustered into six groups that correspond to oceanographic features. Oxygen concentration was the predominant environmental feature driving viral community structure. Alpha and beta diversity of viral communities in the anoxic zone were lower than in surface waters, which parallels the low microbial diversity seen in other studies. ETSP viruses were largely endemic, with the majority of shared viruses (87%) also present in other OMZ samples. We detected 543 putative viral‐encoded auxiliary metabolic genes (AMGs), of which some have a distribution that reflects physico‐chemical characteristics across depth. Together these findings provide an ecological baseline for viral community structure, drivers and population variability in OMZs that will help future studies assess the role of viruses in these climate‐critical environments.