  1. Viruses are the most abundant and diverse biological entities on the planet and constitute a significant proportion of Earth’s genetic diversity. Most of this diversity is not represented by isolated viral-host systems and has only been observed through sequencing of viral metagenomes (viromes) from environmental samples. Viromes provide snapshots of viral genetic potential and a wealth of information on viral community ecology. These data also provide opportunities for exploring the biochemistry of novel viral enzymes. The in vitro biochemical characteristics of novel viral DNA polymerases were explored, testing hypothesized differences in polymerase biochemistry according to protein sequence phylogeny. Forty-eight viral DNA Polymerase I (PolA) proteins from estuarine viromes, hot spring metagenomes, and reference viruses, encompassing a broad representation of currently known diversity, were synthesized, expressed, and purified. Novel functionality was shown in multiple PolAs. Intriguingly, some of the estuarine viral polymerases demonstrated moderate to strong innate DNA strand displacement activity at high enzyme concentration. Strand-displacing polymerases have important technological applications where isothermal reactions are desirable. Bioinformatic investigation of genes neighboring these strand-displacing polymerases found associations with SNF2 helicase-associated proteins. The specific function of SNF2 family enzymes is unknown for prokaryotes and viruses. In eukaryotes, SNF2 enzymes have chromatin remodeling functions but do not separate nucleic acid strands. This suggests the strand separation function may be fulfilled by the DNA polymerase for viruses carrying SNF2 helicase-associated proteins. Biochemical data elucidated from this study expand understanding of the biology and ecological behavior of unknown viruses.
Moreover, given the numerous biotechnological applications of viral DNA polymerases, novel viral polymerases discovered within viromes may be a rich source of biological material for further in vitro DNA amplification advancements.
    Free, publicly-accessible full text available April 21, 2023
  2. Phylogenetic trees are an important analytical tool for evaluating community diversity and evolutionary history. In the case of microorganisms, the decreasing cost of sequencing has enabled researchers to generate ever-larger sequence datasets, which in turn have begun to fill gaps in the evolutionary history of microbial groups. However, phylogenetic analyses of these types of datasets create complex trees that can be challenging to interpret. Scientific inferences made by visual inspection of phylogenetic trees can be simplified and enhanced by customizing various parts of the tree. Yet, manual customization is time-consuming and error-prone, and programs designed to assist in batch tree customization often require programming experience or complicated file formats for annotation. Iroki, a user-friendly web interface for tree visualization, addresses these issues by providing automatic customization of large trees based on metadata contained in tab-separated text files. Iroki’s utility for exploring biological and ecological trends in sequencing data was demonstrated through a variety of microbial ecology applications in which trees with hundreds to thousands of leaf nodes were customized according to extensive collections of metadata. The Iroki web application and documentation are available at or through the VIROME portal . Iroki’s source code is released under the MIT license and is available at .
  3. ABSTRACT Viral infection exerts selection pressure on marine microbes, as virus-induced cell lysis causes 20 to 50% of cell mortality, resulting in fluxes of biomass into oceanic dissolved organic matter. Archaeal and bacterial populations can defend against viral infection using the clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) system, which relies on specific matching between a spacer sequence and a viral gene. If a CRISPR spacer match to any gene within a viral genome is equally effective in preventing lysis, no viral genes should be preferentially matched by CRISPR spacers. However, if there are differences in effectiveness, certain viral genes may demonstrate a greater frequency of CRISPR spacer matches. Indeed, homology search analyses of bacterioplankton CRISPR spacer sequences against virioplankton sequences revealed preferential matching of replication proteins, nucleic acid binding proteins, and viral structural proteins. Positive selection pressure for effective viral defense is one parsimonious explanation for these observations. CRISPR spacers from virioplankton metagenomes preferentially matched methyltransferase and phage integrase genes within virioplankton sequences. These virioplankton CRISPR spacers may assist infected host cells in defending against competing phage. Analyses also revealed that half of the spacer-matched viral genes were unknown, some genes matched several spacers, and some spacers matched multiple genes, a many-to-many relationship. Thus, CRISPR spacer matching may be an evolutionary algorithm, agnostically identifying those genes under stringent selection pressure for sustaining viral infection and lysis. Investigating this subset of viral genes could reveal those genetic mechanisms essential to virus-host interactions and provide new technologies for optimizing CRISPR defense in beneficial microbes.
IMPORTANCE The CRISPR-Cas system is one means by which bacterial and archaeal populations defend against viral infection, which causes 20 to 50% of cell mortality in the ocean. We tested the hypothesis that certain viral genes are preferentially targeted for the initial attack of the CRISPR-Cas system on a viral genome. Using CASC, a pipeline for CRISPR spacer discovery, and metagenome data from oceanic microbes and viruses, we found a clear subset of viral genes with high match frequencies to CRISPR spacers. Moreover, we observed a many-to-many relationship of spacers and viral genes. These high-match viral genes were involved in nucleotide metabolism, DNA methylation, and viral structure. It is possible that CRISPR spacer matching is an evolutionary algorithm pointing to those viral genes most important to sustaining infection and lysis. Studying these genes may advance the understanding of virus-host interactions in nature and provide new technologies for leveraging CRISPR-Cas systems in beneficial microbes.
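
The match-frequency and many-to-many observations described in the CRISPR abstract above can be illustrated with a small tally. This is only a sketch under stated assumptions, not the authors' pipeline and not CASC output: the input format (tuples of spacer ID, gene ID, and gene function), the function name `summarize_matches`, and the toy identifiers are all hypothetical and stand in for the results of a homology search.

```python
from collections import Counter, defaultdict


def summarize_matches(hits):
    """Summarize spacer-to-gene matches.

    `hits` is a hypothetical iterable of (spacer_id, gene_id, gene_function)
    tuples, one per spacer/gene match from some homology search.
    """
    genes_per_spacer = defaultdict(set)   # spacer -> genes it matches
    spacers_per_gene = defaultdict(set)   # gene -> spacers matching it
    function_of = {}                      # gene -> annotated function

    for spacer_id, gene_id, function in hits:
        genes_per_spacer[spacer_id].add(gene_id)
        spacers_per_gene[gene_id].add(spacer_id)
        function_of[gene_id] = function

    # Number of distinct matched genes in each functional category.
    function_counts = Counter(function_of[g] for g in spacers_per_gene)
    # Spacers matching more than one gene, and genes matched by more than
    # one spacer: together these show the many-to-many relationship.
    multi_gene_spacers = sorted(s for s, g in genes_per_spacer.items() if len(g) > 1)
    multi_spacer_genes = sorted(g for g, s in spacers_per_gene.items() if len(s) > 1)
    return function_counts, multi_gene_spacers, multi_spacer_genes


# Toy data (invented for illustration only).
hits = [
    ("sp1", "geneA", "replication"),
    ("sp2", "geneA", "replication"),
    ("sp2", "geneB", "structural"),
    ("sp3", "geneC", "unknown"),
]
function_counts, multi_gene_spacers, multi_spacer_genes = summarize_matches(hits)
```

With real data, comparing `function_counts` against the background distribution of gene functions in the viral sequences would be one way to ask whether some categories are matched more often than expected.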