Title: CRISPR Spacers Indicate Preferential Matching of Specific Virioplankton Genes
ABSTRACT Viral infection exerts selection pressure on marine microbes, as virus-induced cell lysis causes 20 to 50% of cell mortality, resulting in fluxes of biomass into oceanic dissolved organic matter. Archaeal and bacterial populations can defend against viral infection using the clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) system, which relies on specific matching between a spacer sequence and a viral gene. If a CRISPR spacer match to any gene within a viral genome is equally effective in preventing lysis, no viral genes should be preferentially matched by CRISPR spacers. However, if there are differences in effectiveness, certain viral genes may demonstrate a greater frequency of CRISPR spacer matches. Indeed, homology search analyses of bacterioplankton CRISPR spacer sequences against virioplankton sequences revealed preferential matching of replication proteins, nucleic acid binding proteins, and viral structural proteins. Positive selection pressure for effective viral defense is one parsimonious explanation for these observations. CRISPR spacers from virioplankton metagenomes preferentially matched methyltransferase and phage integrase genes within virioplankton sequences. These virioplankton CRISPR spacers may assist infected host cells in defending against competing phage. Analyses also revealed that half of the spacer-matched viral genes were unknown, some genes matched several spacers, and some spacers matched multiple genes, a many-to-many relationship. Thus, CRISPR spacer matching may be an evolutionary algorithm, agnostically identifying those genes under stringent selection pressure for sustaining viral infection and lysis. Investigating this subset of viral genes could reveal those genetic mechanisms essential to virus-host interactions and provide new technologies for optimizing CRISPR defense in beneficial microbes. 
IMPORTANCE The CRISPR-Cas system is one means by which bacterial and archaeal populations defend against viral infection, which causes 20 to 50% of cell mortality in the ocean. We tested the hypothesis that certain viral genes are preferentially targeted for the initial attack of the CRISPR-Cas system on a viral genome. Using CASC, a pipeline for CRISPR spacer discovery, and metagenome data from oceanic microbes and viruses, we found a clear subset of viral genes with high match frequencies to CRISPR spacers. Moreover, we observed a many-to-many relationship of spacers and viral genes. These high-match viral genes were involved in nucleotide metabolism, DNA methylation, and viral structure. It is possible that CRISPR spacer matching is an evolutionary algorithm pointing to those viral genes most important to sustaining infection and lysis. Studying these genes may advance the understanding of virus-host interactions in nature and provide new technologies for leveraging CRISPR-Cas systems in beneficial microbes.
Award ID(s):
1736030 1148118 1356374
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT Anti-CRISPR (Acr) loci/operons encode Acr proteins and Acr-associated (Aca) proteins. Forty-five Acr families have been experimentally characterized inhibiting seven subtypes of CRISPR-Cas systems. We have developed a bioinformatics pipeline to identify genomic loci containing Acr homologs and/or Aca homologs by combining three computational approaches: homology, guilt-by-association, and self-targeting spacers. Homology search found thousands of Acr homologs in bacterial and viral genomes, but most are homologous to AcrIIA7 and AcrIIA9. Investigating the gene neighborhood of these Acr homologs revealed that only a small percentage (23.0% in bacteria and 8.2% in viruses) of them have neighboring Aca homologs and thus form Acr-Aca operons. Surprisingly, although a self-targeting spacer is a strong indicator of the presence of Acr genes in a genome, a large percentage of Acr-Aca loci are found in bacterial genomes without self-targeting spacers or even without complete CRISPR-Cas systems. Additionally, for Acr homologs from genomes with self-targeting spacers, homology-based Acr family assignments do not always agree with the self-targeting CRISPR-Cas subtypes. Last, by investigating Acr genomic loci coexisting with self-targeting spacers in the same genomes, five known subtypes (I-C, I-E, I-F, II-A, and II-C) and five new subtypes (I-B, III-A, III-B, IV-A, and V-U4) of Acrs were inferred. Based on these findings, we conclude that the discovery of new anti-CRISPRs should not be restricted to genomes with self-targeting spacers and loci with Acr homologs. The evolutionary arms race of CRISPR-Cas systems and anti-CRISPR systems may have driven the adaptive and rapid gain and loss of these elements in closely related genomes. IMPORTANCE As a naturally occurring adaptive immune system, CRISPR-Cas (clustered regularly interspersed short palindromic repeats–CRISPR-associated genes) systems are widely found in bacteria and archaea to defend against viruses. 
    Since 2013, the application of various bacterial CRISPR-Cas systems has become very popular due to their development into targeted and programmable genome engineering tools with the ability to edit almost any genome. As the natural off-switch of CRISPR-Cas systems, anti-CRISPRs have a great potential to serve as regulators of CRISPR-Cas tools and enable safer and more controllable genome editing. This study will help understand the relative usefulness of the three bioinformatics approaches for new Acr discovery, as well as guide the future development of new bioinformatics tools to facilitate anti-CRISPR research. The thousands of Acr homologs and hundreds of new anti-CRISPR loci identified in this study will be a valuable data resource for genome engineers to search for new CRISPR-Cas regulators.
  2. Chia, Nicholas (Ed.)
    ABSTRACT A diversity of clustered regularly interspaced short palindromic repeat (CRISPR)-Cas systems provide adaptive immunity to bacteria and archaea through recording “memories” of past viral infections. Recently, many novel CRISPR-associated proteins have been discovered via computational studies, but those studies relied on biased and incomplete databases of assembled genomes. We avoided these biases and applied a network theory approach to search for novel CRISPR-associated genes by leveraging subtle ecological cooccurrence patterns identified from environmental metagenomes. We validated our method using existing annotations and discovered 32 novel CRISPR-associated gene families. These genes span a range of putative functions, with many potentially regulating the response to infection. IMPORTANCE Every branch on the tree of life, including microbial life, faces the threat of viral pathogens. Over the course of billions of years of coevolution, prokaryotes have evolved a great diversity of strategies to defend against viral infections. One of these is the CRISPR adaptive immune system, which allows microbes to “remember” past infections in order to better fight them in the future. There has been much interest among molecular biologists in CRISPR immunity because this system can be repurposed as a tool for precise genome editing. Recently, a number of comparative genomics approaches have been used to detect novel CRISPR-associated genes in databases of genomes with great success, potentially leading to the development of new genome-editing tools. Here, we developed novel methods to search for these distinct classes of genes directly in environmental samples (“metagenomes”), thus capturing a more complete picture of the natural diversity of CRISPR-associated genes. 
  3. Introduction Planktothrix agardhii is a microcystin-producing cyanobacterium found in Sandusky Bay, a shallow and turbid embayment of Lake Erie. Previous work in other systems has indicated that cyanophages are an important natural control factor of harmful algal blooms. Currently, there are few cyanophages that are known to infect P. agardhii, with the best-known being PaV-LD, a tail-less cyanophage isolated from Lake Donghu, China. Presented here is a molecular characterization of Planktothrix-specific cyanophages in Sandusky Bay. Methods and Results Putative Planktothrix-specific viral sequences from metagenomic data from the bay in 2013, 2018, and 2019 were identified by two approaches: homology to known phage PaV-LD, or through matching CRISPR spacer sequences with Planktothrix host genomes. Several contigs were identified as having viral signatures, either related to PaV-LD or potentially novel sequences. Transcriptomic data from 2015, 2018, and 2019 were also employed for the further identification of cyanophages, as well as gene expression of select viral sequences. Finally, viral quantification was tested using qPCR in 2015–2019 for PaV-LD-like cyanophages to identify the relationship between presence and gene expression of these cyanophages. Notably, while PaV-LD-like cyanophages were in high abundance over the course of multiple years (qPCR), transcriptomic analysis revealed only low levels of viral gene expression. Discussion This work aims to provide a broader understanding of Planktothrix cyanophage diversity with the goals of teasing apart the role of cyanophages in the control and regulation of harmful algal blooms and designing monitoring methodology for potential toxin-releasing lysis events.
  4. Hatfull, Graham F. (Ed.)
    ABSTRACT Bacteria and bacteriophages (phages) have evolved potent defense and counterdefense mechanisms that allowed their survival and greatest abundance on Earth. CRISPR (clustered regularly interspaced short palindromic repeat)-Cas (CRISPR-associated) is a bacterial defense system that inactivates the invading phage genome by introducing double-strand breaks at targeted sequences. While the mechanisms of CRISPR defense have been extensively investigated, the counterdefense mechanisms employed by phages are poorly understood. Here, we report a novel counterdefense mechanism by which phage T4 restores the genomes broken by CRISPR cleavages. Catalyzed by the phage-encoded recombinase UvsX, this mechanism pairs very short stretches of sequence identity (minihomology sites), as few as 3 or 4 nucleotides in the flanking regions of the cleaved site, allowing replication, repair, and stitching of genomic fragments. Consequently, a series of deletions are created at the targeted site, making the progeny genomes completely resistant to CRISPR attack. Our results demonstrate that this is a general mechanism operating against both type II (Cas9) and type V (Cas12a) CRISPR-Cas systems. These studies uncovered a new type of counterdefense mechanism evolved by T4 phage where subtle functional tuning of preexisting DNA metabolism leads to profound impact on phage survival. IMPORTANCE Bacteriophages (phages) are viruses that infect bacteria and use them as replication factories to assemble progeny phages. Bacteria have evolved powerful defense mechanisms to destroy the invading phages by severing their genomes soon after entry into cells. We discovered a counterdefense mechanism evolved by phage T4 to stitch back the broken genomes and restore viral infection. In this process, a small amount of genetic material is deleted or another mutation is introduced, making the phage resistant to future bacterial attack. 
    The mutant virus might also gain survival advantages against other restriction conditions or DNA damaging events. Thus, bacterial attack not only triggers counterdefenses but also provides opportunities to generate more fit phages. Such defense and counterdefense mechanisms over the millennia led to the extraordinary diversity and the greatest abundance of bacteriophages on Earth. Understanding these mechanisms will open new avenues for engineering recombinant phages for biomedical applications.
  5. Abstract

    Through infection and lysis of their coexisting bacterial hosts, viruses impact the biogeochemical cycles sustaining globally significant pelagic oceanic ecosystems. Currently, little is known of the ecological interactions between lytic viruses and their bacterial hosts underlying these biogeochemical impacts at ecosystem scales. This study focused on populations of lytic viruses carrying the B12-dependent Class II monomeric ribonucleotide reductase (RNR) gene, ribonucleotide-triphosphate reductase (Class II RTPR), documenting seasonal changes in pelagic virioplankton and bacterioplankton using amplicon sequences of Class II RTPR and the 16S rRNA gene, respectively. Amplicon sequence libraries were analyzed using compositional data analysis tools that account for the compositional nature of these data. Both virio- and bacterioplankton communities responded to environmental changes typically seen across seasonal cycles as well as shorter term upwelling–downwelling events. Defining Class II RTPR-carrying viral populations according to major phylogenetic clades proved a more robust means of exploring virioplankton ecology than operational taxonomic units defined by percent sequence homology. Virioplankton Class II RTPR populations showed positive associations with a broad phylogenetic diversity of bacterioplankton including dominant taxa within pelagic oceanic ecosystems such as Prochlorococcus and SAR11. Temporal changes in Class II RTPR virioplankton, occurring as both free viruses and within infected cells, indicated possible viral–host pairs undergoing sustained infection and lysis cycles throughout the seasonal study. Phylogenetic relationships inferred from Class II RTPR sequences mirrored ecological patterns in virio- and bacterioplankton populations demonstrating possible genome to phenome associations for an essential viral replication gene.
