
Title: Structural coordination between active sites of a CRISPR reverse transcriptase-integrase complex

CRISPR-Cas systems provide adaptive immunity in bacteria and archaea, beginning with integration of foreign sequences into the host CRISPR genomic locus and followed by transcription and maturation of CRISPR RNAs (crRNAs). In some CRISPR systems, a reverse transcriptase (RT) fusion to the Cas1 integrase and Cas6 maturase creates a single protein that enables concerted sequence integration and crRNA production. To elucidate how the RT-integrase organizes distinct enzymatic activities, we present the cryo-EM structure of a Cas6-RT-Cas1—Cas2 CRISPR integrase complex. The structure reveals a heterohexamer in which the RT directly contacts the integrase and maturase domains, suggesting functional coordination between all three active sites. Together with biochemical experiments, our data support a model of sequential enzymatic activities that enable CRISPR sequence acquisition from RNA and DNA substrates. These findings highlight an expanded capacity of some CRISPR systems to acquire diverse sequences that direct CRISPR-mediated interference.

Journal Name:
Nature Communications
Nature Publishing Group
Sponsoring Org:
National Science Foundation
More Like this
  1. CRISPR-Cas9 is an RNA-guided DNA endonuclease involved in bacterial adaptive immunity and widely repurposed for genome editing in human cells, animals and plants. In bacteria, RNA molecules that guide Cas9's activity derive from foreign DNA fragments that are captured and integrated into the host CRISPR genomic locus by the Cas1-Cas2 CRISPR integrase. How cells generate the specific lengths of DNA required for integrase capture is a central unanswered question of type II-A CRISPR-based adaptive immunity. Here, we show that an integrase supercomplex comprising guide RNA and the proteins Cas1, Cas2, Csn2 and Cas9 generates precisely trimmed 30-base pair DNA molecules required for genome integration. The HNH active site of Cas9 catalyzes exonucleolytic DNA trimming by a mechanism that is independent of the guide RNA sequence. These results show that Cas9 possesses a distinct catalytic capacity for generating immunological memory in prokaryotes.
  2. ABSTRACT Viral infection exerts selection pressure on marine microbes, as virus-induced cell lysis causes 20 to 50% of cell mortality, resulting in fluxes of biomass into oceanic dissolved organic matter. Archaeal and bacterial populations can defend against viral infection using the clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) system, which relies on specific matching between a spacer sequence and a viral gene. If a CRISPR spacer match to any gene within a viral genome is equally effective in preventing lysis, no viral genes should be preferentially matched by CRISPR spacers. However, if there are differences in effectiveness, certain viral genes may demonstrate a greater frequency of CRISPR spacer matches. Indeed, homology search analyses of bacterioplankton CRISPR spacer sequences against virioplankton sequences revealed preferential matching of replication proteins, nucleic acid binding proteins, and viral structural proteins. Positive selection pressure for effective viral defense is one parsimonious explanation for these observations. CRISPR spacers from virioplankton metagenomes preferentially matched methyltransferase and phage integrase genes within virioplankton sequences. These virioplankton CRISPR spacers may assist infected host cells in defending against competing phage. Analyses also revealed that half of the spacer-matched viral genes were unknown, some genes matched several spacers, and some spacers matched multiple genes, a many-to-many relationship. Thus, CRISPR spacer matching may be an evolutionary algorithm, agnostically identifying those genes under stringent selection pressure for sustaining viral infection and lysis. Investigating this subset of viral genes could reveal those genetic mechanisms essential to virus-host interactions and provide new technologies for optimizing CRISPR defense in beneficial microbes.
IMPORTANCE The CRISPR-Cas system is one means by which bacterial and archaeal populations defend against viral infection, which causes 20 to 50% of cell mortality in the ocean. We tested the hypothesis that certain viral genes are preferentially targeted for the initial attack of the CRISPR-Cas system on a viral genome. Using CASC, a pipeline for CRISPR spacer discovery, and metagenome data from oceanic microbes and viruses, we found a clear subset of viral genes with high match frequencies to CRISPR spacers. Moreover, we observed a many-to-many relationship of spacers and viral genes. These high-match viral genes were involved in nucleotide metabolism, DNA methylation, and viral structure. It is possible that CRISPR spacer matching is an evolutionary algorithm pointing to those viral genes most important to sustaining infection and lysis. Studying these genes may advance the understanding of virus-host interactions in nature and provide new technologies for leveraging CRISPR-Cas systems in beneficial microbes.
  3. Abstract Background

    CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) systems are adaptive immune systems commonly found in prokaryotes that provide sequence-specific defense against invading mobile genetic elements (MGEs). The memory of these immunological encounters is stored in CRISPR arrays, where spacer sequences record the identity and history of past invaders. Analyzing such CRISPR arrays provides insights into the dynamics of CRISPR-Cas systems and the adaptation of their host bacteria to rapidly changing environments such as the human gut.


    In this study, we utilized 601 publicly available Bacteroides fragilis genome isolates from 12 healthy individuals, 6 of which include longitudinal observations, and 222 available B. fragilis reference genomes to update the understanding of B. fragilis CRISPR-Cas dynamics and their differential activities. Analysis of longitudinal genomic data showed that some CRISPR array structures remained relatively stable over time whereas others involved radical spacer acquisition during some periods, and diverse CRISPR arrays (associated with multiple isolates) co-existed in the same individuals, with some persisting over time. Furthermore, features of CRISPR adaptation, evolution, and microdynamics were highlighted through an analysis of the host-MGE network, such as modules of multiple MGEs and hosts, reflecting complex interactions between B. fragilis and its invaders mediated through the CRISPR-Cas systems.


    We have made all annotated CRISPR-Cas systems, their target MGEs, and their interaction network available as a web resource. We anticipate it will become an important resource for studying B. fragilis, its CRISPR-Cas systems, and its interaction with mobile genetic elements, providing insights into evolutionary dynamics that may shape the species' virulence and lead to its pathogenicity.
  4. Abstract

    There is an increasing interest in the clustered regularly interspaced short palindromic repeats CRISPR-associated protein (CRISPR-Cas) system to reveal potential virus–host dynamics. The universal and most conserved Cas protein, cas1, is an ideal marker to elucidate CRISPR-Cas ecology. We constructed eight Hidden Markov Models (HMMs) and assembled cas1 directly from metagenomes with a targeted-gene assembler, Xander, to improve detection capacity and resolve the diverse CRISPR-Cas systems. The eight HMMs were first validated by recovering all 17 cas1 subtypes from the simulated metagenome generated from 91 prokaryotic genomes across 11 phyla. We challenged the targeted method with 48 metagenomes from a tallgrass prairie in Central Oklahoma, recovering 3394 cas1. Among those, 88 were near full length, 5 times more than in de-novo assemblies from the Oklahoma metagenomes. To validate the host assignment by cas1, the targeted-assembled cas1 was mapped to the de-novo assembled contigs. All the phylum assignments of those mapped contigs were assigned independent of CRISPR-Cas genes on the same contigs and were consistent with the host taxonomies predicted by the mapped cas1. We then investigated whether 8 years of soil warming altered cas1 prevalence within the communities. A shift in microbial abundances was observed during the year with the biggest temperature differential (mean 4.16 °C above ambient). cas1 prevalence increased over the next 3 years, even in the phyla with decreased microbial abundances, suggesting increasing virus–host interactions in response to soil warming. This targeted method provides an alternative means to effectively mine cas1 from metagenomes and uncover the host communities.
  5. Abstract

    Reverse transcriptases (RTs) are found in different systems including group II introns, Diversity Generating Retroelements (DGRs), retrons, CRISPR-Cas systems, and Abortive Infection (Abi) systems in prokaryotes. Different classes of RTs can play different roles, such as template switching and mobility in group II introns, spacer acquisition in CRISPR-Cas systems, mutagenic retrohoming in DGRs, programmed cell suicide in Abi systems, and recently discovered phage defense in retrons. While some classes of RTs have been studied extensively, others remain to be characterized. There is a lack of computational tools for identifying and characterizing various classes of RTs. In this study, we built a tool (called myRT) for identification and classification of prokaryotic RTs. In addition, our tool provides information about the genomic neighborhood of each RT, providing potential functional clues. We applied our tool to predict RTs in all complete and draft bacterial genomes, and created a collection that can be used for exploration of putative RTs and their associated protein domains. Application of myRT to metagenomes showed that gut metagenomes encode proportionally more RTs related to DGRs, outnumbering retron-related RTs, as compared to the collection of reference genomes. myRT is available both as standalone software and through a website.