Facilitation of DNA loop formation by protein–DNA non-specific interactions
Complex DNA topological structures, including polymer loops, are frequently observed in biological processes when protein molecules simultaneously bind to several distant sites on DNA. However, the molecular mechanisms by which these structures form remain poorly understood. Existing theoretical studies focus only on specific interactions between protein and DNA molecules at target sequences, but the electrostatic origin of primary protein–DNA interactions suggests that interactions of proteins with all DNA segments should be considered. Here we theoretically investigate the role of non-specific interactions between protein and DNA molecules in the dynamics of loop formation. Our approach is based on analyzing a discrete-state stochastic model via a method of first-passage probabilities, supplemented by Monte Carlo computer simulations. We find that, depending on the protein sliding length during the non-specific binding event, three different dynamic regimes of DNA loop formation might be observed. In addition, the loop formation time might be optimized by varying the protein sliding length, the size of the DNA molecule, and the position of the specific target sequences on DNA. Our results demonstrate the importance of non-specific protein–DNA interactions in the dynamics of DNA loop formation.
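The kind of calculation the abstract describes (a discrete-state stochastic model whose first-passage times are checked by Monte Carlo simulation) can be illustrated with a minimal sketch. This is not the authors' model: the lattice of `L` sites, the target position `m`, the rates `k_on`, `k_off`, and `u`, and the function names are all assumptions chosen for illustration. The free end binds non-specifically to a uniformly chosen site, slides to neighboring sites, or unbinds, and the loop counts as formed when it first reaches the target site.

```python
import random


def loop_formation_time(L, m, k_on, k_off, u, rng):
    """Simulate one trajectory and return the first-passage time to site m.

    Hypothetical toy model (not taken from the paper):
      - unbound state: binds to a uniformly random site 1..L, total rate k_on
      - bound at a non-target site: slides left or right with rate u each
        (reflecting ends), or unbinds with rate k_off
      - reaching site m ends the trajectory (loop formed)
    Requires k_on > 0 and u + k_off > 0.
    """
    t = 0.0
    site = None  # None means not bound non-specifically
    while True:
        if site is None:
            t += rng.expovariate(k_on)       # waiting time before binding
            site = rng.randrange(1, L + 1)   # uniform landing site
        if site == m:
            return t
        left = u if site > 1 else 0.0
        right = u if site < L else 0.0
        total = left + right + k_off
        t += rng.expovariate(total)          # waiting time before next move
        r = rng.random() * total             # pick a move in proportion to its rate
        if r < left:
            site -= 1
        elif r < left + right:
            site += 1
        else:
            site = None                      # dissociate from the DNA segment


def mean_loop_formation_time(L, m, k_on, k_off, u, n_runs=2000, seed=0):
    """Monte Carlo estimate of the mean first-passage (loop formation) time."""
    rng = random.Random(seed)
    total = sum(loop_formation_time(L, m, k_on, k_off, u, rng) for _ in range(n_runs))
    return total / n_runs


if __name__ == "__main__":
    # Scan the sliding rate u at fixed k_off; in this toy model a larger
    # u / k_off ratio means longer sliding excursions per binding event.
    for u in (0.0, 1.0, 10.0, 100.0):
        print(u, mean_loop_formation_time(L=100, m=50, k_on=1.0, k_off=1.0, u=u))
```

In this sketch, two limits can be checked by hand: with a single site the mean time is `1 / k_on`, and with no sliding (`u = 0`) it is `L / k_on + (L - 1) / k_off`, which gives a simple sanity test for the simulation.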
- Award ID(s): 1664218
- NSF-PAR ID: 10167971
- Date Published:
- Journal Name: Soft Matter
- Volume: 15
- Issue: 26
- ISSN: 1744-683X
- Page Range / eLocation ID: 5255 to 5263
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The CRISPR-associated protein 9 (Cas9) has been engineered as a precise gene editing tool to make double-strand breaks. Cas9 binds the folded guide RNA (gRNA), which serves as a binding scaffold and guides it to the target DNA duplex via a RecA-like strand-displacement mechanism, but without ATP binding or hydrolysis. The target search begins with the PAM-interacting domain recognizing the protospacer adjacent motif (PAM) at the major groove of the duplex and melting the downstream duplex, where an RNA-DNA heteroduplex is formed at nanomolar affinity. The rate-limiting step is the formation of an R-loop structure in which the HNH domain inserts between the target heteroduplex and the displaced non-target DNA strand. Once the R-loop structure is formed, the non-target strand is rapidly cleaved by RuvC and ejected from the active site. This event is immediately followed by cleavage of the target DNA strand by the HNH domain and product release. Within Cas9, the HNH domain is inserted into the RuvC domain near the RuvC active site via two linker loops that provide allosteric communication between the two active sites. Because these loops and active sites are highly flexible, biophysical techniques have been instrumental in characterizing the dynamics and mechanism of the Cas9 nucleases, aiding structural studies in the visualization of the complete active sites and relevant linker structures. Here, we review biochemical, structural, and biophysical studies on the underlying mechanism, with emphasis on how Cas9 selects the target DNA duplex and rejects non-target sequences.
-
The assembly of synaptic protein-DNA complexes by specialized proteins is critical for bringing together two distant sites within a DNA molecule or bridging two DNA molecules. The assembly of such synaptosomes is needed in numerous genetic processes requiring the interactions of two or more sites. The molecular mechanisms by which the protein brings the sites together, enabling the assembly of synaptosomes, remain unknown. Such proteins can utilize the sliding, jumping, and segmental transfer pathways proposed for the single-site search process, but none of these pathways explains how the synaptosome assembles. Here we used the restriction enzyme SfiI, which requires synaptosome assembly for DNA cleavage, as our experimental system and applied time-lapse, high-speed AFM to directly visualize the site search process accomplished by the SfiI enzyme. For the single-site SfiI-DNA complexes, we were able to directly visualize such pathways as sliding, jumping, and segmental site transfer. Within the synaptic looped complexes, however, we visualized threading and site-bound segment transfer as the synaptosome-specific search pathways for SfiI. In addition, we visualized sliding and jumping pathways for the loop-dissociated complexes. Based on our data, we propose a site-search model for synaptic protein-DNA systems.
-
The inhibition of protein–protein interactions is a growing strategy in drug development. In addition to structured regions, many protein loop regions are involved in protein–protein interactions and thus have been identified as potential drug targets. To effectively target such regions, knowledge of the protein structure is critical. Loop structure prediction is a challenging subproblem in the field of protein structure prediction because of the reduced level of conservation in loop sequences compared to the secondary structure elements. AlphaFold 2 has been suggested to be one of the greatest achievements in the field of protein structure prediction. AlphaFold 2 predicted protein structures at near X-ray resolution in the Critical Assessment of protein Structure Prediction (CASP 14) competition in 2020. The purpose of this work is to survey the performance of AlphaFold 2 in specifically predicting protein loop regions. We have constructed an independent dataset of 31,650 loop regions from 2613 proteins (deposited after AlphaFold 2 was trained) with both experimentally determined structures and AlphaFold 2 predicted structures. With extensive evaluation using our dataset, the results indicate that AlphaFold 2 is a good predictor of the structure of loop regions, especially for short loop regions. Loops less than 10 residues in length have an average Root Mean Square Deviation (RMSD) of 0.33 Å and an average Template Modeling score (TM-score) of 0.82. However, as the number of residues in a given loop increases, the accuracy of AlphaFold 2’s prediction decreases. Loops more than 20 residues in length have an average RMSD of 2.04 Å and an average TM-score of 0.55. Such a correlation between accuracy and length of the loop is directly linked to the increase in flexibility. Moreover, AlphaFold 2 does slightly over-predict α-helices and β-strands in proteins.
-
Transcription factor (TF) target search on the genome is essential for gene expression and regulation. High-resolution determination of TF diffusion along DNA remains technically challenging. Here, we constructed a TF model system using the plant WRKY domain protein in complex with DNA from crystallography and demonstrated microsecond diffusion dynamics of WRKY on DNA by employing all-atom molecular-dynamics (MD) simulations. Notably, we found that WRKY preferentially binds to one strand of DNA with significant energetic bias compared with the other, or nonpreferred, strand. The preferential DNA-strand binding becomes most prominent in the static process, from nonspecific to specific DNA binding, but less distinct during diffusive movements of the domain protein on the DNA. Remarkably, without employing acceleration forces or bias, we captured a complete one-base-pair stepping cycle of the protein tracking along the major groove of DNA with a homogeneous poly-adenosine sequence, as individual hydrogen bonds break and reform at the protein–DNA binding interface. Further DNA-groove tracking motions of the protein forward or backward, with occasional sliding as well as strand crossing to the minor groove of DNA, were also captured. The processive diffusion of WRKY along DNA has been further sampled via coarse-grained MD simulations. The study thus provides structural dynamics details on diffusion of a small TF domain protein, suggests how the protein approaches a specific recognition site on DNA, and supports further high-precision experimental detection. The stochastic movements revealed in the TF diffusion also provide general clues about how other protein walkers step and slide along DNA.