Abstract Integrases from the “large serine” family are simple, highly directional site-specific DNA recombinases that have great promise as synthetic biology and genome editing tools. Integrative recombination (mimicking phage or mobile element insertion) requires only integrase and two short (∼40–50 bp) DNA sites. The reverse reaction, excisive recombination, does not occur until it is triggered by the presence of a second protein termed a recombination directionality factor (RDF), which binds specifically to its cognate integrase. Identification of RDFs has been hampered by their lack of sequence conservation and lack of synteny with the phage integrase gene. Here we use AlphaFold2-multimer to identify putative RDFs for more than half of a test set of 98 large serine recombinases, and experimental methods to verify predicted RDFs for 6 of 9 integrases chosen as test cases. We find no universally conserved structural motifs among known and predicted RDFs, yet they are all predicted to bind a similar location on their cognate integrase, suggesting convergent evolution of function. Our methodology greatly expands the available genetic toolkit of cognate integrase–RDF pairs.
This content will become publicly available on December 1, 2025
Variable orthogonality of serine integrase interactions within the ϕC31 family
Abstract Serine integrases are phage- (or mobile element-) encoded enzymes that catalyse site-specific recombination reactions between a short DNA sequence on the phage genome (attP) and a corresponding host genome sequence (attB), thereby integrating the phage DNA into the host genome. Each integrase has its unique pair of attP and attB sites, a feature that allows them to be used as orthogonal tools for genome modification applications. In the presence of a second protein, the Recombination Directionality Factor (RDF), integrase catalyses the reverse excisive reaction, generating new recombination sites, attR and attL. In addition to promoting the attR x attL reaction, the RDF inhibits attP x attB recombination. This feature makes the directionality of integrase reactions programmable, allowing them to be useful for building synthetic biology devices. In this report, we describe the degree of orthogonality of both integrative and excisive reactions for three related integrases (ϕC31, ϕBT1, and TG1) and their RDFs. Among these, TG1 integrase is the most active, showing near complete recombination in both attP x attB and attR x attL reactions, and the most directional in the presence of its RDF. Our findings show that there is varying orthogonality among these three integrase–RDF pairs. ϕC31 integrase was the least selective, with all three RDFs activating it for attR x attL recombination. Similarly, ϕC31 RDF was the least effective among the three RDFs in promoting the excisive activities of the integrases, including its cognate ϕC31 integrase. ϕBT1 and TG1 RDFs were noticeably more effective than ϕC31 RDF at inhibiting attP x attB recombination by their respective integrases, making them more suitable for building reversible genetic switches. AlphaFold-Multimer predicts very similar structural interactions between each cognate integrase–RDF pair.
The binding surface on the RDF is much more conserved than the binding surface on the integrase, an indication that specificity is determined more by the integrase than the RDF. Overall, the observed weak integrase/RDF orthogonality across the three enzymes emphasizes the need for identifying and characterizing more integrase–RDF pairs. Additionally, the ability of a particular integrase’s preferred reaction direction to be controlled to varying degrees by non-cognate RDFs provides a path to tunable, non-binary genetic switches.
- Award ID(s):
- 2223480
- PAR ID:
- 10632877
- Publisher / Repository:
- Nature Portfolio
- Date Published:
- Journal Name:
- Scientific Reports
- Volume:
- 14
- Issue:
- 1
- ISSN:
- 2045-2322
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract Recombination directionality factors (RDFs) for large serine integrases (LSIs) are cofactor proteins that control the directionality of recombination to favour excision over insertion. Although RDFs are predicted to bind their cognate LSIs in similar ways, there is no overall common structural theme across LSI RDFs, leading to the suggestion that some of them may be moonlighting proteins with other primary functions. To test this hypothesis, we searched for characterized proteins with structures similar to the predicted structures of known RDFs. Our search shows that the RDFs for two LSIs, TG1 integrase and Bxb1 integrase, show high similarities to a single-stranded DNA binding (SSB) protein and an editing exonuclease, respectively. We present experimental data to show that Bxb1 RDF is probably an exonuclease and TG1 RDF is a functional SSB protein. We used mutational analysis to validate the integrase–RDF interface predicted by AlphaFold2-multimer for TG1 integrase and its RDF, and establish that control of recombination directionality is mediated via protein–protein interaction at the junction of the recombinase’s second DNA binding domain and the base of the coiled-coil domain.
-
Abstract Streptomyces phage ϕC31 integrase (Int), a large serine site-specific recombinase, is autonomous for phage integration (attP x attB recombination) but is dependent on the phage-encoded gp3, a recombination directionality factor (RDF), for prophage excision (attL x attR recombination). A previously described activating mutation, E449K, induces Int to perform attL x attR recombination in the absence of gp3, albeit with lower efficiency. E449K has no adverse effect on the competence of Int for attP x attB recombination. Int(E449K) resembles Int in gp3-mediated stimulation of attL x attR recombination and inhibition of attP x attB recombination. Using single-molecule analyses, we examined the mechanism by which E449K activates Int for gp3-independent attL x attR recombination. The contribution of E449K is both thermodynamic and kinetic. First, the mutation modulates the relative abundance of Int-bound attL-attR site complexes, favoring pre-synaptic (PS) complexes over non-productively bound complexes. Roughly half of the synaptic complexes formed from Int(E449K) pre-synaptic complexes are recombination competent. By contrast, Int yields only inactive synapses. Second, E449K accelerates the dissociation of non-productively bound complexes and inactive synaptic complexes formed by Int. The extra opportunities afforded to Int(E449K) in reattempting synapse formation enhance the probability of success at fruitful synapsis.
-
The relative velocities and positions of monodisperse high-inertia particle pairs in isotropic turbulence are studied using direct numerical simulations (DNS), as well as Langevin simulations (LS) based on a probability density function (PDF) kinetic model for pair relative motion. In a prior study (Rani et al., J. Fluid Mech., vol. 756, 2014, pp. 870–902), the authors developed a stochastic theory that involved deriving closures in the limit of high Stokes number for the diffusivity tensor in the PDF equation for monodisperse particle pairs. The diffusivity contained the time integral of the Eulerian two-time correlation of fluid relative velocities seen by pairs that are nearly stationary. The two-time correlation was analytically resolved through the approximation that the temporal change in the fluid relative velocities seen by a pair occurs principally due to the advection of smaller eddies past the pair by large-scale eddies. Accordingly, two diffusivity expressions were obtained based on whether the pair centre of mass remained fixed during flow time scales, or moved in response to integral-scale eddies. In the current study, a quantitative analysis of the (Rani et al. 2014) stochastic theory is performed through a comparison of the pair statistics obtained using LS with those from DNS. LS consist of evolving the Langevin equations for pair separation and relative velocity, which is statistically equivalent to solving the classical Fokker–Planck form of the pair PDF equation. Langevin simulations of particle-pair dispersion were performed using three closure forms of the diffusivity – i.e. the one containing the time integral of the Eulerian two-time correlation of the seen fluid relative velocities and the two analytical diffusivity expressions. In the first closure form, the two-time correlation was computed using DNS of forced isotropic turbulence laden with stationary particles.
The two analytical closure forms have the advantage that they can be evaluated using a model for the turbulence energy spectrum that closely matched the DNS spectrum. The three diffusivities are analysed to quantify the effects of the approximations made in deriving them. Pair relative-motion statistics obtained from the three sets of Langevin simulations are compared with the results from the DNS of (moving) particle-laden forced isotropic turbulence for $$St_{\eta}=10,20,40,80$$ and $$Re_{\lambda}=76,131$$. Here, $$St_{\eta}$$ is the particle Stokes number based on the Kolmogorov time scale and $$Re_{\lambda}$$ is the Taylor micro-scale Reynolds number. Statistics such as the radial distribution function (RDF), the variance and kurtosis of particle-pair relative velocities and the particle collision kernel were computed using both Langevin and DNS runs, and compared. The RDFs from the stochastic runs were in good agreement with those from the DNS. Also computed were the PDFs $$\Omega(U|r)$$ and $$\Omega(U_{r}|r)$$ of relative velocity $$U$$ and of the radial component of relative velocity $$U_{r}$$ respectively, both PDFs conditioned on separation $$r$$. The first closure form, involving the Eulerian two-time correlation of fluid relative velocities, showed the best agreement with the DNS results for the PDFs.
-
We consider the problem of answering temporal queries on RDF stores, in the presence of atemporal RDFS domain ontologies, of relational data sources that include temporal information, and of rules that map the domain information in the source schemas into the target ontology. Our proposed practice-oriented solution consists of two rule-based domain-independent algorithms. The first algorithm materializes target RDF data via a version of data exchange that enriches both the data and the ontology with temporal information from the relational sources. The second algorithm accepts as inputs temporal queries expressed in terms of the domain ontology using a lightweight temporal extension of SPARQL, and ensures successful evaluation of the queries on the materialized temporally-enriched RDF data. To study the quality of the information generated by the algorithms, we develop a general framework that formalizes the relational-to-RDF temporal data-exchange problem. The framework includes a chase formalism and a formal solution for the problem of answering temporal queries in the context of relational-to-RDF temporal data exchange. In this article, we present the algorithms and the formal framework that proves correctness of the information output by the algorithms, and also report on the algorithm implementation and experimental results for two application domains.
