<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Structure of a RecT/Redβ family recombinase in complex with a duplex intermediate of DNA annealing</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>12/01/2022</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10426474</idno>
					<idno type="doi">10.1038/s41467-022-35572-z</idno>
					<title level='j'>Nature Communications</title>
<idno>2041-1723</idno>
<biblScope unit="volume">13</biblScope>
<biblScope unit="issue">1</biblScope>					

					<author>Brian J. Caldwell</author><author>Andrew S. Norris</author><author>Caroline F. Karbowski</author><author>Alyssa M. Wiegand</author><author>Vicki H. Wysocki</author><author>Charles E. Bell</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Some bacteriophage encode a recombinase that catalyzes single-stranded DNA annealing (SSA). These proteins are apparently related to RAD52, the primary human SSA protein. The best studied protein, Redβ from bacteriophage λ, binds weakly to ssDNA, not at all to dsDNA, but tightly to a duplex intermediate of annealing formed when two complementary DNA strands are added to the protein sequentially. We used single particle cryo-electron microscopy (cryo-EM) to determine a 3.4Å structure of a Redβ homolog from a prophage of Listeria innocua in complex with two complementary 83-mer oligonucleotides. The structure reveals a helical protein filament bound to a DNA duplex that is highly extended and unwound. Native mass spectrometry confirms that the complex seen by cryo-EM is the predominant species in solution. The protein shares a common core fold with RAD52 and a similar mode of ssDNA-binding. These data provide insights into the mechanism of protein-catalyzed SSA.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Introduction</head><p>Bacteriophage with dsDNA genomes often encode a recombination system that consists of two proteins: a 5'-3' exonuclease for resecting DNA ends, and a recombinase for binding to the resulting 3'-overhang and annealing it to a complementary strand from a homologous duplex <ref type="bibr">(1)</ref>. The two proteins form a complex that is thought to load the annealing protein directly onto the 3'-overhang as it is formed by the exonuclease <ref type="bibr">(2,</ref><ref type="bibr">3)</ref>.</p><p>The benefit of these recombination systems for the phage has not been firmly established, but possible roles in replication (4), genome packaging <ref type="bibr">(1)</ref>, promoting genetic diversity <ref type="bibr">(5,</ref><ref type="bibr">6)</ref>, and CRISPR-evasion <ref type="bibr">(7)</ref> have been proposed. These systems are also found in bacterial genomes within cryptic or active prophage <ref type="bibr">(8)</ref>, and in mobile genetic elements such as integrating conjugative elements <ref type="bibr">(9)</ref> and conjugative plasmids <ref type="bibr">(7)</ref> where they can contribute to antibiotic resistance and genetic diversity <ref type="bibr">(10)</ref>. While their precise roles in biology are still being studied, the proteins of these recombination systems have been widely exploited in powerful methods for bacterial genome engineering known as recombineering and MAGE (multiplex automated genome engineering) <ref type="bibr">(11)</ref><ref type="bibr">(12)</ref><ref type="bibr">(13)</ref>.</p><p>The best studied of these recombination systems is the Red system from bacteriophage &#955;, for which the exonuclease and annealing proteins are &#955; exo and Red&#946;, respectively <ref type="bibr">(14)</ref>. 
&#955; exo (Mr 24.9 kDa) forms a ring-shaped homotrimer that binds to dsDNA ends and processively digests the 5'-strand to form a long 3'-overhang <ref type="bibr">(15)</ref><ref type="bibr">(16)</ref>.</p><p>Red&#946; is a 30 kDa monomer that binds to ssDNA and promotes the annealing of complementary strands <ref type="bibr">(17,</ref><ref type="bibr">18)</ref>. It binds weakly to ssDNA, not at all to pre-formed dsDNA, but tightly to a duplex intermediate of annealing formed when two complementary strands of DNA are added to the protein sequentially <ref type="bibr">(19)</ref>. Coupled to this, Red&#946; exhibits a dynamic oligomerization in forming rings (or split lock washers) on ssDNA, but helical filaments on annealed duplex <ref type="bibr">(20,</ref><ref type="bibr">21)</ref>.</p><p>Red&#946; belongs to a large group of proteins annotated as the RecT family based on the protein from the rac prophage of E. coli <ref type="bibr">(22)</ref>. The current Pfam database lists 1549 such sequences, predominantly from bacteriophage or prophage genomes, with zero structures <ref type="bibr">(23)</ref>. While this family of proteins was originally thought to be distinct from RAD52 <ref type="bibr">(24)</ref>, the primary SSA protein in human cells <ref type="bibr">(25)</ref>, more recent sequence comparisons suggest that they are in fact related <ref type="bibr">(21,</ref><ref type="bibr">26,</ref><ref type="bibr">27)</ref>. The structure of an 11-mer ring form of the DNA-binding domain of RAD52 has been determined without DNA <ref type="bibr">(28,</ref><ref type="bibr">29)</ref> and with a dT40 oligonucleotide to form a substrate complex <ref type="bibr">(30)</ref>. 
However, there is no structure of RAD52 with two complementary strands of ssDNA bound simultaneously, and its overall mechanism of annealing is still unknown.</p><p>Here, we have used single-particle cryo-EM to determine a 3.4&#197; structure of a homolog of &#955;-Red&#946; from the A118 prophage of Listeria innocua that we will refer to as LiRecT. The structure reveals a left-handed helical filament of the protein bound to an 83-mer duplex intermediate of DNA annealing. The filaments are similar to those seen previously for &#955;-Red&#946; at low resolution by electron microscopy and atomic force microscopy <ref type="bibr">(20,</ref><ref type="bibr">21)</ref>, but our structure now reveals the fold of the protein, the location of the DNA binding groove, the conformation of the DNA, and the details of the protein-DNA and inter-subunit interactions. The structure confirms the similarity to RAD52, and reveals a common core fold and shared mode of ssDNA-binding.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results</head><p>Architecture of the LiRecT-DNA Complex. The LiRecT protein was purified and found to bind to ssDNA and form a complex with annealed duplex in a manner similar to &#955;-Red&#946;, both in phosphate buffered saline (pH 7.4) and in a buffer that was previously used for negative stain EM of &#955;-Red&#946; (10 mM KH2PO4, 10 mM MgCl2, pH 6.0) <ref type="bibr">(20)</ref> (Supplementary Fig. <ref type="figure">1</ref>). For cryo-EM analysis in the latter buffer, a complex of LiRecT with duplex intermediate was formed by incubating the protein with two complementary 83-mer oligonucleotides that were added to the protein sequentially. The sequences of the oligonucleotides were derived from a naturally occurring sequence in M13 DNA described previously <ref type="bibr">(19,</ref><ref type="bibr">31)</ref>. The complex appeared as helical filaments of varying lengths, including some with end-on views (Fig. <ref type="figure">1a</ref>). Standard single-particle analysis without helical symmetry averaging in cryoSPARC yielded a 3.4&#197; reconstruction with fully interpretable density for the LiRecT subunits and the bound DNA at the central portion of the filament. The single particle workflow is shown in Supplementary Figs. <ref type="figure">2</ref> and <ref type="figure">3</ref> and the data collection and refinement statistics are shown in Supplementary Table <ref type="table">1</ref>.</p><p>In the complex, LiRecT assembles into a left-handed helical filament that is highly reminiscent of those seen previously for &#955;-Red&#946; <ref type="bibr">(20)</ref>. The filament has an open corkscrew-like shape with an inner diameter of 20 &#197;, an outer diameter of 100 &#197;, and a pitch of 105 &#197; with approximately 10 subunits per turn. 
The two complementary 83-mer strands are bound as a highly extended and unwound duplex to a deep, narrow, positively-charged groove that runs along the outer surface of the filament (Fig. <ref type="figure">2</ref>). One strand, which we call "inner" and color yellow, is bound to the deepest part of the groove with its nucleotide bases facing outwards. The other strand, which we call "outer" (orange), is bound to the outer portion of the groove with its bases facing inward to form normal Watson-Crick base pairs with the inner strand.</p><p>Each monomer of LiRecT binds to 5 bp of DNA. Based on this ratio, we would expect the filament to contain 16-17 subunits of LiRecT bound to the 83-mer duplex.</p><p>While we do see a filament of approximately this length in the 3D reconstruction, the density towards the ends of the filament gets progressively weaker (Fig. <ref type="figure">1c</ref>), presumably due to flexibility and/or imperfect alignment of the particles along the filament axis.</p><p>Consequently, we chose to refine a model that consists of just the 10 subunits of protein and 48 bp of DNA at the central portion of the filament (Fig. <ref type="figure">1d</ref>), for which the density is strongest. Due to the helical symmetry, however, this model likely encompasses all of the relevant protein-protein and protein-DNA interactions that exist in the full filament, except at the ends. In addition, although the resolution of the map was high enough to clearly see nucleotide bases (Supplementary Fig. 2f), purines and pyrimidines could not be distinguished, likely due to the imperfect alignment of the particles along the filament axis. The DNA has thus been modeled as dT48 for the outer strand, and dA48 for the inner strand, despite the fact that both strands contain a natural variation of all four nucleotides. Finally, based on the measured helical parameters of 10 subunits per turn, we would expect the filaments to contain approximately 1.5 turns. 
In the cryo-EM images, however, many of the filaments contain several turns (Fig. <ref type="figure">1a</ref>), suggesting that they can stack end-to-end. Single-particle analysis, however, converged on just a single 1.5-turn filament.</p><p>LiRecT Monomer Fold and Relation to RAD52. The structure reveals that LiRecT, and by extension the RecT/Red&#946; family of proteins, does indeed share structural similarity with RAD52 (Fig. <ref type="figure">3</ref>), as had been predicted <ref type="bibr">(21,</ref><ref type="bibr">26,</ref><ref type="bibr">27)</ref>. In a pairwise superposition using the DALI server <ref type="bibr">(32)</ref>, the two structures superimpose to an RMSD of 5.5 &#197; for 83 pairs of C&#945; atoms that share 14% sequence identity. The structural superposition is shown in Supplementary Fig. <ref type="figure">4</ref>, and the structure-based sequence alignment in Supplementary Fig. <ref type="figure">5</ref>. The common core covers 43% of the LiRecT structure of 191 amino acids, and 31% of the full-length LiRecT protein of 271 amino acids. Despite the common core identified in the pairwise superposition, RAD52 was not identified as a top hit in a DALI search of the Protein Data Bank for structural homologs <ref type="bibr">(32)</ref>, reflecting the high degree of structural difference. The common core fold consists of two central &#945;-helices (&#945;2 and &#945;3) that form the base of the DNA binding groove, combined with a beta hairpin (&#946;1-&#946;2) on one side and a three-stranded antiparallel beta sheet (&#946;3-&#946;5) on the other. In Fig. <ref type="figure">3</ref> we have numbered the common core secondary structural elements of LiRecT based on RAD52 and used letters for inserted elements, which are shown in green. The first insertion is an N-terminal 3-helix bundle (&#945;A, &#945;B, &#945;C) that sits at the upper rim of the filament and packs with neighboring copies of itself from the adjacent subunits (Fig. 
<ref type="figure">4</ref>).</p><p>The second is a &#946;-hairpin (&#946;A-&#946;B) inserted after &#946;3 that interacts with &#946;3 of the neighboring subunit at the lower rim of the filament (Fig. <ref type="figure">4</ref>). The third is a pair of &#945;-helices (&#945;D and &#945;E) inserted after &#946;5 that pack with &#945;3 to form the lower rim of the DNA-binding groove (Fig. <ref type="figure">4</ref>). Compared to RAD52, the &#946;3-&#946;5 sheet is shorter in LiRecT: the upper portions of &#946;3 and &#946;4 fold back onto the sheet to form the &#946;A-&#946;B insertion, and the upper portion of &#946;5 is replaced by the &#945;D-&#945;E helical hairpin.</p><p>The modeled portion of each LiRecT monomer consists of residues 34-224 of the 271 amino acid protein. The additional residues at the N- and C-terminal ends, which are presumably disordered relative to the main body of the filament, would project from the upper and inner surfaces of the filament, respectively (Fig. <ref type="figure">4a</ref>). Comparisons of the LiRecT structure to predicted structures of it and of &#955;-Red&#946; from RoseTTAFold are shown in Supplementary Fig. <ref type="figure">6</ref> <ref type="bibr">(33)</ref>. A structure-based sequence alignment of LiRecT and &#955;-Red&#946; is shown in Supplementary Fig. <ref type="figure">7</ref>.</p><p>Protein-DNA Interactions. The two strands of DNA are bound to a deep, narrow, positively charged groove that runs along the outer surface of the filament (Fig. <ref type="figure">2</ref>). The base of the groove is formed by &#945;2 and &#945;3 and its outer walls are formed by the &#946;1-&#946;2 hairpin on one side and the &#945;D-&#945;E insertion on the other (Figs. <ref type="figure">3</ref> and <ref type="figure">4</ref>). The inner strand is bound to the deepest part of the groove where it is contacted by the side chains of Y110 and K111 from &#945;2, and K206, R210, N211, and K215 from &#945;3 (Fig. 
<ref type="figure">3c</ref>). These residues form extensive interactions with the sugar phosphate backbone of the inner strand and hold it in an irregular conformation that is periodically kinked (Fig. <ref type="figure">4</ref>). By contrast, the outer strand makes far fewer interactions with the protein and adopts a smoother conformation that is held in place primarily by normal Watson-Crick base pair interactions with the inner strand. The few residues that do contact the outer strand are K101 at the tip of the &#946;1-&#946;2 hairpin, and K191 and F194 from the &#945;D-&#945;E insertion (Fig. <ref type="figure">3d</ref>). Although most of the contacts involve the sugar-phosphate backbone of each strand, the side chains of V98, Y100, and Q107 from the &#946;1-&#946;2 hairpin wedge into the base pairs at every 5th bp step to separate them (Fig. <ref type="figure">3e</ref>). Specifically, the phenyl ring of Y100 of each subunit stacks with the base of every 5th nucleotide of the outer strand, while the side chain of Q107 contacts the opposing base of the inner strand. These interactions introduce a dramatic kink in the backbone of the inner strand where the bases are separated (Fig. <ref type="figure">4</ref>). Many of the residues that contact the DNA, particularly those that contact the inner strand, are highly conserved among six distant homologs of LiRecT identified by PSI-BLAST (Supplementary Fig. <ref type="figure">8</ref>). This suggests that the structure has captured a functionally relevant state of the protein.</p><p>Although the two DNA strands mostly contact one another via normal Watson-Crick base pair interactions, the duplex is highly extended compared to B-form DNA and completely unwound (Fig. <ref type="figure">5</ref>). 
In concert with the 5 bp/monomer stoichiometry, the bases are stacked in a repeating pattern, with groups of 5 bp stacked with approximately 3.8 &#197; spacing, alternating with a larger 9 &#197; spacing where the &#946;1-&#946;2 insertion occurs (Fig. <ref type="figure">5c</ref>). Overall, the duplex is about 1.5 times as extended as B-form DNA and is completely unwound. The local base pair step parameters deviate significantly from B-form DNA in a regularly repeating manner every 5 bp (Supplementary Fig. <ref type="figure">9</ref>). This is largely due to the irregular and bent conformation of the inner strand.</p><p>Inter-subunit Packing. The LiRecT subunits pack in the filament with interactions that bury 1830 &#197;&#178; of total solvent accessible surface area. The interface largely consists of two separate hydrophobic cores, one formed by the N-terminal helix bundles on top of the DNA binding groove, and the other by the &#946;3-&#946;5 sheet and &#945;2-&#945;3 below the DNA binding groove (Fig. <ref type="figure">4</ref>). The upper core is formed by F41, V44, T76, and T83 from the left subunit (as viewed in Fig. <ref type="figure">4b</ref>), and F52, L53, L56, and L57 from the right subunit.</p><p>The lower core is formed by F171, W216, and I218 from the left subunit, and I114, L118, and I126 from the right subunit (Fig. <ref type="figure">4c</ref>). Both of these cores are surrounded by smaller sets of electrostatic interactions. At the upper rim, K40 and S77 of the left subunit form hydrogen bonds with D46 and N61 of the right subunit (Fig. <ref type="figure">4b</ref>). At the lower rim, E144 and R141 of the left subunit form hydrogen bonds with N127 and E135 of the right subunit (Fig. <ref type="figure">4c</ref>). Many of the residues involved in the inter-subunit packing are conserved in distant homologs of LiRecT (Supplementary Fig. 
<ref type="figure">8</ref>), suggesting that the subunit packing and overall filament structure are likely to be conserved.</p><p>Comparison to RAD52. The LiRecT structure permits a structure-based sequence alignment with RAD52 to identify the equivalent sets of residues used for interacting with DNA and neighboring subunits (Supplementary Fig. <ref type="figure">5</ref>). First and foremost, the inner strand in the complex with LiRecT closely overlays with the dT40 bound to the "inner" site of RAD52 (Fig. <ref type="figure">3</ref> and Supplementary Fig. <ref type="figure">4a</ref>). Both strands are bound to the same position deep at the base of their respective grooves, where they are contacted by equivalent sets of residues extending from &#945;3 (Supplementary Fig. <ref type="figure">10</ref>). Specifically, K206, R210, N211, and S214 from &#945;3 of LiRecT correspond precisely to T148, K152, R153, and R156 from &#945;3 of RAD52 (Supplementary Figs. <ref type="figure">10c</ref> and <ref type="figure">10f</ref>). The outer strand of LiRecT approximately overlays with the ssDNA bound to the outer site of RAD52, but the latter is bound in a helical conformation that is clearly not poised for annealing (Supplementary Figs. <ref type="figure">10b</ref>, <ref type="figure">10d</ref>, and <ref type="figure">10g</ref>). Both proteins use the conserved &#946;1-&#946;2 hairpin to wedge into the DNA strands, and V98 from &#946;1 of LiRecT is precisely equivalent to R55 from &#946;1 of RAD52. In LiRecT the &#946;1-&#946;2 hairpin separates the base pairs by 9&#197;, whereas in RAD52 it separates the inner strand bases by 11&#197; (Supplementary Figs. <ref type="figure">10e</ref> and <ref type="figure">10h</ref>). Although our structure of LiRecT captures the protein in a helical filament, and the structures of RAD52 reveal an 11-mer oligomeric ring, the two proteins use the same basic parts of their monomers for inter-subunit packing (Supplementary Fig. 
<ref type="figure">11</ref>), suggesting that the oligomers could be related. At the sequence level, the most conserved part of the LiRecT and RAD52 structures is the interface between &#945;2 and &#945;3, which in both proteins is integral to the binding of the inner strand and the inter-subunit packing interactions. Finally, while the stoichiometry of the RAD52-ssDNA complex is 4 nt/monomer, the complex of LiRecT with annealed duplex has 5 bp/monomer. Whether the two proteins have slightly different stoichiometries, or there is a change in stoichiometry when the second strand is incorporated, remains to be determined.</p></div>
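<div xmlns="http://www.tei-c.org/ns/1.0"><p>The filament geometry quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis. It takes the 83-bp duplex length, the 5 bp/monomer footprint, the approximate 10 subunits per turn, and the 3.8 &#197; and 9 &#197; base-pair spacings from the text. Two inputs are assumptions: a B-form rise of 3.4 &#197; per bp (a textbook value not stated here), and the reading that each 5-bp repeat contains four stacked steps plus one wide step at the hairpin wedge. All function and constant names are hypothetical.</p>
<p>
```python
# Back-of-the-envelope check of the filament geometry quoted in the text.
# Numeric inputs come from the Results, except B_FORM_RISE, which is an
# assumed textbook value that the paper does not state.

DUPLEX_LENGTH_BP = 83     # length of the annealed oligonucleotides
BP_PER_MONOMER = 5        # DNA footprint of one LiRecT subunit
SUBUNITS_PER_TURN = 10    # approximate helical parameter
STACKED_SPACING = 3.8     # angstrom, spacing within each 5-bp group
WEDGE_SPACING = 9.0       # angstrom, spacing at the hairpin wedge
B_FORM_RISE = 3.4         # angstrom per bp (assumption, not from the paper)


def expected_subunits(duplex_bp, bp_per_monomer):
    """Number of monomers needed to cover a duplex of the given length."""
    return duplex_bp / bp_per_monomer


def mean_rise_per_bp(bp_per_group, stacked, wedge):
    """Average base-pair spacing over one repeat of the stacking pattern.

    Assumes (bp_per_group - 1) stacked steps followed by one wide step at
    the wedge; this reading of the text is an assumption.
    """
    total = (bp_per_group - 1) * stacked + wedge
    return total / bp_per_group


subunits = expected_subunits(DUPLEX_LENGTH_BP, BP_PER_MONOMER)
turns = subunits / SUBUNITS_PER_TURN
rise = mean_rise_per_bp(BP_PER_MONOMER, STACKED_SPACING, WEDGE_SPACING)
extension = rise / B_FORM_RISE

print(f"subunits on an 83-mer duplex: {subunits:.1f}")
print(f"helical turns: {turns:.2f}")
print(f"mean spacing per bp: {rise:.2f} angstrom")
print(f"extension relative to B-form: {extension:.2f}x")
```
</p>
<p>Under these assumptions the numbers come out at about 16.6 subunits, about 1.7 turns, and an extension of roughly 1.4-fold, which is in line with the rounded values of 16-17 subunits, approximately 1.5 turns, and about 1.5 times the length of B-form DNA given in the text.</p></div>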
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Structure of LiRecT in Complex with ssDNA.</head><p>Prior work on &#955;-Red&#946; revealed that it binds ssDNA as oligomeric rings, and then forms helical filaments once a second complementary strand is added <ref type="bibr">(20)</ref>. To determine if there is a similar structural transition for LiRecT, we prepared a complex of it with just one 83-mer ssDNA and obtained a ~5&#197; resolution cryo-EM reconstruction by single particle analysis (Supplementary Figs. <ref type="figure">12</ref> and <ref type="figure">13</ref>). Surprisingly, the LiRecT-ssDNA complex also exists as left-handed helical filaments, instead of as rings, but they are not as well ordered, and they do not stack end-to-end. Using a monomer of LiRecT from the complex with annealed duplex, 8 subunits of LiRecT could be docked into the reconstruction for the central portion of the filament. Due to the lower resolution of this reconstruction, we could not fit the ssDNA to the map, although there is strong density for DNA in the groove above &#945;2 and &#945;3 (Supplementary Fig. <ref type="figure">12f</ref>). Moreover, cryo-EM images collected for LiRecT protein without DNA reveal smaller particles that are much less well ordered (Supplementary Fig. <ref type="figure">14</ref>), suggesting that the filament is assembled on the ssDNA (a full data set without DNA was not collected). Strikingly, the density for the portion of each LiRecT subunit at the upper rim of the filament is almost completely absent in the structure with ssDNA, for the entire length of the filament (Supplementary Fig. <ref type="figure">12</ref>). This upper N-terminal lobe of each monomer (N-lobe), which is formed by the &#945;A-&#945;C bundle and the &#946;1-&#946;2 hairpin (Fig. <ref type="figure">4a</ref>), would likely clamp down on the DNA once the second strand is incorporated, to form the additional protein-DNA and intersubunit interactions that are shown in Fig. 
<ref type="figure">4</ref> for the complex with annealed duplex. These interactions would further stabilize the filament complex to consolidate annealing. This provides a possible structural explanation for the dramatic increase in stability of the complex with two complementary strands that has been observed by gel-shift and single-molecule experiments for &#955;-Red&#946; <ref type="bibr">(19,</ref><ref type="bibr">34)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Analysis of LiRecT-DNA Complexes Formed in Solution.</head><p>To determine whether the complexes of LiRecT seen by cryo-EM also exist in solution, and in particular whether the predicted full-length complex with two 83-mers is formed (the ends of the filament being less well ordered in the reconstruction), mixtures of LiRecT protein alone, with 83-mer ssDNA, and with two complementary 83-mers added sequentially were analyzed by native mass spectrometry (nMS). Raw and deconvolved mass spectra for each sample are shown in Supplementary Figs. <ref type="bibr">15</ref> and 16, and a heat map summary of the oligomeric species formed for each protein-DNA mixture is shown in Fig. <ref type="figure">6</ref>. Source data used to generate Fig. <ref type="figure">6</ref> are shown in Supplementary Table <ref type="table">2</ref>. Free LiRecT protein was largely monomeric at low concentration (1 &#181;M), and while increasing the concentration to 30 &#181;M resulted in some oligomer formation (up to 9-mers), no distinct oligomeric species was converged upon (Fig. <ref type="figure">6</ref> and Supplementary Fig. <ref type="figure">15</ref>). This supports the conclusion from cryo-EM images that filament assembly requires ssDNA. Mixing of LiRecT with one 83-mer ssDNA resulted in two types of complexes, one with 7-10 LiRecT subunits and one copy of the 83-mer (green in Fig. <ref type="figure">6</ref>), and another with 15-17 subunits of LiRecT and two copies of the same 83-mer (blue in Fig. <ref type="figure">6</ref>). Based on our previous results for &#955;-Red&#946; (31), we interpret the smaller complexes (green) as initial LiRecT-ssDNA substrate complexes, and the larger complexes (blue) as attempts at annealing at sites of partial complementarity. 
By contrast, mixing of LiRecT with the two complementary 83-mers added sequentially resulted in a more dominant complex containing 17 or 18 copies of LiRecT and one copy each of the 83+ and 83- oligonucleotides (purple in Fig. <ref type="figure">6</ref>). The stoichiometry of the complex observed by nMS (83/17 or 83/18) is 4.9 or 4.6 bp/monomer, very close to the 5 bp/monomer observed for the cryo-EM structure.</p><p>Moreover, complexes of LiRecT formed on slightly shorter pairs of complementary oligonucleotides (80- and 75-mers) contained 1-2 fewer subunits, as expected for a continuous oligomerization process like that of a helical filament. By contrast, the complexes of LiRecT with just one ssDNA (green in Fig. <ref type="figure">6</ref>) did not get noticeably smaller on the shorter ssDNAs, suggesting a different type of oligomerization process for the ssDNA complex. Whether the cryo-EM structure of the 83-ssDNA complex shown in Supplementary Fig. <ref type="figure">12</ref> has captured the complexes with one copy of ssDNA seen by nMS (green in Fig. <ref type="figure">6</ref>) or the complexes with two copies of ssDNA (blue) is uncertain.</p><p>Based on their apparent length in the 2D class averages (Supplementary Fig. <ref type="figure">12b</ref>) it is likely to be the latter.</p><p>Mutational Analysis. To test the functional significance of the interactions observed in the structure, selected residues were mutated to alanine (or other amino acid types) and the effects on DNA binding and annealing were determined. A total of 42 mutations were targeted to 21 residues forming key interactions at four different regions of the structure (Fig. <ref type="figure">7</ref>): the inner strand (W96, Y110, K111, H185, K206, R210, N211, K215), the outer strand (K101, K191, F194), the &#946;1-&#946;2 hairpin that wedges into both strands (V98, Y100, Q107), and the subunit interface (L118, I126, F171, W216, I218). 
As controls, two of the 42 mutations (K157A, K180A) were introduced at surface exposed residues that are distant from the DNA binding groove and make no interactions in the structure. Three of the protein variants could not be purified, presumably because they disrupted folding and/or solubility. All of these were at the subunit interface (I126H, W216R, and L118A/F171A). The other 39 protein variants could be purified and concentrated (Supplementary Fig. <ref type="figure">17</ref>), consistent with their being properly folded and soluble.</p><p>A gel-shift assay (Fig. <ref type="figure">8</ref>) was used to test the ability of each variant to bind to 50-mer ssDNA (Cy3-50mer or Cy5-50mer, lanes labeled "3" or "5") and form the complex with duplex intermediate when the two complementary 50-mers are added sequentially (lanes labeled "35"). Although only one experiment for each variant is shown in Fig. <ref type="figure">8</ref>, all of the experiments were performed multiple times (at least twice), with very similar results. As expected, the two negative control mutations had little to no effect (Fig. <ref type="figure">8a</ref>, lanes next to WT). For the eight inner strand residues (Fig. <ref type="figure">8a</ref>), only one of the single alanine mutants (K111A) noticeably disrupted DNA binding. This may be due to the large network of contacts involved in binding the inner strand, such that truncation of only one interacting side chain has minimal effect. Therefore, three charge reversal mutations (K206E, R210E, K215E) and four double mutations (K206A/K215A, K111A/R210A, K111A/K215A, R210A/K215A) of the four positively charged residues were tested. Indeed, all of the double mutations, and one of the charge reversals (K206E) resulted in little to no detectable DNA binding under the conditions tested.</p><p>Mutations were also introduced at the three residues that contact the outer strand: K101, K191, and F194. 
The contacts formed with the outer strand are in general much less extensive and more distant than those formed with the inner strand (compare Fig. <ref type="figure">7b</ref> and Fig. <ref type="figure">7c</ref>), and this is reflected in the mutational analysis (Fig. <ref type="figure">8b</ref>). The mutations included alanine mutants (K101A, K191A, and F194A), charge reversals or insertions (K101E, K191E, F194E), and one double mutant (K191A/F194A). Only one of these mutations, K191A/F194A, slightly disrupted binding to the duplex intermediate (lane labeled "35"). Overall, the lack of strong effects of the outer strand mutations is consistent with the lack of strong interactions formed by these residues in the structure (Fig. <ref type="figure">7c</ref>), and with their general lack of conservation in distant RecT/Red&#946; homologs (Supplementary Fig. <ref type="figure">8</ref>).</p><p>Mutations were also introduced at three residues of the &#946;1-&#946;2 hairpin that wedge into the bases of the duplex to separate them: V98, Y100, and Q107 (Fig. <ref type="figure">7d</ref> and Fig. <ref type="figure">8c</ref>). Six single mutations (V98A, V98W, Y100A, Y100E, Q107A, and Q107H) had minimal effects, although V98A did have a noticeable reduction in binding to duplex intermediate (lanes labeled "35") as compared to ssDNA (lanes labeled "3" and "5").</p><p>Such behavior, a specific defect for binding duplex intermediate (with normal binding to ssDNA), would be expected for a wedge mutation. However, double (V98A/Y100A) and triple mutants (V98A/Y100A/Q107A) were purified and characterized, and had minimal effects. V98 is conserved as Val, Ile, or Leu in distant homologs (Supplementary Fig. <ref type="figure">8</ref>), and it may be that mutation to Ala or Trp does not alter the interaction enough to have a significant effect. Y100 on the other hand is conserved as Tyr or Phe and stacks against the outer strand base of every fifth nucleotide (Fig. 
<ref type="figure">7d</ref>). The mild effects of the Y100A mutation are thus surprising. Q107 makes a more subtle interaction with the inner strand base and is not conserved in distant homologs. Interestingly, a relatively close contact (3.5&#197;) is formed between the backbone amide of K101 at the tip of the &#946;1-&#946;2 hairpin and a backbone phosphate of the outer strand in this region of the structure (top of Fig. <ref type="figure">7d</ref>).</p><p>It may be that the &#946;1-&#946;2 hairpin secondary structure element as a whole, as opposed to specific side chain interactions, is important for the proposed function in clamping down on the duplex to consolidate annealing. If the outer strand is not drawn in fully via complementary base pair interactions, the duplex would likely be too wide to allow the &#946;1-&#946;2 hairpin to clamp down on it.</p><p>Finally, mutations were introduced to five apolar residues that make contacts at the lower portion of the inter-subunit interface: L118, I126, F171, W216, and I218 (Fig. <ref type="figure">7e</ref> and Fig. <ref type="figure">8d</ref>). As mentioned above, four of the mutations to the subunit interface disrupted folding and/or solubility, suggesting that this region of the structure is particularly sensitive to mutation (all of these were either charge insertions or double mutations). For those variants that could be purified, single mutations to alanine generally had minimal effects on DNA binding. However, introduction of a negative charge at the interface in the form of the I128E mutation disrupted DNA binding almost completely. A small amount of aggregate that stayed in the gel well was seen for the complex of I128E with duplex intermediate (lane labeled "35"). The W216A mutation was also combined with mutation of a neighboring residue that contacts the inner DNA strand in the K215A/W216A double mutant (Fig. <ref type="figure">7e</ref>). 
This disrupted DNA binding completely.</p><p>Collectively, the mutational analysis generally supports the interactions seen in the structure, but due to the large network of interactions involved, particularly at the inner strand and the subunit interface, stronger mutations such as charge reversals or double mutations are generally needed to disrupt DNA binding. Some of the mutations, in particular Y100E, K101E, and K191E, reduced the migration of the ssDNA complexes, or split them into two bands (K191A). While this could conceivably be due to the effects of the extra negative charge on electrophoresis, other charge reversal mutants (R210E, K215E) did not exhibit this behavior, and some mutations not affecting charge (L118A) did. Conceivably, the differences in mobility, which were most evident for the ssDNA complexes, could be related to the two different-sized complexes that were seen by nMS (green and blue in Fig. <ref type="figure">6</ref>).</p><p>Given the nature of the single-strand annealing reaction, where ssDNA-binding and annealing can be biochemically separated, it should in principle be possible to design mutations that specifically disrupt formation of the complex with duplex intermediate, without disrupting the complex with ssDNA substrate. Two of the mutants, K191A/F194A and V98A, show signs of this behavior, although additional mutations and more quantitative analysis will be needed to confirm this. Our subunit interface mutations were targeted to the core of the interface underneath the DNA-binding groove, at the C-terminal lobes of the LiRecT monomers, as this forms the bulk of the interface. Future experiments will target the N-terminal lobes above the DNA-binding groove, which are mobile in the complex with ssDNA but clamp down on the duplex when the complementary strand is bound.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Discussion</head><p>Using gel-shift assays with 33-mer and 83-mer oligonucleotides, Radding and colleagues discovered over 20 years ago that &#955;-Red&#946; exhibits unusual DNA binding properties: it binds weakly to ssDNA, not at all to pre-formed dsDNA, but tightly to a duplex intermediate of annealing formed when two complementary oligonucleotides are added to the protein sequentially <ref type="bibr">(19)</ref>. They referred to this complex as an "intermediate" of annealing, rather than as a "product", presumably because the DNA remained tightly bound to the protein. These experiments did not inform on the conformation of the bound DNA, and whether it was close to B-form or adopted some other conformation remained unknown. Our structure of LiRecT now reveals that the conformation is indeed quite distinct from B-form in being highly extended and completely unwound. Exactly where this conformation of DNA duplex lies along the energetic landscape of protein-mediated annealing (i.e. if it is a transition state or an intermediate), and whether or not it is a special conformation of DNA that is fundamental to annealing and common to all RecT/Red&#946; family members remains to be determined. Shortly after the unique DNA binding properties of &#955;-Red&#946; were discovered, oligomeric structures of &#955;-Red&#946; were visualized that closely paralleled the different DNA-bound states: rings for binding to ssDNA and helical filaments for binding to annealed duplex <ref type="bibr">(20)</ref>. The filaments of LiRecT that we have observed by cryo-EM closely match the filaments of &#955;-Red&#946; seen by negative stain EM: they are left-handed, and have similar dimensions and helical parameters. 
Given that LiRecT and &#955;-Red&#946; share limited sequence identity with one another (&lt;15%), the fact that they share a conserved helical filament structure would tend to suggest that the conformation of the duplex intermediate that is bound to them is also conserved.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Egelman and colleagues predicted that the duplex intermediate formed by &#955;-Red&#946; was likely to be bound along the inner surface of the helical filament (though not along the helical axis), based on the observation that it was protected from DNAse I cleavage <ref type="bibr">(19,</ref><ref type="bibr">20)</ref>. Based on data from atomic force microscopy and geometric considerations, Stewart and colleagues proposed an alternative model in which an extended and un-wound DNA duplex spirals around the surface of the protein filament to form a right-handed helix <ref type="bibr">(21)</ref>. The duplex intermediate bound to our structure of LiRecT is also fully un-wound, but binds to a groove that remains on the outer surface of the filament. The fact that the DNA is buried in such a deep and narrow groove, and that its conformation is far from B-form, may explain why it is protected from DNAse I cleavage.</p><p>We have so far not been able to visualize oligomeric rings of LiRecT bound to ssDNA, like the 11-mer rings seen for &#955;-Red&#946; <ref type="bibr">(20)</ref> and RAD52 <ref type="bibr">(28)</ref><ref type="bibr">(29)</ref><ref type="bibr">(30)</ref>. Our nMS data indicate that LiRecT exists in a monomer-oligomer equilibrium (up to 9-mer) in the absence of DNA, and as a complex of 7 to 10 subunits on a single 83-mer ssDNA.</p><p>Interestingly, in the complexes with a single 83-mer ssDNA, LiRecT does not appear to bind along the full length of the DNA, as it does for the complex with annealed duplex.</p><p>These observations are similar to our previous nMS analysis of &#955;-Red&#946; (31), although the latter protein had a higher propensity to form oligomers in the absence of DNA. 
We favor a model in which RecT/Red&#946; proteins oligomerize weakly and dynamically on their own, assemble onto ssDNA as clusters of cooperatively bound monomers to form partial rings or filaments, and form more stable helical filaments once the complementary strand is incorporated. The weaker complexes on ssDNA may allow for dynamic sampling with multiple strands of ssDNA until a complementary sequence is found and aligned, at which point the N-terminal lobe of each protein monomer likely clamps down on the duplex to stabilize the complex and consolidate annealing.</p><p>Filaments of both &#955;-Red&#946; and LiRecT can be several helical turns in length, but annealing assays with &#955;-Red&#946; indicate that the minimal length needed for successful annealing in vitro is only 20 bp <ref type="bibr">(21,</ref><ref type="bibr">31)</ref>. Moreover, oligonucleotides as short as 35-mers are routinely functional for Red&#946; annealing in vivo <ref type="bibr">(11)</ref>. Therefore, we consider it unlikely that long helical filaments of these proteins would form in vivo. Although the helical filament is highly stable, at least as compared to the ssDNA complexes <ref type="bibr">(19,</ref><ref type="bibr">34)</ref>, it would likely disassemble in vivo once the two DNA molecules are spliced together, possibly due to the greater torsional stress of being bound to the middle of a larger DNA duplex, as opposed to at the ends. An alternative possibility is that a DNA helicase or a component of the DNA replication machinery could be involved in removing the protein from the DNA in vivo.</p><p>While our LiRecT cryo-EM structure captures what appears to be an important intermediate of DNA annealing, cellular DNA annealing reactions likely involve interactions with partner proteins. &#955;-Red&#946; forms an interaction with &#955; exonuclease, which resects dsDNA ends to form the 3'-overhang <ref type="bibr">(35)</ref>. 
This interaction presumably loads the annealing protein directly onto the 3'-overhang as it is being formed, before it can fold into secondary structures. &#955;-Red&#946; also forms an interaction with the host single-stranded DNA binding protein (SSB) <ref type="bibr">(35)</ref>. This interaction presumably directs the initial &#955;-Red&#946;-ssDNA complex to the lagging strand of the replication fork, where it can pair with the complementary target site as it is exposed. Such coordinated interactions are likely to be shared by other RecT/Red&#946; family annealing proteins, including LiRecT.</p><p>Residues 1-33 and 225-271 of LiRecT were not resolved in our 3D reconstruction of the filament. These residues are, however, part of a RoseTTAFold model for the LiRecT monomer, as shown in magenta in Supplementary Fig. <ref type="figure">6b</ref>.</p><p>Residues 1-33 form two &#945;-helices, one that is quite long (residues 1-27) and extends away from the core of the monomer, and another that is short (residues 27-33) and forms a right angle with &#945;A. In our reconstruction, there is density for what appears to be a helix preceding &#945;A. Although the density for this helix was not clear enough to model, it appears to pack against &#945;A of the neighboring subunit, and thereby add to the inter-subunit contacts. There is no sign of density that would correspond to the long N-terminal &#945;-helix from the RoseTTAFold model.</p><p>By analogy with &#955;-Red&#946;, it is likely that the extra residues at the C-terminal end (225-271) fold into a small helical domain for forming interactions with partner proteins, including the host single-stranded DNA-binding protein (SSB) <ref type="bibr">(35)</ref>. In the RoseTTAFold model, residues 242-271 extend away from the filament to possibly form such a domain, but residues 220-238 form an &#945;-helix that packs against the &#946;1-&#946;2 hairpin and would overlap with the DNA if it were bound. 
The placement of this helix is not consistent with DNA binding, but it could conceivably adopt this position in the LiRecT monomers before they assemble onto the DNA. Further studies will be needed to resolve these issues.</p><p>The structure confirms that the RecT/Red&#946; family of annealing proteins shares a common core fold with RAD52. The two proteins use this fold to bind to the first ssDNA in similar ways, with equivalent sets of residues contacting the DNA from common secondary structural elements (&#945;2 and &#945;3). Moreover, the proteins use approximately the same portions of their monomers for inter-subunit packing, suggesting that their oligomers could be related. However, RAD52 has so far only been observed to form rings, and has not been seen to form helical filaments. RAD52 also exhibits somewhat different DNA-binding properties from &#955;-Red&#946; in binding with higher affinity to ssDNA and to pre-formed dsDNA <ref type="bibr">(36)</ref>. Furthermore, a distinct complex of RAD52 with a duplex intermediate of annealing like those of &#955;-Red&#946; and LiRecT has not yet been observed.</p><p>Nonetheless, the DNA binding grooves on the LiRecT and RAD52 structures are formed by a common set of secondary structural elements, and are similarly deep and narrow, suggesting that a complex of RAD52 with two strands of complementary DNA bound simultaneously could very well be formed. 
The existence of such a complex would favor a cis mechanism of annealing in which the two DNA strands are bound to the same protein oligomer as they are annealed to one another, as opposed to a trans mechanism in which annealing is mediated by interaction of two separate RAD52-ssDNA complexes.</p><p>Although human RAD52 has been widely considered to exist as stable oligomeric rings, yeast RAD52 is expressed at only nanomolar concentrations in vivo <ref type="bibr">(37)</ref>, and human RAD52 is largely monomeric at sub-micromolar concentrations in vitro <ref type="bibr">(38)</ref>. Thus, non-ring forms of RAD52 could still be relevant to its mechanism. Some features of the LiRecT-DNA complex are remarkably similar to those of other types of DNA recombination proteins. The 1.5x extended conformation of DNA and the 5 bp repeating pattern of extension are similar to the triplet-repeating conformation of DNA bound to E. coli RecA protein <ref type="bibr">(39)</ref>. The LiRecT-DNA complex also shares some remarkably similar features with a multi-subunit complex of E. coli Cascade bound to an RNA-DNA duplex hybrid <ref type="bibr">(40)</ref>. In Cascade, the duplex is bound to a very similar groove along the outer surface of a right-handed helical assembly of subunits. The duplex is similarly extended and un-wound, bound in a pattern that repeats every 6 bp due to a similar &#946;-hairpin insertion, and has the first strand added (RNA) in the deepest part of the groove and the second strand added (DNA) at the outer part of the groove. These similarities of LiRecT with functionally (but not structurally) related proteins point to fundamental principles of DNA transactions that are still being unraveled.</p><p>While our manuscript was in revision, the cryo-EM structure of an N-terminal fragment of &#955;-Red&#946; (residues 1-177) corresponding to its DNA-binding domain was reported in this journal <ref type="bibr">(41)</ref>. 
The complex was formed with complementary 27-mer oligonucleotides and adopted continuously stacked left-handed helical filaments. The filaments are much more loosely wound than the LiRecT filaments reported here, as there are 27 subunits per helical turn instead of 10. However, the dimensions of the outer DNA-binding groove and the conformation of the bound DNA duplex are very similar. The fact that two distantly related proteins bind to such a similar conformation of duplex intermediate supports the fundamental importance of the structures to understanding the mechanism of protein-mediated DNA annealing.</p></div>
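<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough consistency check on the extension of the bound duplex noted above, the average rise per base pair can be estimated from the spacings given in the figure legend (3.8 &#197; between base pairs, opened to 9 &#197; at every 5th step). The sketch below is illustrative only: the function name is hypothetical, and the B-form rise of 3.4 &#197; per bp is a textbook value assumed here, not a number taken from the structure.</p>
```python
# Illustrative sketch: average rise per bp for a duplex whose steps are
# evenly spaced except for one opened step in every repeat.
# Assumption: canonical B-form rise of 3.4 Angstrom per bp (textbook value).
B_FORM_RISE = 3.4


def mean_rise(normal_rise, opened_rise, repeat):
    """Mean rise per bp step when one step in every `repeat` is opened."""
    return ((repeat - 1) * normal_rise + opened_rise) / repeat


# Spacings quoted in the figure legend: 3.8 A, with every 5th step at 9 A.
rise = mean_rise(3.8, 9.0, 5)
extension = rise / B_FORM_RISE
print(f"mean rise: {rise:.2f} A, extension vs B-form: {extension:.2f}x")
```
<p>Under these assumptions the mean rise is a little under 5 &#197; per bp, which is broadly in line with the roughly 1.5x extension described in the text.</p></div>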
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Methods</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Materials. The vendors and catalog numbers for chemicals and other materials used in this study are shown in Supplementary Table <ref type="table">3</ref>. All oligonucleotides used in this study were purchased HPLC-purified from Integrated DNA Technologies, dissolved in ddH2O, and stored at -20&#176;C. Their full sequences are shown in Supplementary Table <ref type="table">4</ref>.</p><p>Protein expression and purification. The gene expressing LiRecT (UniProtKB -Q92FL9) was PCR amplified from Listeria innocua CLIP 11262 genomic DNA (ATCC BAA-680) and cloned into pET28b between the NdeI and BamHI restriction sites to express a protein with an N-terminal 6His-tag and a site for thrombin cleavage. The protein was expressed in BL21(AI) E. coli cells (Invitrogen) in 6 x 1L cultures at 37&#176;C, grown to an optical density at 600 nm of 0.65, and induced by 1 mM IPTG and 0.2% arabinose. At four hours post-induction, the cells were harvested by centrifugation, resuspended in 60 ml of Buffer A (50 mM NaH2PO4, 300 mM NaCl, 10 mM imidazole, pH 8.0) and frozen at -80&#176;C. After thawing, lysozyme (1 mg/ml), PMSF (0.1 mg/ml), leupeptin and pepstatin (1 &#956;g/ml each) were added and incubated for 60 min on ice. The cells were then sonicated on ice, centrifuged at 38,000 x g for 3 x 30 min, and the final supernatant was loaded onto a 2 x 5 ml HisTrap Fast Flow column (Cytiva) at 0.5 ml/min. The column was washed with 30 ml of Buffer A, 200 ml of Buffer A containing 30 mM imidazole, and eluted with a 200 ml gradient of 30-500 mM imidazole in Buffer A.</p><p>After SDS-PAGE analysis, pooled fractions were mixed with 100 units of Thrombin (Cytiva), dialyzed at room temperature into Buffer B (20 mM NaH2PO4, 1500 mM NaCl, pH 7.4), and loaded back onto the HisTrap FF column. 
The flow through was collected, dialyzed at 4&#176;C into Buffer C (20 mM Tris pH 8.0) and loaded onto a 2 x 5 ml HiTrap Q FF column (Cytiva) at 1 ml/min. After washing with 30 ml of Buffer C, the protein was eluted with a 100 ml gradient to Buffer C plus 1 M NaCl. Pooled fractions were dialyzed into Buffer D (20 mM Tris, 1 mM DTT, pH 8.0), concentrated to 50 mg/ml (Vivaspin 20, 10 kDa MWCO), and stored at -80&#176;C in 50 &#956;l aliquots. Protein concentration was determined from the absorbance at 280 nm using an extinction coefficient of 43,890 M&#8315;&#185; cm&#8315;&#185;, calculated from the amino acid sequence, which contains 5 tryptophan residues.</p></div>
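<div xmlns="http://www.tei-c.org/ns/1.0"><p>The concentration measurement described above follows the Beer-Lambert law. The sketch below shows one common way such a sequence-based extinction coefficient is obtained and applied. It is an illustration, not the authors' procedure: the function names are hypothetical, the per-residue coefficients are the standard values for sequence-based estimates at 280 nm, and the tyrosine count of 11 is an assumption chosen only because, together with the 5 tryptophans stated in the text, it reproduces the quoted 43,890 value.</p>
```python
# Illustrative sketch: sequence-based extinction coefficient and
# Beer-Lambert concentration from absorbance at 280 nm.
# Standard per-residue molar coefficients at 280 nm (M^-1 cm^-1).
EPS_TRP = 5500
EPS_TYR = 1490
EPS_CYSTINE = 125


def extinction_coefficient(n_trp, n_tyr, n_cystine=0):
    """Estimate the molar extinction coefficient at 280 nm from composition."""
    return n_trp * EPS_TRP + n_tyr * EPS_TYR + n_cystine * EPS_CYSTINE


def molar_concentration(a280, epsilon, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), corrected for dilution."""
    return a280 * dilution / (epsilon * path_cm)


# 5 Trp is stated in the text; 11 Tyr is an ASSUMED count (see lead-in).
eps = extinction_coefficient(n_trp=5, n_tyr=11)

# Hypothetical reading: A280 of 0.4389 in a 1 cm cuvette, undiluted.
conc = molar_concentration(0.4389, eps)
print(f"epsilon = {eps} M^-1 cm^-1, concentration = {conc * 1e6:.1f} uM")
```
<p>Converting the molar value to mg/ml would additionally require the molecular weight of the construct, which is not given in this section.</p></div>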
<div xmlns="http://www.tei-c.org/ns/1.0"><head>DNA binding assay.</head><p>A gel shift DNA binding assay used two complementary 50-mer oligonucleotides labeled at the 5'-end with either Cy3 or Cy5. The indicated concentration (5 or 3.6 &#956;M) of Red&#946; or LiRecT in PBS (or cryo-EM buffer defined below) was mixed with 25 &#956;M (nt) of the indicated oligonucleotide and incubated at 37&#176;C for 15 min. For some samples as indicated on the gel (lanes labeled "35", "ad" or "nc"), a 1.5 second blot time, and 0 blot force. Ted Pella 595 filter paper (product # 47000-100) was used for blotting.</p><p>Cryo-EM Data Acquisition. For the complex with 83-mer annealed duplex, images were collected on a 300 keV Titan Krios G3i electron microscope (Thermo Fisher Scientific) operating in nanoprobe EFTEM mode with 50 &#956;m C2 aperture, 100 &#956;m objective aperture, a Gatan BioContinuum energy filter (20 eV slit width, zero energy loss), a Cs corrector, and a Gatan K3 direct electron detector operating in counting mode. Automated data collection was performed in EPU with defocus values ranging from -1 to -3.5 &#181;m at a magnification of 81,000x and a pixel size of 0.899 &#197; (non-super-resolution). The dose rate was adjusted to 24.28 e&#8315;/&#197;&#178;/s with an exposure time of 2.7 s split into 36 fractions to achieve a total dose of 66 e&#8315;/&#197;&#178;. A total of 2038 movies were collected. For the complex with 83-mer ssDNA, the same settings were used, except for the following: the data were collected in super-resolution mode such that the pixel size was 0.4495 &#197;, the dose rate was adjusted to 22.80 e&#8315;/&#197;&#178;/s with an exposure time of 2.83 s split into 45 fractions for a total dose of 65 e&#8315;/&#197;&#178;, and 1619 movies were collected.</p><p>Cryo-EM Data Processing. For the data for the complex with annealed duplex, movies were imported into cryoSPARC v2.15.0 (42) for single particle analysis. 
Patch motion correction was implemented with a 3&#197; maximum alignment resolution and a B-factor of 500. Patch CTF estimation was implemented with an amplitude contrast of 0.1. From the motion- and CTF-corrected micrographs, approximately 1000 particles were manually picked and used for one round of 2D classification. Six 2D class averages representing different particle orientations were chosen and used as templates for automated particle picking, which resulted in approximately 1,100,000 particles. Particles were extracted with a box size of 252 &#197; and put through three rounds of 2D classification to result in 391,275 cleaned particles. The cleaned particles were used to generate three initial models with ab-initio reconstruction, the best of which (271K particles) was refined in homogeneous refinement to yield a 3D reconstruction with a gold-standard FSC resolution of 3.41 &#197; (tight mask), or 4.3&#197; (no mask). After polishing, the resulting 3D reconstruction showed clear density for protein backbone, side chains, and two strands of DNA including bases. The data for LiRecT with 83-mer ssDNA were processed in the same manner to result in 180,965 cleaned particles and a final resolution of 4.79&#197;. This resolution is likely overestimated, however, as the FSC curve was oscillating. The resulting map reveals clear secondary structure features but very few side chains.</p><p>Model Building and Refinement. For the complex with annealed duplex, the two unmasked half maps from cryoSPARC were input into the RESOLVE procedure for density modification in PHENIX version 1.20.1-4487 <ref type="bibr">(43)</ref>, which improved the resolution by 0.22&#197; from 3.81&#197; (FSCref=0.5) to 3.59&#197; (FSCref=0.5). 
A model of one protein monomer containing residues 34-224 (out of 271 total) was built into the central portion of the filament with COOT version 0.8.7 <ref type="bibr">(44)</ref>, and then transformed iteratively into density for nine neighboring subunits using CHIMERA version 1.13.1 <ref type="bibr">(45)</ref>. Additional subunits towards the ends of the filament were visible in the reconstruction, but not included in the final model, as the density for these regions was progressively weaker. The 3D reconstruction also showed clear density for 48 bp of DNA duplex at the central portion of the filament, which was also built using COOT. Once a 10-subunit filament was built, the NCS operators were determined from the structure using Find NCS in PHENIX, and then used for 10-fold NCS averaging in Resolve, which further increased the resolution  <ref type="table">S1</ref>. For the structure with 83-mer ssDNA, the resolution of the reconstruction did not enable the model to be built from scratch as very few side chains were visible, but six LiRecT subunits from the structure with annealed duplex could be auto-fit into density using CHIMERA, and additional subunits could be fit using PHENIX (dock_in_map). The density corresponding to the N-terminal lobes of each monomer (residues 34-109) was weak and these residues of each subunit were deleted from the model. The final model consisting of residues 110-221 of 8 LiRecT subunits was refined in PHENIX by rigid body refinement only. Structural figures were prepared using PyMOL version 2.5 <ref type="bibr">(46)</ref>. Atomic coordinates and maps have been deposited in PDB and EMDB under accession codes 7UB2 and EMD-26434 for the complex with 83-mer annealed duplex, and 7UBB and EMD-26437 for the complex with 83-mer ssDNA.</p><p>Native Mass Spectrometry. LiRecT protein was buffer exchanged into 100 mM ammonium acetate pH 7 (unadjusted) using Micro BioSpin P6 spin columns (Bio-Rad Laboratories, Hercules, CA, USA). 
All ssDNAs were dialyzed into 100 mM ammonium acetate with Pierce 96-well microdialysis devices with 3.5K MWCO (Thermo Fisher Scientific). For preparation of LiRecT-DNA complexes, LiRecT was diluted to the experimental concentrations indicated, and then the first ssDNA was added at the indicated concentration based on nucleotides (nt) per monomer of LiRecT, and incubated at 37&#176;C for at least 15 min. For complexes with annealed duplex, the second complementary ssDNA was then added and incubated for an additional 15 min. Samples (3 -5 &#956;l) were directly loaded into nanoESI emitters that were pulled in-house from borosilicate filament capillaries (OD 1.0 mm, ID 0.78 mm, Sutter Instrument) using a P-97 Flaming/Brown Micropipette Puller (Sutter Instrument). Experiments were performed on a Thermo Scientific Q Exactive Ultra-High Mass Range (UHMR) mass spectrometer from Thermo Fisher that was modified to allow for surface-induced dissociation (SID, not used in this work) similar to a previously described modification <ref type="bibr">(47)</ref>. The same instrument settings were used as described previously <ref type="bibr">(47)</ref>. Ion activation was necessary for improved transmission and de-adducting of ions to resolve species at higher m/z. For this, in-source trapping (IST) of -10 V and higher energy collisional dissociation (HCD) of 90 V was used for the LiRecT plus DNA mixtures. All data were deconvolved using UniDec V4.4 <ref type="bibr">(48)</ref>. A range of deconvolution settings was initially surveyed. The settings optimized for LiRecT plus DNA mixtures were the following: 2000 to 16000 m/z, charge range of 1 to 70, mass range of 10 to 800 kDa, sample mass every 10 Da, split Gaussian/Lorentzian, peak FWHM 3 or 4 Th, artifact suppression 40, charge smooth width 2.0, point smooth width 2, and native charge offset -20 to 10 or 20. The use of manual mode to assign a fraction of the peaks with charge states was needed to reduce artifacts. 
The resulting deconvolutions were plotted as relative signal intensities.</p><p>Mutational Analysis. Structure-guided mutations were introduced into the pET28b-LiRecT expression plasmid by the QuikChange&#8482; method (Agilent Technologies). The protein variants were expressed from BL21-AI cells and purified by a previously described small-batch version of the method described above <ref type="bibr">(49)</ref>. Briefly, cells from 50 ml cultures were re-suspended in 3.0 ml of Buffer A and frozen at -80&#176;C. Cell suspensions were thawed and incubated for 30 min on ice with 1 mg/mL lysozyme, 1 &#181;g/mL leupeptin, 1 &#181;g/mL pepstatin, and 1 mM PMSF. Cells were then sonicated using a micro-tip, clarified by centrifugation at 38,000 x g for 30 min, and 2.1 ml of the soluble supernatant was loaded onto a Qiagen Ni-spin column (Cat. # 31014) that had been prewet with 600 &#956;l of Buffer A. The columns were washed four times with 500 &#956;l of Buffer A containing 30 mM imidazole, and eluted four times with a total of 1.8 ml of Buffer A containing 500 mM imidazole (2 times with 200 &#956;l followed by two times with 700 &#956;l).</p><p>Pooled fractions (1.8 ml total) were buffer exchanged into Buffer B using PD-10 desalting columns (Cytiva, Cat. # 170851-01), concentrated to 1-8 mg/ml using an Amicon Ultra-4 centrifugal filter with 10 kDa MWCO (MilliporeSigma Cat. # UFC8010), and frozen in 50 &#956;l aliquots at -80&#176;C. The final purified proteins (Supplementary Fig. <ref type="figure">17</ref>) retain an extra 20 N-terminal amino acids from the expression vector, which had minimal if any effect on DNA binding. DNA binding assays for each variant were performed as described above, where the WT protein used for comparison was purified by the same small-batch method described for the variants.</p><p>a,b Monomers of LiRecT (a) and RAD52 (b) are shown in similar orientations with their common core folds in cyan and extraneous segments in green. 
The DNA backbones are drawn with the inner strand in yellow and the outer strand in orange (for LiRecT only). The DNA binding groove is formed by the 2 central &#945;-helices (&#945;2, &#945;3), the &#946;1-&#946;2 hairpin, and &#945;E-&#945;D (LiRecT) or &#946;3-&#946;5 (RAD52). RAD52 is drawn with coordinates from PDB accession ID 5XRZ (30). c,d,e Close-up views of LiRecT interactions with the inner strand (c), outer strand (d), and &#946;1-&#946;2 hairpin (e). Hydrogen bonds within 3.5&#197; and ion pairs within 6&#197; are shown as dotted lines.</p><p>oriented vertically. b Top view showing that the inner strand is always closer to the filament axis than the outer strand. c Close-up view of a 10 bp segment from the central portion of the filament. The base pairs are spaced by 3.8&#197; except at every 5th bp step where they are opened to 9&#197; by insertion of the &#946;1-&#946;2 hairpin. d 10 bp of B-form DNA drawn to scale for comparison. Coordinates of B-DNA are from PDB code 1BNA (50). Notice that the duplex intermediate from the LiRecT filament is highly extended and unwound, but still forms normal Watson-Crick base pairs, as indicated by the dotted lines.</p><p><ref type="table">2</ref>). The coloring corresponds to the DNA present in each complex: black to 0 ssDNA, green to 1 ssDNA, blue to two copies of the same ssDNA (2 ssDNA), and purple to one copy each of two complementary ssDNAs (dsDNA). Source data are provided in Supplementary Table <ref type="table">2</ref>.</p><p>strand (yellow) in panel B are much more extensive than those with the outer strand (orange) in panels C and D. In panel E the labels for the residues and secondary structures of the right subunit are underlined. </p></div></body>
		</text>
</TEI>
