<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Accessing isotopically labeled proteins containing genetically encoded phosphoserine for NMR with optimized expression conditions</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>12/01/2022</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10422554</idno>
					<idno type="doi">10.1016/j.jbc.2022.102613</idno>
					<title level='j'>Journal of Biological Chemistry</title>
<idno>0021-9258</idno>
<biblScope unit="volume">298</biblScope>
<biblScope unit="issue">12</biblScope>					

					<author>Cat Hoang Vesely</author><author>Patrick N. Reardon</author><author>Zhen Yu</author><author>Elisar Barbar</author><author>Ryan A. Mehl</author><author>Richard B. Cooley</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Edited by Wolfgang Peti. Phosphoserine (pSer) sites are primarily located within disordered protein regions, making it difficult to experimentally ascertain their effects on protein structure and function. Therefore, the production of 15 N-(and 13 C)-labeled proteins with site-specifically encoded pSer for NMR studies is essential to uncover molecular mechanisms of protein regulation by phosphorylation. While genetic code expansion technologies for the translational installation of pSer in Escherichia coli are well established and offer a powerful strategy to produce site-specifically phosphorylated proteins, methodologies to adapt them to minimal or isotope-enriched media have not been described. This shortcoming exists because pSer genetic code expansion expression hosts require the genomic ΔserB mutation, which increases pSer bioavailability but also imposes serine auxotrophy, preventing growth in minimal media used for isotopic labeling of recombinant proteins. Here, by testing different media supplements, we restored normal BL21(DE3) ΔserB growth in labeling media but subsequently observed an increase of phosphatase activity and mis-incorporation not typically seen in standard rich media. After rounds of optimization and adaptation of a high-density culture protocol, we were able to obtain ≥10 mg/L homogenously labeled, phosphorylated superfolder GFP. To demonstrate the utility of this method, we also produced the intrinsically disordered serine/arginine-rich region of the SARS-CoV-2 Nucleocapsid protein labeled with 15 N and pSer at the key site S188 and observed the resulting peak shift due to phosphorylation by 2D and 3D heteronuclear single quantum correlation analyses. We propose this cost-effective methodology will pave the way for more routine access to pSer-enriched proteins for 2D and 3D NMR analyses.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Phosphorylation, the most common type of post-translational modification, is an essential protein regulatory mechanism in eukaryotic cells <ref type="bibr">(1)</ref><ref type="bibr">(2)</ref><ref type="bibr">(3)</ref>. Dysregulation of phosphorylation-dependent signaling systems is linked to numerous disease pathologies, as these post-translational modifications play key roles in cellular processes such as protein synthesis, signal transduction, and cell development <ref type="bibr">(4)</ref><ref type="bibr">(5)</ref><ref type="bibr">(6)</ref>. More than two-thirds of proteins in the human proteome undergo reversible phosphorylation and, of these, nearly 80% were identified as phosphorylated at serine residues <ref type="bibr">(7)</ref>. Most phosphoserine (pSer) sites are located within flexible or disordered regions of proteins <ref type="bibr">(8,</ref><ref type="bibr">9)</ref>, where they serve as regulatory switches to modulate conformational dynamics, function, and allosteric interactions <ref type="bibr">(10)</ref><ref type="bibr">(11)</ref><ref type="bibr">(12)</ref><ref type="bibr">(13)</ref>. NMR spectroscopy is the ideal technique for probing the dynamics and structure of intrinsically disordered regions (IDRs)/intrinsically disordered proteins at the molecular level <ref type="bibr">(14)</ref>. Despite the ubiquitous nature of phosphorylation, few NMR studies focus on understanding the molecular consequences of IDR/intrinsically disordered protein phosphorylation. This is, in large part, caused by a lack of standardized and routine methods for synthesizing isotopically labeled, site-specifically phosphorylated proteins for NMR characterization.</p><p>The study of phospho-proteins by NMR requires that they be isotopically enriched with 15 N and/or 13 C and be homogenously phosphorylated at the targeted site(s). 
Standard expression strains of Escherichia coli (e.g., BL21(DE3)) can biosynthesize all 20 natural amino acids from fundamental carbon and nitrogen building blocks so that protein expression in fully-defined minimal media containing 13 C-sugars and 15 NH 4 Cl provides a convenient strategy to produce isotopically labeled proteins. Site-specific phosphorylation of target proteins, on the other hand, continues to pose a challenge because the required kinases are not always known, may lack the required specificity, and may not be easily isolated in a functional state for in vitro use <ref type="bibr">(14)</ref><ref type="bibr">(15)</ref><ref type="bibr">(16)</ref>. Phosphomimetic mutations (Ser or Thr to Asp/Glu) are commonly introduced at the site of phosphorylation to overcome this issue; however, Asp and Glu do not faithfully recapitulate either the geometry or the charge density <ref type="bibr">(17)</ref><ref type="bibr">(18)</ref><ref type="bibr">(19)</ref>, and so they commonly misinform on the functional effects of authentic phosphorylation <ref type="bibr">(20)</ref><ref type="bibr">(21)</ref><ref type="bibr">(22)</ref><ref type="bibr">(23)</ref><ref type="bibr">(24)</ref><ref type="bibr">(25)</ref>.</p><p>Genetic code expansion (GCE) has emerged as a leading technology for the production of phosphorylated proteins because it allows site-specific, homogenous, and efficient translational incorporation of phosphorylated amino acids into any protein (Fig. <ref type="figure">1A</ref>) <ref type="bibr">(26)</ref><ref type="bibr">(27)</ref><ref type="bibr">(28)</ref><ref type="bibr">(29)</ref>. In 2015, a high-efficiency GCE system was developed to translationally install pSer in response to an amber (TAG) stop codon, yet to date, this system has not been adopted to produce isotopically labeled phosphorylated proteins <ref type="bibr">(26)</ref>. 
Reasons for this have not been articulated in the literature, but we hypothesized they were rooted in the fact that pSer GCE expression systems utilize serine auxotrophic strains for phospho-protein expression, which means they cannot grow in minimal media <ref type="bibr">(26,</ref><ref type="bibr">27,</ref><ref type="bibr">29,</ref><ref type="bibr">30)</ref>. Serine auxotroph expression hosts are used because charged, phosphorylated amino acids like pSer do not traverse from the media into the cell effectively, resulting in low bioavailability of the free phospho-amino acid <ref type="bibr">(26,</ref><ref type="bibr">27,</ref><ref type="bibr">(29)</ref><ref type="bibr">(30)</ref><ref type="bibr">(31)</ref>. By deleting the serB gene, whose product hydrolyzes pSer to produce serine as the last step of serine biosynthesis, pSer accumulates inside the cell, providing a sufficient pool of pSer to feed the GCE machinery (31) (Fig. <ref type="figure">1A</ref>). Serine supplementation should overcome this issue of auxotrophy, though isotopically labeled serine is expensive and serine feedback inhibition of SerA shuts down production of pSer (Fig. <ref type="figure">1A</ref>) <ref type="bibr">(32)</ref><ref type="bibr">(33)</ref><ref type="bibr">(34)</ref>. Intracellular levels of pSer in WT E. coli strains, which have an intact SerB, are remarkably low <ref type="bibr">(31)</ref>, and even if pSer media supplementation could overcome this issue, isotopically labeled pSer is not commercially available to our knowledge.</p><p>What is needed, therefore, is an expression methodology that merges existing pSer GCE systems with methods for expressing isotopically labeled proteins. Here, we overcome these challenges to formulate an efficient and low-cost expression strategy that is sufficient for production of 15 N-labeled proteins with site-specific pSer incorporated using serine auxotroph BL21(DE3) &#916;serB as the expression host. 
This method can be easily adapted for 13 C labeling and should accelerate access to phosphorylated proteins for NMR structural biology projects.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Optimization of BL21(DE3) &#916;serB growth in labeling media</head><p>We use here the E. coli expression host BL21(DE3) &#916;serB, in which the pSer phosphatase gene serB is deleted to increase the intracellular levels of pSer <ref type="bibr">(31)</ref> and thus improve pSer incorporation into recombinant isotopically labeled protein <ref type="bibr">(26,</ref><ref type="bibr">28)</ref>. Our first goal was to compare the growth rates of the BL21(DE3) WT and BL21(DE3) &#916;serB hosts in unlabeled minimal media without antibiotics and to identify key supplements required for optimal BL21(DE3) &#916;serB growth (Fig. <ref type="figure">1B</ref>). We tracked optical density (OD) at 600 nm (OD 600 ) over 24 h and confirmed that WT BL21(DE3) cells grew robustly in the minimal media, while BL21(DE3) &#916;serB would not grow unless supplemented with 2 mM serine, 0.2% (w/v) Celtone base powder, or both (Fig. <ref type="figure">1B</ref>). Celtone base powder (referred to hereafter as Celtone) is an algal hydrolysate containing a mix of amino acids and is used here because it is available with 13 C and/or 15 N enrichment. Interestingly, serine supplementation alone did not fully rescue the growth phenotype of BL21(DE3) &#916;serB to that of WT BL21(DE3), as indicated by an impaired growth rate and a low final cell density (final OD 600 of 1 versus 5, respectively). On the other hand, the addition of Celtone alone or in combination with serine gave growth rates and final cell densities comparable to those of BL21(DE3) WT. With these data, we reasoned that minimal media supplemented with either both 2 mM serine and 0.2% (w/v) Celtone, or Celtone alone, could be used for phospho-protein production with further expression optimization.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Optimization of expression conditions for phosphorylated protein production</head><p>Having identified media for robust BL21(DE3) &#916;serB growth, we set out to evaluate pSer protein expression using the high-efficiency GCE system created by Chin et al. <ref type="bibr">(26)</ref>. This system utilizes the pKW2-EFSep plasmid, which expresses the pSer tRNA synthetase variant SepRS-2, the amber codon suppression tRNA Sep-tRNA CUA (B4), and EF-Sep, an EFTu variant enabling efficient delivery of pSer-aminoacylated tRNA to the ribosome. To easily evaluate phospho-protein expression parameters, we expressed the fluorescent reporter superfolder GFP (sfGFP) containing an amber TAG codon at position N150 (sfGFP-150TAG) from the pRBC plasmid <ref type="bibr">(28)</ref>. Previously, we found that rich auto-induction media (called ZY-AIM, Table <ref type="table">1</ref>) provided maximal levels of homogenously phosphorylated protein with this pSer GCE system <ref type="bibr">(28)</ref>, and so here, we first tested auto-induction expression strategies with Celtone- and serine-supplemented media amenable to isotopic labeling (Minimal AIM, Table <ref type="table">2</ref>, Method 1 in Fig. <ref type="figure">2</ref>). In parallel, we also tested manual induction expression strategies in a minimal media similarly supplemented with Celtone and serine (MIM-1, Table <ref type="table">2</ref>, Method 2 in Fig. <ref type="figure">2</ref>). For these expressions, starter cultures were grown overnight (16-18 h) in a rich non-inducing media (ZY-NIM, Table <ref type="table">1</ref>). Cells were then pelleted and resuspended in their respective minimal expression media. Protein expression yields were quantified by whole-cell fluorescence. The homogeneity of phosphorylation was assessed by Phos-tag gel electrophoresis of purified sfGFP proteins. 
With Phos-tag gels, phosphorylated proteins migrate slower than nonphosphorylated proteins, permitting near-quantitative evaluation of protein phosphorylation status <ref type="bibr">(35)</ref>. For comparison, phosphorylated sfGFP was also expressed in rich ZY-AIM <ref type="bibr">(28)</ref>.</p><p>All expression cultures grown at 37 &#176;C grew to a similar OD 600 of 4 to 5, with total cell fluorescence about 30% higher in the auto-induction expression, corresponding to 1 to 2 mg of protein per liter culture (Fig. <ref type="figure">3A</ref>). However, only 40% of purified sfGFP expressed from auto-induction and 60% from manual-induction were phosphorylated, in stark contrast to &gt;95% of the protein being phosphorylated when expressed in and purified from rich ZY-AIM media (Fig. <ref type="figure">3C</ref> lanes 1, 7 and 5 respectively). We assumed the nonphosphorylated protein could arise from the following: (1) near-cognate suppression by endogenous tRNAs at encoded amber codons, (2) protein dephosphorylation from phosphatase activity, and (3) mis-aminoacylation of the Sep-tRNA CUA (B4) by endogenous aminoacyl-tRNA synthetases. Near-cognate suppression of amber stop codons is generally insignificant when using expression hosts such as BL21(DE3) &#916;serB that contain Release Factor 1 (RF1), the E. coli protein responsible for terminating translation at amber codons <ref type="bibr">(28,</ref><ref type="bibr">36)</ref>. We next considered the possibility that our GCE tRNA, Sep-tRNA CUA (B4), was mis-aminoacylated by endogenous synthetases in these conditions where overall incorporation efficiency was low <ref type="bibr">(27)</ref>. We addressed this by replacing Sep-tRNA CUA (B4) with the Sep-tRNA CUA v2 (Fig. <ref type="figure">3B</ref>), a variant containing mutations in its acceptor stem that minimize mis-acylation by endogenous synthetases <ref type="bibr">(27)</ref>. 
Using this strategy, we re-expressed sfGFP-150TAG in the same auto- and manual-induction media at 37 &#176;C. While expression yields were comparable with Sep-tRNA CUA v2, measurable improvements in incorporation fidelity were observed, with 80% of the purified sfGFP being phosphorylated (Fig. <ref type="figure">3C</ref>, lanes 2 and 8). The remaining nonphosphorylated protein was identified by mass spectrometry to be exactly 80 Da lighter, consistent with hydrolysis of pSer by cellular phosphatases during expression (Fig. <ref type="figure">3D</ref>). We found that by conducting the expressions at 18 &#176;C instead of 37 &#176;C, cellular phosphatase activity was notably decreased, with &gt;90% of the purified sfGFP being phosphorylated using Methods 1 and 2 with the Sep-tRNA CUA v2 (Fig. <ref type="figure">3B</ref>, lanes 4 and 10). Having improved homogeneity of phospho-protein expression, we next set out to optimize expression yield.</p></div>
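<div xmlns="http://www.tei-c.org/ns/1.0"><p>The 80 Da difference noted above matches the mass of one HPO3 group, which is the net gain when a serine hydroxyl is phosphorylated. The following is a minimal sketch of that arithmetic, assuming rounded average atomic masses; the function names and the 1 Da tolerance are illustrative choices and are not part of the published analysis.</p>

```python
# Rounded average atomic masses in unified atomic mass units (assumed values).
AVERAGE_MASS = {"H": 1.008, "O": 15.999, "P": 30.974}


def phosphate_mass_shift():
    """Mass gained on phosphorylation of a hydroxyl group (net addition of HPO3)."""
    return AVERAGE_MASS["H"] + AVERAGE_MASS["P"] + 3 * AVERAGE_MASS["O"]


def count_lost_phosphates(observed_delta_da, tolerance_da=1.0):
    """Interpret a mass difference (observed minus expected phospho-protein mass).

    Returns the number of phosphate groups whose loss explains the difference,
    or None when the difference is not within tolerance of a whole multiple.
    """
    shift = phosphate_mass_shift()
    n = round(-observed_delta_da / shift)
    residual = abs(-observed_delta_da - n * shift)
    if n >= 0 and tolerance_da >= residual:
        return n
    return None
```

<p>Under these assumptions, a species that is 80 Da lighter than expected is read as having lost a single phosphate, whereas a difference of 40 Da would not be assigned to dephosphorylation.</p></div>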
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Optimizing phospho-protein yield for NMR applications</head><p>To this point, all sfGFP protein expressions were cultured in minimal media supplemented with 2 mM serine and 0.2% (w/v) Celtone, so we hypothesized that removing serine would improve protein yield without compromising protein phosphorylation by removing feedback inhibition of SerA (Fig. <ref type="figure">1A</ref>) <ref type="bibr">(32)</ref><ref type="bibr">(33)</ref><ref type="bibr">(34)</ref>. For auto-induction expression (Method 1), we observed neither a notable defect in cell growth (Fig. <ref type="figure">4A</ref>) nor an increase in protein production when exogenous serine was omitted (Fig. <ref type="figure">4B</ref>). On the other hand, with manual induction (Method 2), cell growth was similar (Fig. <ref type="figure">4A</ref>) but phosphorylated protein yield increased 2-fold (Fig. <ref type="figure">4B</ref>) compared to the same cultures containing serine and was 30% higher than with auto-induction expression (Method 1). In all cases, &gt;90% of the purified protein was phosphorylated (Fig. <ref type="figure">4C</ref>). Given the expense of isotopically labeled ( 15 N and/or 13 C) serine and that its omission does not have adverse side effects, serine was left out of subsequent expressions.</p><p>Table footnotes: (e) 50x 5052, per 500 mL: 12.5 g &#945;-D-glucose, 50 g &#945;-lactose, 125 mL glycerol (or 250 mL of 50% glycerol, which is easier to measure). Note: to get the lactose to dissolve, heat gently in a microwave. Once dissolved, lactose will stay in solution indefinitely. (f) Not essential for ZY-NIM, but can help cell growth, as previously described <ref type="bibr">(28)</ref>.</p></div>
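<div xmlns="http://www.tei-c.org/ns/1.0"><p>The 50x 5052 stock recipe given in the table footnote can be converted to working (1x) concentrations with simple dilution arithmetic. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, and percentages are taken as grams (or millilitres, for glycerol) per 100 mL.</p>

```python
# Amounts per 500 mL of 50x stock, as listed in the table footnote.
STOCK_VOLUME_ML = 500.0
STOCK_FOLD = 50.0
RECIPE = {
    "glucose_g": 12.5,
    "lactose_g": 50.0,
    "glycerol_ml": 125.0,
}


def working_percent(amount_per_stock, stock_volume_ml=STOCK_VOLUME_ML, fold=STOCK_FOLD):
    """Percent of one component after the concentrated stock is diluted to 1x."""
    stock_percent = 100.0 * amount_per_stock / stock_volume_ml
    return stock_percent / fold


def working_concentrations(recipe=RECIPE):
    """Map every recipe component to its 1x percentage."""
    return {name: working_percent(amount) for name, amount in recipe.items()}
```

<p>With the footnote values, this gives roughly 0.05% glucose, 0.2% lactose, and 0.5% glycerol in the final medium.</p></div>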
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Methods to access isotope-enriched phosphorylated proteins</head><p>For these expressions, cells were grown to mid-log phase from an overnight non-inducing culture, at which time protein expression was induced, and the final OD 600 reached 3 to 5. To maximize protein yields per liter culture, we tested methods in which protein expression is performed at higher cell densities (Methods 3 and 4 in Fig. <ref type="figure">2</ref>) <ref type="bibr">(37)</ref>. We first tested a strategy in which cells were grown to stationary phase in ZY-NIM media overnight, resuspended in a Celtone-supplemented minimal media, grown for a short period, and then induced manually at an OD 600 of 5 to 10 (Method 3 in Fig. <ref type="figure">2</ref>). While the final OD 600 at harvest was substantially higher than in low-density expressions (Methods 1 and 2), total culture fluorescence was only modestly improved (Method 3, Fig. <ref type="figure">4B</ref>), indicating protein production per cell was compromised, perhaps because the cells had not recovered from being in stationary phase when induction began. Alternatively, we inoculated fresh ZY-NIM media with the overnight culture and allowed it to grow to mid/late-log phase (OD 600 3-4), at which time the freshly grown cells were pelleted, resuspended in a Celtone-supplemented minimal media, and grown for a short period prior to IPTG induction (Method 4 in Fig. <ref type="figure">2</ref>, adapted from <ref type="bibr">(37)</ref>). The final OD 600 reached 8 to 12, as with Method 3; however, overall protein production was improved &gt;3-fold compared to the low-density cultures (Fig. <ref type="figure">4B</ref>). The fluorescence values of these cultures correspond to approximately 8 to 10 mg of sfGFP-pSer150 per liter culture. Accurate incorporation of pSer was confirmed by Phos-tag gels and whole protein mass spectrometry (Fig. <ref type="figure">4</ref>, C and D). 
Thus, we selected Method 4 for future isotopically labeled expressions.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head>Production of 15 N sfGFP-pSer150 and 1 H-15 N HSQC spectra</head><p>Using Method 4, we produced 15 N-labeled, &gt;90% phosphorylated sfGFP at site N150, as well as WT sfGFP, to confirm uniform isotopic enrichment and subsequent utility for NMR analysis (Fig. <ref type="figure">5A</ref>). Yields of purified sfGFP-150pSer were about half that of WT sfGFP (data not shown). Phos-tag gel electrophoresis confirmed &gt;90% pSer incorporation before and after heteronuclear single quantum coherence (HSQC) data collection at 42 &#176;C (Fig. <ref type="figure">5A</ref>) <ref type="bibr">(38,</ref><ref type="bibr">39)</ref>. Guided by previous backbone assignments of GFPuv and WT sfGFP <ref type="bibr">(38,</ref><ref type="bibr">39)</ref>, we mapped the peak corresponding to residue 150 in the spectrum (8.5 ppm/119.8 ppm), which overlaps with residue E111 (Fig. <ref type="figure">5B</ref>). Our sfGFP-150pSer transverse relaxation-optimized spectroscopy HSQC spectrum matches well with the previously published WT sfGFP spectrum <ref type="bibr">(38)</ref>, with the exception of several perturbed resonances corresponding to residues near site 150.</p><p>We did not, however, observe a clear down-field shift of the site 150 resonance in response to phosphorylation, which would be expected if a hydrogen bond were formed between the phosphate group and amide backbone <ref type="bibr">(40)</ref>. But being located within a &#946;-sheet, the backbone amide of residue 150 is not expected to form such a hydrogen bond while also maintaining the native fold (41) (Fig. <ref type="figure">5C</ref>). Despite lacking a clear assignment of the pSer150 cross peak, Phos-tag gel electrophoresis confirmed &gt;90% phosphorylation both before and after HSQC data collection at 42 &#176;C (Fig. 
<ref type="figure">5A</ref>) <ref type="bibr">(38,</ref><ref type="bibr">39)</ref>.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head>Phosphorylated SARS-CoV-2 Ser/Arg-rich linker region</head><p>As a biologically relevant example, we produced the Ser/Arg-rich IDR of the SARS-CoV-2 Nucleocapsid (N) protein isotopically labeled and with pSer at site S188. This region of N (residues 175-247) connects its RNA-binding domain and dimerization domain, and hyperphosphorylation of this SR-Linker region is thought to facilitate the release of viral gRNA from N <ref type="bibr">(42)</ref><ref type="bibr">(43)</ref><ref type="bibr">(44)</ref>. The mechanisms of this process are not well understood, though residues S188 and S206 within the SR-Linker of N, when phosphorylated, serve as "priming" sites for subsequent poly-phosphorylation by Glycogen Synthase Kinase-3&#946;. Indeed, the N double mutant S188A/S206A renders the virus unable to replicate <ref type="bibr">(13)</ref>. Thus, the ability to produce site-specifically phosphorylated SR-Linker variants of N for NMR dynamics analysis could help elucidate this key step of the SARS-CoV-2 life cycle.</p><p>We expressed the SR-Linker region of N as a fusion with cleavable tags at its N- and C-termini to improve solubility (bdSUMO and TEV-sfGFP-His6, respectively), first using Method 4 in unlabeled media. Purification and removal of the bdSUMO and sfGFP-His6 tags yielded pure SR-Linker with 80 to 90% phosphorylation at site S188 as confirmed by Phos-tag electrophoresis and whole-protein mass spectrometry (Fig. <ref type="figure">6, B</ref> and <ref type="figure">C</ref>). Having confirmed production of SR-Linker in sufficient quality, we repeated the expression in an isotopically enriched culture medium. 1 H-15 N HSQC spectra at 10 &#176;C of the purified proteins are characteristic of an IDR, where signals cluster in the 8.0 to 8.5 ppm region due to poor dispersion in 1 H spectra, and matched well with the previously published spectrum of a similar SR-Linker N construct (Fig. 
<ref type="figure">6D</ref>) <ref type="bibr">(45)</ref>. In these spectra, a clear 1 H-15 N peak is observed at 9.1 ppm/119.6 ppm in the pSer188 protein sample that is not present in the WT sample, consistent with where a phosphorylated serine amide peak would be expected (Fig. <ref type="figure">6D</ref>) <ref type="bibr">(46,</ref><ref type="bibr">47)</ref>. To further confirm the assignment of the resonance at 9.1 ppm to pSer188, we collected a 3D 15 N-nuclear Overhauser effect spectroscopy (NOESY)-HSQC (Fig. <ref type="figure">6E</ref>). We observed sequential backbone amide nuclear Overhauser effects for residues &#177;2 from pSer188 (Fig. <ref type="figure">6E</ref>). These residues could be assigned based on the side chain proton chemical shifts, which were consistent with the expected amino acid sequence (Fig. <ref type="figure">6E</ref>). Collectively, these data demonstrate the facile ability to generate sufficient quantities of a biologically relevant, phosphorylated IDR for NMR applications. Detailed structural and dynamics analyses of these proteins will be described in a subsequent article.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Discussion</head><p>Here, we have described an optimized protocol for the generation of site-specifically phosphorylated, isotopically enriched proteins suitable for NMR analysis. Adoption of the GCE pSer expression system imposed several challenges, including how to overcome growth deficiencies of the serine auxotroph BL21(DE3) &#916;serB in minimal media, as well as enhanced phosphatase activity not typically seen when expressing phospho-proteins in standard rich media. Through rounds of optimization, we found three important parameters that facilitated homogenous phospho-protein production in labeling media: (i) Celtone as a key additive needed for healthy cell growth to overcome serine auxotrophy, (ii) Sep-tRNA CUA v2 for reduced mis-acylation by endogenous synthetases to minimize mis-incorporation of natural amino acids, and (iii) protein expression at lower temperatures to minimize phosphatase activities. Subsequent adaptation of high-density methods allowed us to produce &#8805;10 mg purified sfGFP per liter culture with &gt;90% phosphorylation, costing approximately $300 per liter of uniformly 15 N-labeled proteins.</p><p>Although not demonstrated here, this method can be adapted for 13 C labeling as well by using 13 C-labeled Celtone and glucose.</p><p>The ability to produce isotopically labeled phospho-proteins with GCE has been reported previously, but only in three instances to date. Two of these reports used the BL21(DE3) &#916;serB strain, but their methods did not describe a mechanism to overcome serine auxotrophy, and they also added unlabeled pSer amino acid to the culture media so that the expressed protein presumably has a mixture of labeled and unlabeled pSer <ref type="bibr">(48,</ref><ref type="bibr">49)</ref>. In a third instance, recent work by Scheffner et al. 
<ref type="bibr">(50)</ref> circumvented serine auxotrophy by using BL21(DE3) with intact SerB. That expression therefore required the addition of high concentrations of unlabeled pSer to drive incorporation, and downstream work benefited from the ability to purify away unphosphorylated populations by anion exchange chromatography, which may not always be possible. In the methodologies reported here, all pSer incorporated into the protein is biosynthesized from isotopically enriched media components, and thus, all pSer residues are expected to be isotopically labeled and will be visible in NMR spectra, as demonstrated with 15 N-HSQC data of SR-Linker phosphorylated at site 188 (Fig. <ref type="figure">6D</ref>).</p><p>The methods described here were optimized to express phosphorylated sfGFP and SR-Linker proteins, but we anticipate different proteins will require additional adjustments in expression protocols for specific applications. We found Celtone used at relatively low concentrations (0.2% (w/v)) to be an economically viable supplement to support phospho-protein expression; however, other commercially available isotopically labeled media additives such as BioExpress (Cambridge Isotope Laboratories, Inc), SILEX Media (Silantes), and ISO-GRO (Sigma-Aldrich) may also be tractable. We believe that multi-site pSer incorporation should be possible, though feasibility will depend on the protein of interest and sites of incorporation. Adoption of other &#916;serB strains of E. coli, such as the RF1-deficient B95(DE3) &#916;A &#916;fabR &#916;serB (28), may prove helpful in this regard. However, our attempts to express sfGFP-150pSer in this strain using the methods reported here resulted in undesirable quantities of near-cognate suppression at the intended site of phosphorylation (data not shown), and so additional optimizations will be required for expressing isotopically labeled pSer proteins in this "truncation free" expression host. 
Nevertheless, the work here provides an important framework by which isotopically labeled, site-specific pSer-containing proteins can be expressed efficiently in E. coli and opens the door to more routine analyses of phosphorylated proteins with two-dimensional and three-dimensional NMR experiments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experimental procedures</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Strains and plasmids</head><p>The BL21(DE3) &#916;serB strain was a gift from Jesse Rinehart (Addgene #34929). BL21(DE3) and DH10b strains of E. coli were purchased from ThermoFisher Scientific. The pRBC-sfGFP WT and pRBC-sfGFP-150TAG plasmids were as previously described (Addgene #174075 and 174076, respectively) (28) (Fig. <ref type="figure">S2</ref>). The pKW2-EFSep plasmid was a generous gift from Jason Chin (Addgene #173897) (Fig. <ref type="figure">S2</ref>). Genes for bdSUMO, bdSENP1, Sep-tRNA v2, and SARS-CoV-2 SR-Linker were synthesized by Integrated DNA Technologies. The SARS-CoV-2 SR-Linker protein (residues 175-247) expression plasmids were made by fusing bdSUMO (lacking the first 19 residues) to its N-terminus (51) and a TEV-cleavable sfGFP-His 6 to its C-terminus. All cloning steps were performed by SLiCE <ref type="bibr">(52)</ref>. The PPY strain used to generate SLiCE cloning extract was a gift from Yongwei Zhang (Albert Einstein College of Medicine) <ref type="bibr">(52)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Molecular biology reagents</head><p>Oligonucleotide primers were synthesized by Integrated DNA Technologies. Molecular biology reagents including restriction enzymes, T4 ligase, and polymerases were purchased either from Thermo Fisher Scientific or New England Biolabs. DNA Miniprep, Midiprep, PCR cleanup, and gel extraction kits were purchased from Macherey-Nagel. L-Serine, Celtone base powder (#1030P-U and 1030-N for unlabeled and 15 N labeled, respectively), and 100X MEM Vitamin were purchased from Sigma-Aldrich, Cambridge Isotope Laboratories, Inc, and Thermo Fisher Scientific, respectively. Phos-tag Acrylamide for gel electrophoresis was purchased from NARD Institute, Ltd.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head>Cell growth assessment</head><p>BL21(DE3) and BL21(DE3) &#916;serB cells were streaked on LB/agar without antibiotics and grown overnight at 37 &#176;C. A single colony was used to inoculate a buffered, glucose-rich ZY-NIM culture (Table <ref type="table">1</ref>), which was grown overnight with shaking at 250 rpm. The overnight starter was diluted to a starting optical density (OD 600 ) of 0.15 into 50 ml of fresh minimal manual-induction media (MIM-1) containing no additives, 2 mM serine, 0.2% Celtone, or both. Celtone was prepared by resuspending base powder with sterile water to a final concentration of 0.2% (w/v). Cells were grown for 24 h with OD 600 measurements taken every hour for 9 h and once after 24 h. Cultures were grown in duplicate.</p></div>
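<div xmlns="http://www.tei-c.org/ns/1.0"><p>Diluting an overnight starter to a defined starting OD 600 follows the usual C1 x V1 = C2 x V2 relation. The sketch below is a hedged illustration: the function name is hypothetical, it treats the 50 ml as the final culture volume, and it assumes OD 600 scales linearly with cell density, which is only approximately true for dense cultures.</p>

```python
def inoculum_volume_ml(starter_od, target_od, final_volume_ml):
    """Volume of starter culture needed for the diluted culture to start at target_od.

    Solves C1 * V1 = C2 * V2 for V1, with OD600 standing in for concentration.
    """
    if not starter_od > target_od > 0:
        raise ValueError("starter OD must exceed a positive target OD")
    if not final_volume_ml > 0:
        raise ValueError("final volume must be positive")
    return target_od * final_volume_ml / starter_od
```

<p>For example, with a hypothetical overnight starter at OD 600 of 6, a 50 ml culture starting at OD 600 of 0.15 would need about 1.25 ml of starter.</p></div>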
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Protein expression in minimal media</head><p>Fresh transformations were performed for all expressions in this study. BL21(DE3) &#916;serB cells were cotransformed with pKW2-EFSep containing either Sep-tRNA CUA (B4) or Sep-tRNA CUA v2 and the appropriate pRBC plasmid. Approximately a dozen colonies were used to inoculate overnight ZY-NIM cultures (Table <ref type="table">1</ref>), which were grown at 37 &#176;C. All ZY-NIM cultures contained 100 &#956;g/ml ampicillin and 25 &#956;g/ml chloramphenicol, while all minimal cultures contained 50 &#956;g/ml ampicillin and 15 &#956;g/ml chloramphenicol. Where indicated, L-serine was added at 2 mM final concentration and Celtone at 0.2% (w/v). To produce 15 N-labeled proteins, NH 4 Cl was replaced with 15 NH 4 Cl and Celtone was replaced with 15 N-Celtone.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Method 1: Minimal auto-induction media</head><p>Overnight ZY-NIM cells (OD 600 5-8) were pelleted by centrifugation at 5000 rcf and then resuspended into minimal AIM (Table <ref type="table">2</ref>) (starting OD 600 0.1-2). Cultures were grown at either 37 &#176;C or 18 &#176;C with shaking at 250 rpm in baffled flasks and harvested 24 or 48 h later, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Method 2: Minimal manual induction media, low density</head><p>Overnight ZY-NIM cells (OD600 5-8) were pelleted by centrifugation at 5000 rcf and then resuspended into minimal manual induction media 1 (MIM-1, Table <ref type="table">2</ref>) (starting OD600 0.1-0.2). Cultures were grown at 37 &#176;C until OD600 reached 0.6 to 0.8 and then induced with 1 mM IPTG. Cultures were grown at either 37 &#176;C or 18 &#176;C and harvested 24 or 48 h after IPTG addition, respectively.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head>Method 3: Minimal MIM, high density</head><p>ZY-NIM overnight starters (OD600 5-8) were pelleted by centrifugation at 5000 rcf and resuspended into an equal volume of minimal MIM-2 (Table <ref type="table">2</ref>) (starting OD600 5-8). Cultures were grown at 37 &#176;C in baffled flasks until the OD600 increased by 1 to 2 units (1-2 h) and then induced with 1 mM IPTG. Cultures were grown at 18 &#176;C for 48 h after IPTG addition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Method 4: Minimal MIM, high density, with freshly grown cells</head><p>Cells from a ZY-NIM overnight starter culture (OD600 5-8) were used to inoculate a fresh ZY-NIM culture (10% inoculum, e.g., 5 ml into 50 ml fresh ZY-NIM). Cultures were grown at 37 &#176;C with shaking at 250 rpm in baffled flasks until OD600 reached 3 to 4, at which point cells were pelleted by centrifugation at 5000 rcf and resuspended into an equal volume of minimal MIM-2 (Table <ref type="table">2</ref>) (starting OD600 3-4). Cultures were grown at 37 &#176;C in baffled flasks until the OD600 increased by 1 to 2 units (1-2 h) and then induced with 1 mM IPTG. Cultures were grown at 18 &#176;C for 48 h after IPTG addition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Protein purification</head></div><div xmlns="http://www.tei-c.org/ns/1.0"><head>sfGFP purification</head><p>Cell pellets containing sfGFP proteins were resuspended in Lysis Buffer (50 mM Tris pH 7.5, 500 mM NaCl, 5 mM imidazole, and phosphatase inhibitors: 50 mM sodium fluoride, 5 mM sodium pyrophosphate, 1 mM sodium orthovanadate) and lysed by microfluidization. Soluble cell lysate was obtained by centrifugation at 28,000 rcf for 45 min, and TALON metal affinity resin was added to it. His6-tagged protein was allowed to bind to the TALON resin for 30 to 60 min with gentle rocking. Resin was collected and extensively washed with Lysis Buffer, and then protein was eluted with Lysis Buffer supplemented with 300 mM imidazole. After elution, proteins were further purified by gel filtration on a 10/300 Superdex S75 column (Cytiva Life Sciences) in NMR buffer (30 mM sodium phosphate (pH 6.8), 100 mM NaCl) and then concentrated to 600 &#956;M using a 10,000 Da cutoff filter prior to NMR analysis. SDS-PAGE and Phos-tag gels were poured immediately before use and run according to the manufacturer's recommendations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>SR-Linker purification</head><p>The SR-Linker of the SARS-CoV-2 Nucleocapsid (N) protein was genetically fused to a cleavable N-terminal bdSUMO tag and a TEV-cleavable C-terminal sfGFP protein for enhanced solubility (pRBC-bdSUMO-SR-Linker-sfGFP-His6). Cell pellets containing SR-Linker were resuspended in Lysis Buffer (50 mM Tris pH 7.5, 500 mM NaCl, 5 mM imidazole, and phosphatase inhibitors: 50 mM sodium fluoride, 5 mM sodium pyrophosphate, 1 mM sodium orthovanadate) and lysed by microfluidization. Soluble cell lysate was obtained by centrifugation at 28,000 rcf for 45 min, and TALON metal affinity resin and 50 nM untagged bdSENP1 protease were added to it. The SR-sfGFP-His6 protein was allowed to bind to the TALON resin while the bdSENP1 protease cleaved the bdSUMO tag. Resin was collected and extensively washed with Lysis Buffer, and then SR-sfGFP-His6 was eluted with Lysis Buffer supplemented with 300 mM imidazole. Purified protein was buffer exchanged into 50 mM Tris, 350 mM NaCl with phosphatase inhibitors using PD-10 desalting columns (Cytiva Life Sciences). The sfGFP-His6 tag was cleaved by TEV protease (1:20 TEV to SR) overnight at 4 &#176;C, and the mixture was flowed through fresh TALON resin. The flow-through fraction contained the SR-Linker protein, while the TEV protease and sfGFP-His6 tag remained bound to the resin. For NMR analysis, SR-Linker protein was dialyzed overnight at 4 &#176;C against 50 mM sodium phosphate, 150 mM NaCl, pH 6.5 buffer (without phosphatase inhibitors) and then concentrated to 100 &#956;M using a 3000 Da cutoff filter prior to analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Quantification of sfGFP expression in cultures</head><p>Yield of sfGFP expressed per liter culture was calculated by measuring in-cell fluorescence of sfGFP and subtracting the contribution of cell auto-fluorescence (measured from the same density of cells not expressing any sfGFP construct). Fluorescence values were converted to mass of sfGFP per liter culture based on a standard curve of purified sfGFP. All values reported are the average of at least two independent replicate cultures, and error bars represent SDs.</p></div>
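<div xmlns="http://www.tei-c.org/ns/1.0"><p>The quantification described above can be illustrated with a short calculation. The following is a sketch under stated assumptions, not the analysis actually used: it assumes a linear standard curve of fluorescence against purified sfGFP concentration (mg/L), and every function name and numerical value is hypothetical.</p>
```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = mean(xs)
    mean_y = mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sfgfp_mg_per_l(culture_fluor, autofluor, slope, intercept):
    """Subtract cell auto-fluorescence, then invert the standard curve."""
    corrected = culture_fluor - autofluor
    return (corrected - intercept) / slope


# Hypothetical standard curve: purified sfGFP (mg/L) vs fluorescence (a.u.)
standards_mg_per_l = [0.0, 5.0, 10.0, 20.0]
standards_fluor = [0.0, 500.0, 1000.0, 2000.0]
slope, intercept = fit_line(standards_mg_per_l, standards_fluor)

# Hypothetical replicate cultures and a non-expressing control at equal density
replicate_fluor = [1650.0, 1750.0]
control_fluor = 150.0
yields = [sfgfp_mg_per_l(f, control_fluor, slope, intercept) for f in replicate_fluor]

print(f"mean yield: {mean(yields):.1f} mg/L, SD: {stdev(yields):.2f} mg/L")
```
<p>The non-expressing control must be measured at the same cell density as the expressing cultures. Otherwise the subtraction removes the wrong amount of background.</p></div>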
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Whole protein mass spectrometry</head><p>Purified sfGFP proteins were exchanged into LC-MS grade water with PD-10 desalting columns. The SR-Linker proteins were buffer exchanged into 200 mM ammonium acetate by repeated concentration and dilution using 3000 Da cutoff centrifugal filter units. Mass spectra were obtained with a Waters Synapt G2 Mass Spectrometer at the Mass Spectrometry Facility at Oregon State University. The deconvoluted masses were obtained using Waters MassLynx MaxEnt1 software.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>NMR analysis</head><p>NMR experiments were carried out on an 800-MHz Bruker Avance III HD NMR spectrometer equipped with a 5-mm triple resonance (HCN) cryogenic probe. Data collection for 15N-sfGFP proteins was carried out at 42 &#176;C in 30 mM sodium phosphate, 100 mM NaCl (pH 6.8) buffer at a final concentration of 0.3 mM (38), while for 15N-SR-Linker proteins, data were collected at 10 &#176;C in 50 mM sodium phosphate, 150 mM NaCl, pH 6.5 buffer (53) at a final concentration of approximately 80 to 100 &#956;M. All samples contained 10% D2O, 1 mM sodium azide, protease inhibitor mixture (Roche Applied Science), and 0.2 mM 2,2-dimethylsilapentane-5-sulfonic acid for 1H chemical shift referencing. All two-dimensional spectra were processed using NMRPipe (54) and visualized with NMRViewJ <ref type="bibr">(55)</ref>. To confirm the assignments of the pSer in the SR-Linker sample, we also collected a 3D 15N-NOESY-HSQC. The NOESY data were collected with a mixing time of 120 ms, 256 complex points in each indirect dimension, and 35% random nonuniform sampling. The three-dimensional data were processed using NMRPipe <ref type="bibr">(54)</ref> and SMILE <ref type="bibr">(56)</ref> and visualized using NMRViewJ <ref type="bibr">(55)</ref>.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>J. Biol. Chem. (2022) 298(12) 102613 1 &#169; 2022 THE AUTHORS. Published by Elsevier Inc on behalf of American Society for Biochemistry and Molecular Biology. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).</p></note>
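<div xmlns="http://www.tei-c.org/ns/1.0"><p>To give a sense of what 35% random nonuniform sampling of a 256 by 256 indirect-dimension grid means in practice, the following sketch draws such a schedule. It is an illustration only: it assumes points are chosen uniformly at random without replacement, whereas the schedule generator actually used for data collection is not stated and may weight or constrain points differently. The function name and seed are hypothetical.</p>
```python
import random


def random_nus_schedule(n1=256, n2=256, fraction=0.35, seed=0):
    """Return a sorted list of (i, j) indirect-dimension points to acquire.

    Assumption: uniform random selection without replacement over the
    full n1 x n2 grid. Real schedulers may differ.
    """
    grid = [(i, j) for i in range(n1) for j in range(n2)]
    n_keep = round(fraction * len(grid))
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    return sorted(rng.sample(grid, n_keep))


schedule = random_nus_schedule()
print(f"{len(schedule)} of {256 * 256} complex points sampled")
```
<p>Under these assumptions, 22,938 of the 65,536 grid points are acquired, and the remaining points are left for the reconstruction step to estimate.</p></div>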
		</body>
		</text>
</TEI>
