<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>A Conserved Long Intergenic Non-coding RNA Containing snoRNA Sequences, lncCOBRA1, Affects Arabidopsis Germination and Development</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>05/25/2022</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10336843</idno>
					<idno type="doi">10.3389/fpls.2022.906603</idno>
					<title level='j'>Frontiers in Plant Science</title>
<idno>1664-462X</idno>
<biblScope unit="volume">13</biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Marianne C. Kramer</author><author>Hee Jong Kim</author><author>Kyle R. Palos</author><author>Benjamin A. Garcia</author><author>Eric Lyons</author><author>Mark A. Beilstein</author><author>Andrew D. Nelson</author><author>Brian D. Gregory</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Long non-coding RNAs (lncRNAs) are an increasingly studied group of non-protein-coding transcripts with a wide variety of molecular functions, gaining attention for their roles in numerous biological processes. Nearly 6,000 lncRNAs have been identified in Arabidopsis thaliana, but many have yet to be studied. Here, we examine a class of previously uncharacterized lncRNAs termed CONSERVED IN BRASSICA RAPA (lncCOBRA) transcripts that were previously identified for their high level of sequence conservation in the related crop species Brassica rapa, their nuclear localization, and their protein-bound nature. In particular, we focus on lncCOBRA1 and demonstrate that its abundance is highly tissue- and development-specific, with particularly high levels early in germination. lncCOBRA1 contains two snoRNA domains within it, making it the first sno-lincRNA example in a non-mammalian system. However, we find that it is processed differently than its mammalian counterparts. We further show that plants lacking lncCOBRA1 display patterns of delayed germination and are overall smaller than wild-type plants. Lastly, we identify the proteins that interact with lncCOBRA1 and propose a novel mechanism of lincRNA action in which it may act as a scaffold with the RACK1A protein to regulate germination and development, possibly through a role in ribosome biogenesis.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>INTRODUCTION</head><p>Long non-coding RNAs (lncRNAs) are defined as transcripts longer than 200 nucleotides (nt) that either lack an open reading frame or have one encoding fewer than 100 amino acids <ref type="bibr">(Liu et al., 2012)</ref>. Transcriptome-wide studies have demonstrated that lncRNAs are often expressed in a context-specific manner, a characteristic believed to facilitate some of their known functions in modulating gene expression, mRNA splicing, and translation <ref type="bibr">(Quinn and Chang, 2016)</ref>. The function of lncRNAs is highly dependent on their subcellular location. Nuclear lncRNAs often serve key roles in regulating gene expression, either in cis (where the lncRNA interacts with neighboring genes to regulate their expression) or in trans (where the lncRNA interacts with distant genes to regulate their expression). lncRNAs can also bind and sequester proteins, such as splicing factors and proteins involved in chromatin stability, from their target chromosomal regions, thereby affecting gene expression <ref type="bibr">(Yin et al., 2012;</ref><ref type="bibr">Lee et al., 2016)</ref>.</p><p>In plants, lncRNAs are implicated in numerous biological mechanisms with demonstrated functions in flowering, organogenesis, photomorphogenesis, reproduction, and abiotic/biotic stress responses (reviewed in <ref type="bibr">Wang and Chekanova, 2017)</ref>. Most research has focused on the intergenic class of lncRNAs (lincRNAs) <ref type="bibr">(Mattick and Rinn, 2015)</ref>, as historically it has been easier to discern their transcriptional origins relative to other lncRNAs that overlap protein-coding genes [e.g., natural antisense transcripts (NATs)]. 
In plants, detailed annotation and functional studies have led to the identification of several lincRNAs with characterized functions in regulation of auxin signaling outputs <ref type="bibr">(Ariel et al., 2014)</ref>, response to abiotic and biotic stressors <ref type="bibr">(Wang et al., 2014;</ref><ref type="bibr">Qin et al., 2017;</ref><ref type="bibr">Seo et al., 2017)</ref>, flowering time <ref type="bibr">(Swiezewski et al., 2009)</ref>, and response to phosphate starvation <ref type="bibr">(Franco-Zorrilla et al., 2007)</ref>.</p><p>While most Pol II-transcribed lincRNAs are 5' capped and 3' polyadenylated, a previously uncharacterized group of lncRNAs that lacks one or both of these features has recently been described <ref type="bibr">(Xing and Chen, 2018)</ref>. These non-canonical Pol II-dependent lncRNAs have snoRNA sequences at their 5' and 3' ends and are referred to as sno-lncRNAs. snoRNAs are highly structured, nuclear-localized, protein-bound non-coding RNAs of 70-200 nt that are usually concentrated in the Cajal bodies or nucleolus <ref type="bibr">(Reichow et al., 2007)</ref>. snoRNAs co-transcriptionally form snoRNA-ribonucleoprotein complexes (snoRNPs) <ref type="bibr">(Kiss, 2001)</ref> and function through complementarity with ribosomal RNA (rRNA) sequences to guide rRNA modification, ultimately participating in ribosome subunit maturation. The formation of snoRNPs at the ends of sno-lncRNAs protects the intervening sequence from exonuclease trimming <ref type="bibr">(Yin et al., 2012)</ref>.</p><p>sno-lncRNAs have been identified in humans, rhesus monkeys, and mice <ref type="bibr">(Yin et al., 2012;</ref><ref type="bibr">Zhang et al., 2014;</ref><ref type="bibr">Xing et al., 2017)</ref> but have yet to be described in plants. A recent functional analysis of sno-lncRNAs in humans identified SLERT (snoRNA-ended lncRNA enhances pre-ribosomal RNA transcription; <ref type="bibr">Xing et al., 2017)</ref>. 
SLERT localizes to the nucleolus in a manner dependent on the two snoRNPs at its ends and functions to promote active transcription of rRNAs <ref type="bibr">(Xing et al., 2017)</ref>. Thus, sno-lncRNAs represent an interesting class of lncRNAs with evident functions in humans.</p><p>Due to their lack of protein-coding capacity, lincRNAs typically display poor sequence conservation among even closely related species <ref type="bibr">(Necsulea et al., 2014;</ref><ref type="bibr">Nelson et al., 2016)</ref>. lincRNAs with functions defined by structural or sequence-specific interactions with other molecules (e.g., proteins) will likely display higher levels of conservation than lincRNAs that function based on proximity to other genes (e.g., transcription enhancers/repressors). We previously identified lincRNAs in the nuclei from 10-day-old seedlings and found that lincRNAs with RNA-binding protein (RBP) binding sites were significantly more likely to be conserved at the sequence level in Brassica rapa than those that lacked protein binding sites <ref type="bibr">(Gosai et al., 2015)</ref>, suggesting these protein-bound, conserved lincRNAs may be of functional importance in plants.</p><p>Here, we assess the function of those nuclear, protein-bound, and conserved lincRNAs that we have termed CONSERVED IN BRASSICA RAPA (lncCOBRA). We find that the COBRA lincRNAs display germination- and development-dependent patterns of abundance and, in particular, we focus on lncCOBRA1, which contains two snoRNA sequences within it, providing the first evidence of a sno-lincRNA in Arabidopsis. Unlike sno-lncRNAs identified in humans, lncCOBRA1 is transcribed from an intergenic region, and is transcribed as a longer transcript before processing at its 3' end to a final length of &#8764;500-600 nt. We further show that lncCOBRA1 influences plant germination and growth, as plants lacking lncCOBRA1 germinate later and are smaller than wild-type plants. 
Lastly, we identify lncCOBRA1-interacting proteins, including the scaffold protein RACK1A and several of its known interactors, and hypothesize that lncCOBRA1 functions with RACK1A to affect ribosome biogenesis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RESULTS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Identification of Conserved, Nuclear, Protein-Bound Long Intergenic Non-coding RNAs</head><p>We previously identified 236 nuclear lincRNAs from 10-day-old seedlings, of which 38 contained up to four RNA-binding protein (RBP) interaction sites <ref type="bibr">(Gosai et al., 2015)</ref>. These protein-bound lincRNAs were significantly more conserved within the related crop species Brassica rapa than those lacking RBP binding sites (Supplementary Figure <ref type="figure">1A</ref> and Table <ref type="table">1</ref>; <ref type="bibr">Gosai et al., 2015)</ref>. Since lincRNAs do not encode proteins, small polymorphisms within the sequence generally have little functional consequence, and thus lincRNAs are generally not well conserved at the sequence level <ref type="bibr">(Ponjavic et al., 2007;</ref><ref type="bibr">Necsulea et al., 2014;</ref><ref type="bibr">Hezroni et al., 2015)</ref>. Thus, the combination of conservation in Brassica rapa and nuclear protein binding suggested that these lincRNAs may have important functions in plant systems and were named CONSERVED IN BRASSICA RAPA 1-14 (lncCOBRA1-14) (Supplementary Figure <ref type="figure">1A</ref> and Table <ref type="table">1</ref>).</p><p>We selected a set of these lncCOBRAs and initiated our search for function by examining their abundance profiles during seed germination, as lincRNAs in several eukaryotic species are essential during development (e.g., HOTAIR, COOLAIR) <ref type="bibr">(Rinn et al., 2007;</ref><ref type="bibr">Swiezewski et al., 2009;</ref><ref type="bibr">Liu et al., 2012;</ref><ref type="bibr">Sarropoulos et al., 2019)</ref>. 
Using a previously published transcriptomic dataset <ref type="bibr">(Narsai et al., 2017)</ref>, we found that the majority (N = 9; 64%) of lncCOBRA transcripts displayed germination-dependent patterns of abundance, with peaks in abundance at various points during seed germination (Supplementary Figure <ref type="figure">1B</ref>). Going forward, we focused on lncCOBRA1, lncCOBRA3, and lncCOBRA5 due to their highly specific abundance profiles during seed germination and the availability of insertional mutant lines for these loci. lncCOBRA1 and lncCOBRA3 were most abundant after 48 h of stratification at 4 &#176;C in the dark followed by 1 h in light, while lncCOBRA5 abundance was highest slightly later, with a peak in abundance 6 h after transfer into light conditions (Figure <ref type="figure">1A</ref> and Supplementary Figure <ref type="figure">1B</ref>). Abundance of the three lncCOBRA transcripts decreased rapidly as the seeds progressed through germination and transitioned into seedlings (Figure <ref type="figure">1A</ref> and Supplementary Figure <ref type="figure">1B</ref>). Supporting this, the Arabidopsis expression atlas in the eFP Browser <ref type="bibr">(Klepikova et al., 2016)</ref> revealed that all three lncCOBRA transcripts were expressed early during seed germination, with the highest expression at 1 day after imbibition (Supplementary Figure <ref type="figure">1C</ref>). The abundance of lncCOBRA1, lncCOBRA3, and lncCOBRA5 was also dynamic throughout seedling development as measured by quantitative reverse transcription-PCR (qPCR), as they had the highest abundance in 2-day-old seedlings and rapidly decreased in abundance as the seedlings aged (Figure <ref type="figure">1B</ref> and Supplementary Figure <ref type="figure">1D</ref>). Additionally, lncCOBRA transcripts displayed tissue-specific patterns of accumulation. For instance, we found that lncCOBRA5 abundance was highest in leaf tissue and increased as leaves progressed from embryonic cotyledons to juvenile and adult leaves, while lncCOBRA3 demonstrated similar levels of abundance in all tissues profiled (Figure <ref type="figure">1C</ref>). 
In contrast, lncCOBRA1 had the highest abundance in 5-day-old seedlings, specifically in the cotyledons, and decreased as the leaves increased in age, with a significant (p-value &lt; 0.001; Wilcoxon t-test) decrease in abundance between 5-day-old cotyledons and true leaves (both juvenile and adult leaves) (Figure <ref type="figure">1C</ref> and Supplementary Figure <ref type="figure">1E</ref>). Thus, all three lncCOBRA transcripts examined were highly abundant early in germination and decreased as development progressed. In particular, lncCOBRA1 was highly abundant in embryonic cotyledons and decreased in abundance as true leaves emerged, suggesting lncCOBRA1 may function during germination and/or early in plant development.</p><p>Since these lincRNAs were originally identified as nuclear lincRNAs, and lincRNA function is influenced by subcellular localization, we sought to determine if they were retained solely in the nucleus. To do so, we isolated pure nuclear and cytoplasmic fractions using the isolation of nuclei tagged in specific cell types (INTACT) technique <ref type="bibr">(Deal and Henikoff, 2010, 2011)</ref> and performed qPCR for lncCOBRA1, lncCOBRA3, and lncCOBRA5 as well as nuclear (U6) and cytoplasmic (5.8S rRNA and 18S rRNA) positive controls. All three lncCOBRA transcripts were significantly (p-value &lt; 0.001; Wilcoxon t-test) enriched in the nuclear fraction, like U6, but not in the cytoplasmic fraction where the two rRNAs were enriched, confirming these transcripts were indeed primarily nuclear localized (Figure <ref type="figure">1D</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>lncCOBRA1 Contains Two Highly Conserved snoRNA Domains and Is Processed at Its 3' End After Transcription</head><p>As both lncCOBRA1 and lncCOBRA3 contain small nucleolar RNA (snoRNA) sequences annotated within their transcripts (Figure <ref type="figure">1E</ref> and Supplementary Figure <ref type="figure">1F</ref>), and given the evident importance of sno-lncRNAs in humans <ref type="bibr">(Xing and Chen, 2018)</ref>, we were particularly interested in these two transcripts. Since lncCOBRA3 lacked tissue-specific patterns of abundance and lncCOBRA1 had distinct patterns of abundance during seed germination and development, we decided to focus on lncCOBRA1. lncCOBRA1 was annotated to be a 318 nt lincRNA in the Araport11 genome annotation and contained two snoRNA sequence domains within it. The two annotated snoRNA domains overlapped with two previously identified RBP binding sites (Figure <ref type="figure">1E</ref>; <ref type="bibr">Gosai et al., 2015)</ref>. These RNA binding/snoRNA domains displayed the highest level of sequence similarity in a sequence alignment of lncCOBRA1 homologs from five Brassicaceae, with AT1G05917 (sno-COBRA1A) and AT1G05907 (sno-COBRA1B) having &#8764;79 and 56% sequence identity among the profiled species, respectively (Figure <ref type="figure">1E</ref>). lncCOBRA1 was highly conserved in all species profiled, with 30-46% sequence identity in the 500 nt up- and downstream of the 5'-most snoRNA (AT1G05917), which also included sno-COBRA1B (Figure <ref type="figure">1E</ref> and Supplementary Figure <ref type="figure">2A</ref>; Geneious | Bioinformatics Solutions for the Analysis of Molecular Sequence Data, 2019). 
To ensure we were examining the lncCOBRA1 lincRNA rather than a functional set of snoRNAs, two primer sets were used for all qPCR analyses, one set within sno-COBRA1A and the other (set 2) amplifying the region between the two snoRNAs (Supplementary Figures <ref type="figure">1D,</ref><ref type="figure">E,</ref><ref type="figure">G</ref>; blue and red primers). In addition to their sequence conservation within Brassicaceae, sno-COBRA1A and sno-COBRA1B have sequence homology to human SNORD59A and SNORD59B, with sequence identity of 76 and 90%, respectively <ref type="bibr">(Liang-Hu et al., 2001</ref>; Supplementary Figure <ref type="figure">2B</ref>). In fact, their tandem orientation is also conserved in humans, with SNORD59A upstream of SNORD59B in an intron of the protein-coding transcript encoding ATP synthase subunit d (ATP5PD) <ref type="bibr">(Kiss-L&#225;szl&#243; et al., 1996)</ref>. Overall, these findings indicate that these snoRNA sequences and their orientation are highly conserved, suggesting they are of significant evolutionary importance.</p><p>Figure legend (fragment): Protein-binding sites were identified in the nuclei from 10-day-old seedlings in <ref type="bibr">Gosai et al. (2015)</ref>. Identity colors: green = 100%, green-brown = 30-100%, red &lt; 30% identity.</p><p>In humans, sno-lncRNAs are derived from introns excised from protein-coding mRNAs that contain two snoRNA sequences <ref type="bibr">(Xing and Chen, 2018)</ref>. Rather than being degraded as usual, these introns are debranched and trimmed at the 5' and 3' ends by exonucleases until the enzymes reach the snoRNA domains. The highly structured and protein-bound nature of the snoRNA sequences protects against further degradation, resulting in lncRNAs that are flanked by snoRNA sequences at each end but lack 5' caps and poly(A) tails <ref type="bibr">(Xing and Chen, 2018)</ref>. To determine if a similar mechanism was used during lncCOBRA1 biogenesis, we first performed 5' rapid amplification of cDNA ends (5' RACE) to determine the 5' end of the transcript. In 5' RACE, any 5' caps present are removed, and an adapter is directly ligated to the 5' end of the RNA. 
Following reverse transcription with a gene-specific primer and two rounds of PCR, the precise 5' end of the transcript can be determined (Figure <ref type="figure">2A</ref>). If the 5' end of lncCOBRA1 was as annotated in Araport11, we would expect PCR products of 250 and 319 bp produced with a primer within the 5' adapter and two reverse primers, A and B, respectively (Figures <ref type="figure">2A,</ref><ref type="figure">B</ref>). Indeed, the 5' RACE PCR reactions produced products as expected, indicating that the annotated 5' end of lncCOBRA1 is where the transcript begins (Figures <ref type="figure">2A,</ref><ref type="figure">B</ref>; 5' RACE results indicated by red triangle), and thus lncCOBRA1 is apparently not trimmed at the 5' end after transcription.</p><p>We next asked if there was 3' end processing and sought to determine the full length of lncCOBRA1. To begin, we performed RT-PCR with a forward primer at the 5'-most end of the transcribed RNA as confirmed by 5' RACE and five tiled reverse primers (Supplementary Figure <ref type="figure">3A</ref>, green arrows). This revealed that lncCOBRA1 was substantially longer than originally annotated, with amplification of lncCOBRA1 with all reverse primers, indicating that lncCOBRA1 is transcribed as a much longer transcript, possibly over 1000 nt long (Supplementary Figure <ref type="figure">3B</ref>). Given the tissue specificity of lncCOBRA1 abundance (Figure <ref type="figure">1C</ref>), we performed the RT-PCR in 2-, 3-, 4-, and 5-day-old seedlings as well as seeds 1 and 2 days after imbibition to determine whether different isoforms were present at different developmental stages. This revealed amplification with all reverse primers at all developmental time points, indicating that lncCOBRA1 was over 1000 nt at these stages as well (Supplementary Figure <ref type="figure">3B</ref>). 
Overall, this suggests that lncCOBRA1 is a much longer lincRNA than initially hypothesized.</p><p>To determine the precise 3' end of lncCOBRA1, we performed 3' RACE. As in 5' RACE, an adapter is ligated to the 3' end, followed by reverse transcription with a gene-specific primer and two rounds of nested PCR (Figure <ref type="figure">2A</ref>). The final PCR reaction produced a diffuse band around 500-650 bp in length, which would suggest a 742-892 nt long transcript based on the site of the 3' RACE primer (Figures 2A, blue arrow, C). Since the resulting 3' RACE PCR band was diffuse, we extracted the PCR product, cloned it into a sequencing vector, and performed Sanger sequencing to identify the precise 3' end of lncCOBRA1. Sequencing 14 independent colonies revealed several 3' ends of lncCOBRA1, with the majority centering &#8764;250 and &#8764;350 nt downstream of the 3' RACE primer (Figure <ref type="figure">2A</ref>; blue triangles). The various 3' ends detected by 3' RACE, the diffuse 3' RACE PCR band (Figure <ref type="figure">2C</ref>), and the RT-PCR results (Supplementary Figure <ref type="figure">3B</ref>) indicate that lncCOBRA1 is transcribed as a longer transcript, possibly over 1000 nt in length (Supplementary Figure <ref type="figure">3B</ref>), and is trimmed from its 3' end to reach a final transcript &#8764;500-600 nt long, possibly with several stable 3' ends. Importantly, no polyA tail was identified in any of the 14 colonies sequenced. This, along with our inability to detect lncCOBRA1 in any published polyA-selected RNA-seq datasets (data not shown), suggests that lncCOBRA1 is not polyadenylated in its final processed form.</p><p>In plants, polycistronic snoRNAs are encoded in intergenic regions, transcribed by RNA Pol II, and generally contain two conserved promoter elements, a Telo-box and a Site II element (together referred to as TeloSII) <ref type="bibr">(Gaspin et al., 2010)</ref>. 
Notably, in Arabidopsis nearly all ribosomal protein genes and other genes involved in ribosome biogenesis and translation contain TeloSII elements in their promoters <ref type="bibr">(Gaspin et al., 2010)</ref>. This combined TeloSII element is found upstream of the TATA box and acts to coordinate expression of snoRNAs and protein-coding genes implicated in ribosome biogenesis <ref type="bibr">(Qu et al., 2015)</ref>. Interestingly, the lncCOBRA1 promoter contained a Telo-box and two Site II elements upstream of a TATA box, suggesting it is regulated in a similar manner to canonical snoRNAs and may be coordinated with genes related to ribosome biogenesis (Supplementary Figure <ref type="figure">3C</ref>). In addition, the promoter contained a conserved non-coding sequence (CNS) <ref type="bibr">(Velde et al., 2014)</ref>; CNSs have been shown to be highly associated with genes encoding transcription factors and developmental genes and are enriched for transcription factor binding sites <ref type="bibr">(Burgess and Freeling, 2014)</ref>. The presence of a CNS further emphasizes the conservation of the lncCOBRA1 gene locus (Supplementary Figure <ref type="figure">3C</ref>). Overall, lncCOBRA1 is a highly conserved lincRNA that is trimmed at its 3' end post-transcriptionally to generate a &#8764;500-600 nt lincRNA.</p></div>
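<div xmlns="http://www.tei-c.org/ns/1.0"><p>The length estimates in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the 242 nt primer offset is an assumption inferred by subtracting the 500 bp lower bound of the 3' RACE band from the corresponding 742 nt transcript estimate; it is not a value reported directly in the text. Under that assumption, the clone-derived 3' ends at roughly 250 and 350 nt downstream of the primer correspond to transcripts of about 492 and 592 nt, in line with the stated final length of roughly 500-600 nt.</p>
<ab><![CDATA[
```python
# Hypothetical helper; the primer offset is inferred from the text, not reported in it.
def transcript_length(nt_from_primer, primer_offset_nt):
    """Transcript length = distance from the 5' end to the primer + distance beyond it."""
    return primer_offset_nt + nt_from_primer


# Offset implied by the reported mapping of a 500-650 bp band to a 742-892 nt transcript.
PRIMER_OFFSET_NT = 742 - 500  # 242 nt (assumption derived from the quoted numbers)

# Diffuse 3' RACE band (bp) and approximate clone-derived 3' ends (nt past the primer).
from_band = tuple(transcript_length(bp, PRIMER_OFFSET_NT) for bp in (500, 650))
from_clones = tuple(transcript_length(nt, PRIMER_OFFSET_NT) for nt in (250, 350))

print(from_band)    # (742, 892)
print(from_clones)  # (492, 592)
```
]]></ab></div>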
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Loss of lncCOBRA1 Results in Delayed Germination and Smaller Plants</head><p>To examine the function of lncCOBRA1, we obtained a T-DNA insertion line (lnccobra1-1; SALK_086689) from the Arabidopsis Biological Resource Center with an insertion upstream of sno-COBRA1A and generated a complete lncCOBRA1 null (lnccobra1-2) using CRISPR gene editing (Figure <ref type="figure">3A</ref>). PCR and Sanger sequencing confirmed that the CRISPR guide RNAs caused a large deletion of 1325 bp (Supplementary Figures <ref type="figure">4A,</ref><ref type="figure">B</ref>). This larger than expected deletion was likely a product of double-strand break repair <ref type="bibr">(Korablev et al., 2020)</ref> and importantly did not disrupt the surrounding genes.</p><p>lnccobra1-1 had significantly (p-value &lt; 0.001; Wilcoxon t-test) depleted levels of lncCOBRA1 as measured by qPCR, and lnccobra1-2 levels were unmeasurable as it is a null mutant with the entire gene deleted (Figure <ref type="figure">3B</ref> and Supplementary Figure <ref type="figure">4C</ref>), while levels and processing of rRNAs were minimally affected (Supplementary Figures <ref type="figure">4D,</ref><ref type="figure">E</ref>). Furthermore, the T-DNA insertion and CRISPR deletion were specific for decreasing lncCOBRA1, as levels of the downstream protein-coding gene THO2 were mostly unaffected in either mutant line (Supplementary Figure <ref type="figure">4D</ref>). We did identify a slight but significant increase in 5.8S rRNA, 18S rRNA, and 25S rRNA levels, but no visible changes in rRNA processing in the mutants compared to Col-0 (Supplementary Figure <ref type="figure">4E</ref>). 
Thus, lncCOBRA1 likely does not influence rRNA processing even though it contains two well-conserved snoRNA domains.</p><p>We also complemented the lnccobra1-1 background by introducing the entire genomic region between the two neighboring genes into this genetic background (lncCOBRA1pro:lncCOBRA1/lnccobra1-1; hereafter lncCOBRA1/lnccobra1-1). lncCOBRA1 complementation resulted in a significant increase in lncCOBRA1 levels (Figure <ref type="figure">3B</ref> and Supplementary Figure <ref type="figure">4C</ref>). This overexpression of lncCOBRA1 eliminated the slight but significant increase in 5.8S rRNA, 18S rRNA, and 25S rRNA levels observed in the mutant alleles (p-value &gt; 0.05; Wilcoxon t-test), suggesting that the slight increase in abundance of these mature rRNAs may in fact be due to the loss of lncCOBRA1 (Supplementary Figure <ref type="figure">4D</ref>). In total, our findings indicate that both lnccobra1 mutant lines have specifically and significantly decreased levels of this lincRNA.</p><p>Given the high abundance of lncCOBRA1 during seed germination (Figure <ref type="figure">1</ref>), we counted the seedlings with fully emerged cotyledons 48 h after sowing (2 days in the growth chamber) for Col-0, lnccobra1-1, lnccobra1-2, and lncCOBRA1/lnccobra1-1 as a proxy for germination defects. We observed that significantly (p-value &lt; 0.001; Wilcoxon t-test) fewer lnccobra1-1 and lnccobra1-2 seeds germinated than in the Col-0 background, while significantly (p-value &lt; 0.01; Wilcoxon t-test) more lncCOBRA1/lnccobra1-1 seeds germinated at 48 h (Figure <ref type="figure">3C</ref>), suggesting that lncCOBRA1 levels affect seed germination.</p><p>The effects of lncCOBRA1 on germination persisted throughout vegetative growth, as 3-week-old lnccobra1-1 plants were slightly but significantly (&#8764;0.5 leaves; p-value &lt; 0.01; Wilcoxon t-test) delayed in leaf production compared to same-aged Col-0 plants. 
This same trend was also observed in lnccobra1-2 plants, but not to a level of statistical significance (p-value &gt; 0.05; Wilcoxon t-test) (Figure <ref type="figure">3D</ref>). Increased levels of lncCOBRA1 in lncCOBRA1/lnccobra1-1 plants led to more leaves than Col-0 (&#8764;0.5 leaves, p-value &lt; 0.05; Wilcoxon t-test) (Figure <ref type="figure">3D</ref>), suggesting lncCOBRA1 is responsible for this phenotype. This change in number of leaves at 3 weeks after planting was not due to a change in the overall growth rate of the plants, as there was no change in the rate of leaf initiation in lnccobra1-1, lnccobra1-2, or lncCOBRA1/lnccobra1-1 compared to Col-0 (Figure <ref type="figure">3E</ref>). lnccobra1-1 and lnccobra1-2 plants were also substantially smaller than Col-0 plants, while overexpression of lncCOBRA1 (lncCOBRA1/lnccobra1-1) rescued this phenotype and resulted in plants that were slightly larger at both 3 and 5 weeks of age (Figures <ref type="figure">3F-H</ref> and Supplementary Figure <ref type="figure">5A</ref>). Aside from overall size of the plants, the individual rosette leaves were also smaller in the mutant plant lines (Supplementary Figure <ref type="figure">5B</ref>). Since altered lncCOBRA1 levels did not affect the rate of growth (Figure <ref type="figure">3E</ref>), it is possible that the smaller size of lnccobra1-1 and lnccobra1-2 plants is due to a change in either the number or size of leaf cells, though this needs to be probed further. Overall, levels of lncCOBRA1 affect seed germination, and these germination effects persist through vegetative growth, resulting in plants that are smaller or larger than Col-0 when lncCOBRA1 levels are decreased or increased, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>lncCOBRA1 Interacts With a Wide Variety of Proteins</head><p>To begin to understand the molecular function of lncCOBRA1, we set out to identify which proteins bind lncCOBRA1, as lncCOBRA1 was initially identified for having sites of RBP binding (Supplementary Figure <ref type="figure">1A</ref>) <ref type="bibr">(Gosai et al., 2015)</ref>. To do so, we performed chromatin isolation by RNA purification followed by mass spectrometry (ChIRP-MS) <ref type="bibr">(Chu et al., 2015)</ref>. In this technique, we incubated lysates from 5-day-old Col-0 and lnccobra1-2 seedlings with biotinylated probes antisense to lncCOBRA1 (Figure <ref type="figure">4A</ref>) or a scrambled sequence as a negative control. We then used streptavidin-coated beads to pull down lncCOBRA1, isolated the bound proteins, and performed mass spectrometry. We confirmed the efficacy of the pulldown by qPCR and found lncCOBRA1 was significantly (p-value &lt; 0.001; Wilcoxon t-test) enriched with probes antisense to lncCOBRA1 compared to the scrambled sequence control probes, indicating that the lncCOBRA1 probes are highly specific (Figure <ref type="figure">4A</ref>). Importantly, enrichment of lncCOBRA1 with the experimental probes was significantly (p-value &lt; 0.001; Wilcoxon t-test) depleted when ChIRP was performed in lnccobra1-2 null seedlings (Figure <ref type="figure">4A</ref>). As lncCOBRA1 contains two snoRNA domains, we also asked whether lncCOBRA1 directly interacted with rRNAs and found that lncCOBRA1 probes did not enrich for 5.8S rRNA, 18S rRNA, or 25S rRNA relative to scrambled sequence control probes (Supplementary Figure <ref type="figure">6A</ref>). 
This indicated that lncCOBRA1 does not interact with rRNA, further supporting the conclusion that the snoRNA domains within lncCOBRA1 do not function like canonical snoRNAs (Supplementary Figure <ref type="figure">6A</ref>).</p><p>After mass spectrometry, we set out to identify high-confidence interactors from the &#8764;2200 proteins identified (Supplementary Data Set 1A). To do so, we required that proteins be (1) identified in at least 2 biological replicates of the lncCOBRA1 pulldown in Col-0 plants (N = 469 proteins) (Supplementary Data Set 1B), (2) enriched with the lncCOBRA1 probes compared to scrambled sequence control probes (N = 206), and (3) enriched &gt; 2-fold in Col-0 compared to lnccobra1-2 seedlings (N = 74; Figure <ref type="figure">4B</ref>, red dots and Supplementary Figure <ref type="figure">6B</ref>). A total of 74 proteins were identified from these filtering steps. An additional 39 proteins were identified in at least 2 biological replicates in the lncCOBRA1 pulldown but absent from control pulldowns (scrambled or lnccobra1-2 background; Table <ref type="table">2</ref>). In total, 113 proteins were identified as high-confidence lncCOBRA1-interacting proteins that specifically bound to lncCOBRA1 in 5-day-old Col-0 seedlings. lncCOBRA1-interacting proteins were significantly enriched for proteins with the molecular function of RNA binding, and 37.5% (p-value &lt; 5.21 &#215; 10&#8315;&#8308;&#8304;; hypergeometric test) were demonstrated to bind to RNA in a recent study identifying the RNA-binding proteome of Arabidopsis leaves <ref type="bibr">(Bach-Pages et al., 2020)</ref>, supporting the claim that these proteins interact directly with lncCOBRA1 (Figure <ref type="figure">4C</ref> and Supplementary Figure <ref type="figure">6C</ref>). Those proteins not demonstrated to have RNA-binding capabilities may still interact with lncCOBRA1 indirectly. 
In addition, several proteins involved in transcription regulation were identified, including PUR ALPHA-1 (PUR&#945;), which has hypothesized roles in rRNA transcription (Table <ref type="table">3</ref>; <ref type="bibr">Tr&#233;mousaygue et al., 2003)</ref>.</p><p>lncCOBRA1-interacting proteins were involved in a wide-range of biological functions, including response to cytokinin and abscisic acid (ABA), gluconeogenesis, and photorespiration (Supplementary Figure <ref type="figure">6D</ref>; <ref type="bibr">Ran et al., 2020)</ref>. Additionally, lncCOBRA1-interacting proteins were enriched for proteins functioning in "structural constituents of the ribosome" and located in the cytoplasmic ribosome, chloroplasts, and the nucleolus (Figure <ref type="figure">4D</ref>). In fact, twelve of the lncCOBRA1-interacting proteins (10.6%; p-value &lt; 2.7 &#215; 10 -13 ; hypergeometric test; Table <ref type="table">4</ref>) were identified in a previous study identifying the nucleolar proteome <ref type="bibr">(Pendle et al., 2005)</ref>. The nucleolus is a non-membrane bound nuclear structure that is the site for ribosome assembly and maturation. Given the snoRNA domains in lncCOBRA1 and the identification of cytoplasmic ribosomal constituents bound to the nuclear localized lncCOBRA1, we hypothesize that lncCOBRA1 may be localized to the nucleolus. Among these RNA binding lncCOBRA1-interacting proteins is RNaseJ, which is the most enriched protein bound to lncCOBRA1 in Col-0 relative to lnccobra1-2 (Figure <ref type="figure">4B</ref>). RNaseJ is a metallo-beta-lactamase protein that possesses endo-and 5 -3 exonuclease activities in bacteria and chloroplasts within plants and is required for embryo and chloroplast development <ref type="bibr">(Halpert et al., 2019)</ref> with roles in rRNA maturation and 5 stability of mRNAs in bacteria <ref type="bibr">(Mathy et al., 2007)</ref>. 
This finding provides an additional connection between lncCOBRA1 and ribosome processing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>lncCOBRA1-Interacting Proteins Are Highly Interconnected</head><p>As proteins tend to act in complexes and lncCOBRA1-interacting proteins were enriched for proteins involved in protein binding (Figure <ref type="figure">4C</ref>), we next asked if there were known interactions among the 113 lncCOBRA1-interacting proteins (Figure <ref type="figure">4B</ref> and Table <ref type="table">2</ref>). Using STRING, we generated a protein-protein interaction (PPI) network which formed significantly (p-value &lt; 1.0 &#215; 10&#8315;&#185;&#8310;; STRING) more interactions than expected, indicating that lncCOBRA1-interacting proteins had more interactions among themselves than would be expected for a random set of proteins of a similar size from the Arabidopsis proteome (Supplementary Figure <ref type="figure">7A</ref>; <ref type="bibr">Szklarczyk et al., 2019)</ref>. Using k-means clustering, the proteins within the network were further grouped into 5 clusters (green, cyan, blue, red, and yellow) (Supplementary Figures <ref type="figure">7A,</ref><ref type="figure">B</ref>). Each cluster represented a distinct group of proteins, with cytokinin response-related and photosynthetic proteins, glycolytic proteins, and mRNA splicing-related proteins clustering together to form the green, cyan, and blue clusters, respectively <ref type="bibr">(Huang et al., 2009)</ref>. Of the five clusters, blue, green, and cyan were interlaced throughout the network and hard to distinguish from one another. The red cluster was the most spread out, lying on the periphery of the network with very little significant enrichment for biological processes or cellular compartments, indicating this cluster represents a variety of different proteins with a range of functions (Supplementary Figure <ref type="figure">7B</ref>). 
Within the red cluster lies the trihelix DNA binding transcription factor 6B-INTERACTING PROTEIN 1-LIKE (ASIL1) (Supplementary Figure <ref type="figure">7A</ref>), which was previously shown to be involved in repressing seed maturation genes during seed germination and seedling development <ref type="bibr">(Gao et al., 2009)</ref> and was also previously identified in the nucleolus (Table <ref type="table">2</ref>). Since numerous nuclear lincRNAs function in gene regulation by binding and directing transcription factors to the correct genomic loci, and ASIL1 regulates germination, which is misregulated in lnccobra1-1 and lnccobra1-2 plants, it is possible that lncCOBRA1 interacts with ASIL1 to affect seed maturation genes during seed germination and seedling development but further studies are required to test this.</p><p>A closer examination of the yellow cluster, which was the most compact group (Figure <ref type="figure">5A</ref>), revealed that this close network was enriched for proteins involved in ribosome biogenesis, rRNA processing, response to cytokinin, RNA binding, and constituents of the ribosome (Figures <ref type="figure">5B-D</ref> and Supplementary Figure <ref type="figure">7B</ref>). This cluster was also enriched for proteins localized in the nucleolus and ribosome (Figures <ref type="figure">5B-D</ref> and Supplementary Figure <ref type="figure">7B</ref>). A major node within the yellow cluster was RECEPTOR FOR ACTIVATED C KINASE 1A (RACK1A; encoded by ATARCA) (Figure <ref type="figure">5A</ref>). RACK1A is a major subunit of RACK1, which is a highly conserved scaffold protein present in all eukaryotic organisms studied, from Chlamydomonas to plants and humans <ref type="bibr">(Adams et al., 2011)</ref>. 
Several proteomics studies have identified a total of 293 proteins that interact with RACK1A <ref type="bibr">(Stark et al., 2006;</ref><ref type="bibr">Olejnik et al., 2011;</ref><ref type="bibr">Kundu et al., 2013;</ref><ref type="bibr">Speth et al., 2013;</ref><ref type="bibr">Cheng et al., 2015;</ref><ref type="bibr">Guo et al., 2019)</ref>, 40 of which (13.7%; p-value &lt; 2.1 &#215; 10&#8315;&#178;&#8312;; hypergeometric test) were identified in at least 2 biological replicates of lncCOBRA1 pulldown in Col-0 (Figure <ref type="figure">5E</ref>). This included RACK1B, another major subunit of RACK1 <ref type="bibr">(Guo and Chen, 2008)</ref>. Nearly 25% of the identified RACK1A-interacting proteins that were identified in ChIRP were specifically bound to lncCOBRA1 in Col-0 compared to lnccobra1-2 (N = 9; Figure <ref type="figure">4B</ref>, yellow dots), providing strong evidence that lncCOBRA1 interacts with RACK1A.</p></div>
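<div xmlns="http://www.tei-c.org/ns/1.0"><p>The overlap p-values reported above (for example, 40 of the 293 known RACK1A interactors recovered in the lncCOBRA1 pulldown) come from hypergeometric tests. The following is a minimal Python sketch of an upper-tail hypergeometric test. The function name and the toy numbers are illustrative assumptions, and because the background proteome size used in the study is not stated in this section, the sketch does not attempt to reproduce the published p-values.</p>

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """Return P(X >= observed) for a hypergeometric variable X.

    `draws` items are sampled without replacement from `population` items,
    of which `successes` belong to the category of interest.
    (Hypothetical helper, not the authors' code.)
    """
    max_overlap = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, max_overlap + 1)
    )
    return Fraction(tail, total)


# Toy numbers only (not from the study):
# 10 proteins in the background, 4 annotated, 3 pulled down, 2 overlapping.
toy_p = hypergeom_upper_tail(10, 4, 3, 2)
print(float(toy_p))
```

<p>Exact integer arithmetic is used here so that very small tail probabilities are not lost to rounding before the final conversion to a floating-point number.</p></div>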
<div xmlns="http://www.tei-c.org/ns/1.0"><head>DISCUSSION</head><p>In this study, we use genetic, biochemical, and proteomic analyses to describe a highly conserved, previously uncharacterized sno-lincRNA with functions in seed germination and development. We reveal that lncCOBRA1 is a &#8764;500-600 nt lincRNA with germination-, developmental-, and tissue-specific patterns of abundance, with high abundance early during seed germination that decreases as development progresses. Further, we demonstrate that loss of lncCOBRA1 results in delayed cotyledon emergence and overall smaller plants. We also demonstrate that lncCOBRA1 interacts with a wide variety of proteins, including many nucleolar proteins and scaffold proteins such as the highly conserved RACK1 subunit RACK1A, leading to an overall hypothesis that lncCOBRA1 acts as a scaffold to bring together proteins involved in several different processes to ultimately regulate plant germination and development.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Identification of Highly Conserved, Protein-Bound Nuclear lincRNAs From Transcriptome-Wide Analyses</head><p>Here, we describe a set of lincRNAs named CONSERVED IN BRASSICA RAPA 1-14 (lncCOBRA1-14) that were identified for their interactions with nuclear RBPs and sequence conservation in Brassica rapa (Supplementary Figure <ref type="figure">1A</ref>; <ref type="bibr">Gosai et al., 2015)</ref>. Of the 14 lncCOBRA transcripts profiled, 9 contained one or more annotated snoRNAs, revealing a previously unidentified class of lincRNAs containing snoRNAs (sno-lincRNAs) in Arabidopsis (Table <ref type="table">1</ref>). snoRNAs are a family of conserved nuclear small RNAs (70-200 nt) that are usually concentrated in the Cajal bodies or nucleolus. They traditionally function to modify rRNA or participate in the processing and maturation of ribosomal subunits, where binding of core nucleolar proteins protects the mature snoRNAs and aids in proper function <ref type="bibr">(Rodor et al., 2010)</ref>. Despite the two snoRNA domains in lncCOBRA1, we do not observe any function of lncCOBRA1 in rRNA processing (Supplementary Figure <ref type="figure">4E</ref>), similar to mammalian sno-lncRNAs described previously <ref type="bibr">(Yin et al., 2012)</ref>. We predict that the presence of snoRNA sequences in these lincRNAs likely results in their interaction with RBPs, as the annotated snoRNA domains overlap with the protein-bound sites identified previously, and snoRNA sequences are known to be highly protein-bound. Additionally, since snoRNAs are nuclear retained (Figure <ref type="figure">1</ref>), we predict that the snoRNA sequences contained in these lncCOBRA transcripts permit their nuclear retention, though future experiments are needed to test this hypothesis. Most lncCOBRA transcripts demonstrated specific patterns of abundance during seed germination. 
Interestingly, lncCOBRA lincRNAs that lacked snoRNA sequences demonstrated the least specificity in abundance patterns during germination (lncCOBRA8, 9, 13, and 14) (Supplementary Figure <ref type="figure">1B</ref>). Ultimately, this suggests that sno-lincRNAs may be important for germination in Arabidopsis, while conserved, protein-bound lincRNAs that lack snoRNAs may function in different biological processes.</p><p>In mammals, the majority of functional snoRNAs are encoded within introns and processed from excised and debranched introns by exonucleolytic trimming. Similarly, all identified mammalian sno-lncRNAs are generated from excised introns <ref type="bibr">(Xing and Chen, 2018)</ref>. While the snoRNAs identified in Arabidopsis appear to be homologs of yeast and animal counterparts, they are not encoded within introns but are instead primarily transcribed from intergenic regions as polycistronic gene clusters. As such, the lncCOBRA sno-lincRNAs described here are also transcribed from intergenic regions throughout the genome. Thus, lncCOBRA sno-lincRNAs represent a previously uncharacterized class of lincRNAs with potentially important biological functions that warrant future studies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Regulation of lncCOBRA1 Transcription</head><p>lncCOBRA1 contains several conserved elements within its promoter known to be present in the promoters of genes involved in ribosome biogenesis and translation. These include Telo-box and Site II elements (TeloSII) (Supplementary Figure <ref type="figure">3C</ref>). Interestingly, the Telo-box is known to be bound by the lncCOBRA1-interacting transcription factor PUR ALPHA-1 (PUR&#945;) (Tables <ref type="table">2,</ref><ref type="table">3</ref>; <ref type="bibr">Tremousaygue et al., 1999)</ref>. PUR&#945; is a homolog of the animal nuclear protein PUR ALPHA (PURA), which is a member of the sequence-specific single-stranded nucleic acid-binding Pur family of proteins. The amino acid sequence of PURA is extraordinarily conserved from bacteria through humans, where it functions as a transcriptional activator and as an RNA transport protein. While less is known about PUR&#945; in Arabidopsis, it was identified to be an RBP <ref type="bibr">(Bach-Pages et al., 2020)</ref> and was previously demonstrated to interact with TEOSINTE BRANCHED 1, CYCLOIDEA, PCF (TCP)-DOMAIN FAMILY PROTEIN 20 (TCP20) <ref type="bibr">(Tr&#233;mousaygue et al., 2003)</ref>. TCP20 also binds TeloSII elements and regulates expression of ribosomal protein genes <ref type="bibr">(Tr&#233;mousaygue et al., 2003)</ref>. In Arabidopsis, nearly all ribosomal protein genes and other genes involved in ribosome biogenesis and translation contain TeloSII elements in their promoters <ref type="bibr">(Gaspin et al., 2010)</ref>. This combined TeloSII element is found upstream of the TATA box and acts to coordinate expression of snoRNAs and ribosome biogenesis <ref type="bibr">(Qu et al., 2015)</ref>. Thus, the interaction between PUR&#945; and lncCOBRA1 suggests that lncCOBRA1 may bind PUR&#945; to regulate its own expression. 
Additionally, the presence of the TeloSII elements in the lncCOBRA1 promoter suggests that lncCOBRA1 may be expressed in a coordinated manner with ribosomal proteins, implicating it in ribosome biogenesis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>lncCOBRA1-Interacting Proteins May Mediate Germination Phenotype Observed in Mutants</head><p>RACK1 is a versatile scaffold protein that can bind to numerous signaling molecules from diverse signal transduction pathways <ref type="bibr">(Guo et al., 2007)</ref>. In Arabidopsis, RACK1 plays an important role in maintaining 60S ribosome biogenesis and 80S monosome assembly, as rack1a rack1b double mutants have a decrease in abundance of the 60S ribosomal subunit and 80S monosomes, but no differences in polysomes, suggesting a role for RACK1 in ribosome biogenesis <ref type="bibr">(Guo et al., 2011)</ref>. Since RACK1A interacts with ribosomal proteins, generally affects translation and responds to several hormones, this suggests that RACK1 has a dual role in signaling and translation, as observed previously for the RACK1 homolog in mammals <ref type="bibr">(Guo et al., 2011)</ref>.</p><p>Additionally, mutants in RACK1A had smaller rosette leaf size and delayed flowering and leaf development under short day conditions (8/16 h photoperiod) <ref type="bibr">(Chen et al., 2006)</ref>. When grown under long day conditions (16/8 h photoperiod), many of the strong phenotypes observed under short day were alleviated and rack1a plants grew at similar rates to wild type, but had slightly smaller rosette leaf size, a phenotype that was exacerbated when additional subunits of RACK1 were deleted <ref type="bibr">(Wang et al., 2019)</ref>. Overall, rack1a plants grown under long day conditions appear to phenocopy lnccobra1 mutants, suggesting a functional link between RACK1A and lncCOBRA1. 
Moreover, rack1a mutants were hypersensitive to ABA <ref type="bibr">(Chen et al., 2006;</ref><ref type="bibr">Guo et al., 2009</ref><ref type="bibr">, 2011)</ref> and insensitive to gibberellin (GA) <ref type="bibr">(Chen et al., 2006;</ref><ref type="bibr">Fennell et al., 2012)</ref>, suggesting a role of RACK1A in regulating seed germination and development. Ultimately, the hypersensitivity of rack1a to ABA suggests that RACK1A negatively regulates ABA-mediated seed germination and development.</p><p>The evidence of RACK1-lncCOBRA1 interaction (Figures <ref type="figure">4,</ref><ref type="figure">5</ref>), together with similarities in the phenotype of null mutants <ref type="bibr">(Chen et al., 2006;</ref><ref type="bibr">Guo et al., 2019)</ref> and in protein binding partners (Figures <ref type="figure">4,</ref><ref type="figure">5</ref>), further supports a functional link between RACK1A and lncCOBRA1 and raises the possibility that lncCOBRA1 functions with RACK1A as a scaffold to regulate plant germination and development. Though future studies are required, we propose a hypothesis that lncCOBRA1 is localized to the nucleolus, where it functions as a scaffold to interact with RACK1A and associated ribosomal proteins to affect ribosome biogenesis. In this model, disruption of lncCOBRA1 abundance would disrupt the RACK1 complex and its association with ribosomal proteins, resulting in decreased ribosome biogenesis and the phenotypes observed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RNase J Is the Most Highly Enriched Protein Bound to lncCOBRA1</head><p>The protein with the highest enrichment for lncCOBRA1 binding in Col-0 relative to lnccobra1-2 was RIBONUCLEASE J (RNASE J; RNJ) (Figure <ref type="figure">4B</ref>). RNJ encodes a metallo-beta-lactamase protein that possesses endo- and 5&#8242;-3&#8242; exonuclease activities in bacteria and in plant chloroplasts and is required for embryo and chloroplast development <ref type="bibr">(Halpert et al., 2019)</ref>. While RNase J plays important roles in rRNA maturation and 5&#8242; stability of mRNAs in bacteria <ref type="bibr">(Mathy et al., 2007)</ref>, it does not function in the cleavage of polycistronic rRNAs or mRNA precursors in Arabidopsis <ref type="bibr">(Sharwood et al., 2011)</ref>. Instead, loss of RNase J resulted in a massive accumulation of antisense RNAs, suggesting that RNase J is responsible for degradation of these RNAs, which are generated by the inability of chloroplast RNA polymerase to terminate transcription effectively. The antisense RNAs would otherwise form duplexes with sense strand transcripts and prevent translation <ref type="bibr">(Sharwood et al., 2011)</ref>. While RNase J is described to be chloroplast localized, it is also predicted to be located in the nucleus by computational predictions <ref type="bibr">(Kaundal et al., 2010)</ref>. Further, we previously identified a protein thought to be solely chloroplast localized in the nucleus <ref type="bibr">(Gosai et al., 2015)</ref>. Thus, it is possible that RNase J is in the nucleus, though this needs to be directly experimentally validated.</p><p>RNases are essential for non-coding RNA processing and each RNase can have a multitude of targets. 
For example, RNase P is an endoribonuclease that canonically functions to process the 5&#8242; termini of pre-tRNAs but can also cleave other tRNA-like structures in the 3&#8242; end of lncRNAs to form mature 3&#8242; ends <ref type="bibr">(Wilusz et al., 2008</ref><ref type="bibr">, 2011;</ref><ref type="bibr">Sunwoo et al., 2009)</ref>. Additionally, RNase mitochondrial RNA processing (MRP) was originally identified as an RNA-protein endoribonuclease that processes RNA primers of DNA replication in the mitochondria but is actually predominantly found in the nucleolus where it participates in pre-rRNA processing <ref type="bibr">(Lee et al., 1996)</ref>. Thus, it is possible that RNase J possesses functions in addition to those previously described, possibly mediated by its interaction with lncCOBRA1. Given its function in ribosome maturation in bacteria and the multiple functions of RNases on ncRNAs described previously, we posit that RNase J may have an additional function in sno-lincRNA processing in Arabidopsis, specifically the 3&#8242; end processing we observe for lncCOBRA1, but future studies will be required to support this hypothesis.</p><p>In total, using transcriptome-wide analyses we identified functional candidate lncRNAs based on sequence conservation and the presence of RBP binding sites. We further show that the loss of lncCOBRA1 results in growth phenotypes. While future studies are required, we provide evidence that lncCOBRA1 interacts with a plethora of proteins involved in many different processes. Overall, we hypothesize that lncCOBRA1 acts as a scaffold to bring together many different proteins to regulate normal biological processes, including ribosome biogenesis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>MATERIALS AND METHODS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Plant Materials and Growth Conditions</head><p>All plants were of the Columbia-0 ecotype and were grown in controlled chambers with a cycle of 16 h light and 8 h dark at 22 &#176;C. All seeds used for plate growth were sterilized in 100% ethanol for 1 min followed by a 10-min wash with 30% Clorox and 0.01% Tween-20 solution and rinsed five times with sterilized water. Seeds were then plated and grown on 1/2 MS agar plates with 1% sucrose and 0.8% Phytoblend and stratified by cold treating at 4 &#176;C for 48 h then placed in growth chambers with the parameters noted above.</p><p>lncCOBRA1 was previously referred to as AT1NC031460 in <ref type="bibr">Liu et al. (2012)</ref> and AT1G05913 in the Araport11 genome annotation. lnccobra1-1 (SALK_086689) was purchased from the Arabidopsis Biological Resource Center and backcrossed once to Col-0, segregated, and homozygous mutants obtained and validated by PCR. RT-qPCR was used to validate significant depletion in the abundance of lncCOBRA1.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CRISPR/Cas9 Plasmid Construction and Mutation Identification</head><p>To generate lnccobra1-2, the suite of plasmids designed for multiplexed CRISPR genome editing by <ref type="bibr">Lowder et al. (2015)</ref> were acquired from Addgene<ref type="foot">foot_2</ref> and used to generate Arabidopsis CRISPR-Cas9 transformation vectors <ref type="bibr">(Lowder et al., 2015)</ref>. Two different guide RNAs were designed using the CRISPRdirect website<ref type="foot">foot_3</ref> targeting AT1G05913. Because Cas9 was chosen to perform genome editing, 5&#8242;-NGG-3&#8242; was used as the protospacer adjacent motif (PAM) sequence requirement. The Arabidopsis thaliana TAIR10 genome was used to ensure the specificity of chosen guide RNAs. The first guide RNA (protospacer sequence: 5&#8242;-TATGATTTGATCATCATCGG-3&#8242;) is located approximately 50 base pairs upstream of the AT1G05913 transcription start site, and the second guide RNA (protospacer sequence: 5&#8242;-TATATGGCTCTGGAAGAGGG-3&#8242;) is located approximately 121 base pairs downstream of the AT1G05913 transcription start site. Complementary oligos were designed for each protospacer that contained overhangs compatible with the Arabidopsis U6 promoter driven guide RNA vectors designed by <ref type="bibr">Lowder et al. (2015)</ref> (vectors pYPQ131-pYPQ134) <ref type="bibr">(Lowder et al., 2015)</ref>.</p><p>To generate a CRISPR-Cas9 transformation vector containing two guide RNAs targeting AT1G05913, the cloning procedures provided by <ref type="bibr">Lowder et al. (2015)</ref> were followed <ref type="bibr">(Lowder et al., 2015)</ref>. Briefly, each protospacer sequence described above was annealed using complementary oligos to create a double stranded DNA fragment and then ligated into the vectors pYPQ131 and pYPQ132, respectively. pYPQ131 and pYPQ132 with correctly inserted protospacer sequences were used in a Golden Gate assembly reaction with pYPQ142 to generate a Gateway-compatible entry vector. 
The pYPQ142 vector with both guide RNAs correctly inserted, along with pYPQ154 carrying an Arabidopsis codon optimized Cas9, and pUBQ10:GW (Stock CD3-1947 from the Arabidopsis Biological Resource Center) were used in a Gateway LR reaction (Thermo Fisher Scientific; Carlsbad, CA, United States) to generate the final transformation vector. The final vector was transformed into wild type Arabidopsis thaliana (Col-0) using the floral dip method <ref type="bibr">(Clough and Bent, 1998)</ref>.</p><p>Successful transformants were selected using glufosinate-ammonium and allowed to set seed to acquire second generation transformants (T2). T2 plants were genotyped to test for a deletion in AT1G05913 using the PCR primers 5&#8242;-CGCTTGTTCAACTCCAAAAAG-3&#8242; and 5&#8242;-TTTTGGTATATAAGCTGATGGC-3&#8242;. A large band shift was detected in one T2 plant (wild type product size: 1,600 bp, observed product size: approximately 200 bp) (Supplementary Figure <ref type="figure">4A</ref>), and Sanger sequencing confirmed the deletion to be 1,325 bp. All primers are listed in Supplementary Data Set 2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Plasmid Construction and Generation of lncCOBRA1/lnccobra1-1</head><p>To generate lncCOBRA1 promoter: lncCOBRA1/lnccobra1-1, the entire 1509 bp between the two neighboring genes was amplified from Col-0 genomic DNA and cloned into BspEI and BstEII restriction enzyme sites of pCAMBIA3301. Transgenic plants were obtained and selected as previously described <ref type="bibr">(Zhang et al., 2006)</ref>. All primers are listed in Supplementary Data Set 2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RNA Extraction</head><p>RNA was extracted from the tissues denoted using a liquid nitrogen cooled mortar and pestle. Ground, frozen tissue was transferred to Qiazol lysis reagent (Qiagen; Valencia, CA, United States) and further homogenized using QIAshredders (Qiagen; Valencia, CA, United States). RNA was then isolated using the miRNeasy mini columns as described by the manufacturers' protocol (Qiagen; Valencia, CA, United States). Following elution from the miRNeasy column, RNA was treated with RNase-free DNase (Qiagen, Valencia, CA, United States) for 25 min at room temperature, ethanol precipitated and resuspended in nuclease-free water supplemented with 1.25% RNaseOUT (Life Technologies; Carlsbad, CA, United States).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RT-qPCR</head><p>All reverse transcription (RT) reactions were performed using SuperScript II following the manufacturer's instructions with 2.5 mM Random Hexamers (Qiagen; Valencia, CA, United States), 100 units SuperScript II and 30 units RNaseOUT (Invitrogen; Carlsbad, CA, United States) for 2 min at 25 &#176;C, 90 min at 42 &#176;C, and 5 min at 95 &#176;C, followed by a hold at 4 &#176;C. Before qPCR, cDNA was diluted 1:10 for all RT-qPCR reactions except for ChIRP, in which the RT reaction was diluted 1:5. qPCR was performed with 2X SYBR Green qPCR Master Mix with Rox #2 (Bimake; Houston, TX, United States), as follows per well: 10 &#181;L 2X SYBR Green Master Mix, 1.5 &#181;L cDNA (diluted 1:10), 0.4 &#181;L Rox #2, 2.1 &#181;L water, and 6 &#181;L combined 1.5 &#181;M forward and reverse primers. All qPCR reactions were performed in three technical replicates, all primers were tested using water to detect background signal, and melt curves were analyzed for a single peak. All qPCRs were run using the following program: 95 &#176;C for 10 min; 40 cycles of 95 </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Isolation of Nuclei Tagged in Specific Cell Types</head><p>To examine RNA abundance in nuclei and cytoplasmic fractions, seeds ubiquitously expressing a biotin ligase receptor peptide fusion protein that is targeted to the nuclear envelope (UBQ10:NTF/ACT2p:BirA Columbia-0 ecotype) were used <ref type="bibr">(Deal and </ref><ref type="bibr">Henikoff, 2010, 2011)</ref>. After 7 days, seedlings were collected, flash frozen in liquid nitrogen, and stored at -80 &#176;C for further processing. The isolation of nuclei tagged in specific cell types (INTACT) <ref type="bibr">(Deal and </ref><ref type="bibr">Henikoff, 2010, 2011)</ref> technique was used to isolate pure nuclear and cytoplasmic fractions, and RNA was extracted before RT and qPCR as described above.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Tissue Collection</head><p>For the germination time course, seedlings were collected 2, 3, 4, 5, 7, and 10 days after stratification, flash frozen in liquid nitrogen, and stored at -80 &#176;C for further processing. To examine the tissue specificity of lncCOBRA1 abundance, tissues from 5-week-old Col-0 plants were collected, flash frozen in liquid nitrogen, and stored at -80 &#176;C until processing. The sample of adult leaves included a mix of rosette leaves older than leaves 1-4, which were denoted juvenile leaves.</p></div>
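<div xmlns="http://www.tei-c.org/ns/1.0"><p>The per-well qPCR recipe listed above can be checked with a few lines of arithmetic. The Python sketch below sums the stated volumes and derives the final combined-primer concentration from them. The master-mix helper and its 10% overage are assumptions added for illustration and are not part of the published protocol.</p>

```python
from decimal import Decimal

# Per-well volumes in microlitres, as listed in the text.
PER_WELL_UL = {
    "2X SYBR Green Master Mix": Decimal("10"),
    "cDNA": Decimal("1.5"),
    "Rox #2": Decimal("0.4"),
    "water": Decimal("2.1"),
    "combined primers (1.5 uM)": Decimal("6"),
}

# Total reaction volume per well.
total_ul = sum(PER_WELL_UL.values())

# Final concentration of the combined primers after dilution into the well.
final_primer_um = PER_WELL_UL["combined primers (1.5 uM)"] * Decimal("1.5") / total_ul


def master_mix(n_wells, overage=Decimal("0.10")):
    """Scale every component except cDNA for n_wells.

    The 10% overage is an assumed pipetting allowance, not a value from the text.
    """
    factor = Decimal(n_wells) * (1 + overage)
    return {name: vol * factor for name, vol in PER_WELL_UL.items() if name != "cDNA"}


print(total_ul, final_primer_um)
```

<p>With the stated volumes, the reaction comes to 20 &#181;L per well and the combined primers are diluted to 0.45 &#181;M.</p></div>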
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Brassicaceae lncCOBRA1 Sequence Alignments</head><p>To identify putative sequence homologs of the AT1G05913 gene, the entire Arabidopsis cDNA sequence was used as the query in CoGeBlast<ref type="foot">foot_4</ref> with default parameters (E-value: 1e-5, Word size: 8, Gap Costs: Existence-5 Extension-2, Match/Mismatch Scores: 1, -2) against representative Brassicaceae species. The top hits for each species were selected based on E-value and quality score and used for subsequent sequence alignments. Selected sequences were aligned using Geneious Prime (Geneious | Bioinformatics Solutions for the Analysis of Molecular Sequence Data, 2019) with the Multiple Alignment tool, utilizing the Geneious Alignment default</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Germination</head><p>For germination experiments, seeds of Col-0, lnccobra1-1, lnccobra1-2, and lncCOBRA1/lnccobra1-1 were sterilized in 100% ethanol for 1 min followed by a 10-min wash with 30% Clorox and 0.01% Tween-20 and washed five times with sterilized water. Seeds were then plated on 1/2 MS agar plates with 1% sucrose and 0.8% Phytoblend, stratified by cold treating at 4 &#176;C for 48 h, and placed in growth chambers. Two days after transfer to growth chambers, the number of seeds that displayed cotyledons entirely emerged from the seed coat was counted. Plates were then allowed to grow for 3 more days and 5-day-old seedlings were collected to measure lncCOBRA1 abundance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Leaf Initiation Rate</head><p>Col-0, lnccobra1-1, lnccobra1-2, and lncCOBRA1/lnccobra1-1 were grown in soil as described above. Every day at &#8764;11 AM the presence of leaf primordia was examined. Leaf initiation was recorded when the leaf primordia were visible to the eye (&#8764;0.5 mm). After 3 weeks, plants were weighed for fresh weight measurements. To measure plant size, 3-week-old plants were taped flat on paper, scanned, and analyzed using ImageJ as follows. Scanned images were first converted to 8-bit and processed into a binary image such that any plant tissue was converted to white and background became black. Threshold was set using default settings, inverted, and the perimeter and area of the "particles" (plants) measured. The area of leaf 3 was selected by hand and measured.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Chromatin Isolation by RNA Purification</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Probe Design, Crosslinking and Chromatin Isolation</head><p>Chromatin isolation by RNA purification probes were designed using the Stellaris probe website<ref type="foot">foot_5</ref> with a 3 Biotin TEG. 5-dayold Col-0 and lnccobra1-2 seedlings were crosslinked in PBS with 1% formaldehyde (v/v) (Sigma-Aldrich; St. Louis, MO, United States) added and placed under vacuum for 10 min, followed by a 5-min quench with 125 mM Glycine under vacuum. Crosslinked tissue was then washed five times in distilled, deionized water, patted dry with paper towels, flash frozen in liquid nitrogen, and stored at -80 &#8226; C until further processing. Chromatin from 6 g of 5-day-old Col-0 and lnccobra1-2 crosslinked seedlings (3 g scrambled probes and 3 g lncCOBRA1 probes) was isolated as previously described <ref type="bibr">(Do et al., 2019)</ref>. All probes are listed in Supplementary Data Set 2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Bead Preparation</head><p>Pierce High Capacity Streptavadin Agarose beads (Thermo Fisher Scientific; Carlsbad, CA, United States) were first chemically treated to protect the streptavidin from tryptic proteolysis in preparation for mass spectrometry to reduce streptavidin signal as previously described <ref type="bibr">(Barshop et al., 2019)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Chromatin Isolation by RNA Purification</head><p>Chromatin isolation by RNA purification was performed as previously described <ref type="bibr">(Chu et al., 2011</ref><ref type="bibr">(Chu et al., , 2012</ref><ref type="bibr">(Chu et al., , 2015))</ref>, with several modifications. Modified Pierce High Capacity Streptavadin Agarose beads (Thermo Fisher Scientific; Carlsbad, CA, United States) were first washed twice and resuspended in nuclei lysis buffer (50 mM Tris-HCl pH = 8.0, 10 mM EDTA, 1% SDS) supplemented with cOmplete Protease Inhibitor Cocktail (Sigma, St Louis, MO, United States) and RNaseOUT (Invitrogen, Carlsbad, CA, United States). Chromatin lysates were then pre-cleared with 30 &#181;L modified beads for 30 min with mixing in a 37 &#8226; C hybridization oven with rotation. After pre-clearing, samples were centrifuged twice at 3000 RPM for 5 min at room temperature (RT) to thoroughly remove any beads, and 10% of the sample was removed for both RNA input and protein input. The lysates were then split into a scrambled and lncCOBRA1 probe sample and 2X Hybridization buffer was added (750 mM NaCl, 1% SDS, 50 mM Tris-HCl pH = 7.5, 1 mM EDTA, 15% Formamide) supplemented with PMSF (100 &#181;L/10 mL), RNaseOUT (5 &#181;L/10 mL; Invitrogen, Carlsbad, CA, United States), and cOmplete Protease Inhibitor Cocktail (Sigma, St Louis, MO, United States). 100 pmol of probes were then added per 1 mL chromatin (i.e., 1.67 &#181;L for each of the 6 probes used for lncCOBRA1) and the samples incubated in a 37 &#8226; C hybridization oven with rotation.</p><p>After 5 h, 100 &#181;L of modified beads were added to each tube and incubated in a 37 &#8226; C hybridization oven with rotation for another 2 h. 
Samples were then centrifuged for 5 min at 3000 RPM, the supernatant was removed, and the beads were resuspended in 1 mL wash buffer (2X SSC, 0.5% SDS) pre-warmed to 37 &#176;C and incubated in a 37 &#176;C hybridization oven with rotation for another 30 min. This wash was performed four times in total. After the last spin, samples were resuspended in 1 mL wash buffer; 150 &#181;L was removed for RNA isolation and the remaining 850 &#181;L was used for mass spectrometry.</p></div>
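<div xmlns="http://www.tei-c.org/ns/1.0"><p>The per-probe volume quoted in the hybridization step (1.67 &#181;L for each of 6 probes, for 100 pmol of total probe per 1 mL chromatin) can be checked with a short calculation. The Python sketch below is illustrative only: the function name is hypothetical, and the per-probe stock concentration of 10 pmol/&#181;L is an assumption inferred from the stated volume, not a value reported in the protocol.</p>
<p><code lang="python"><![CDATA[
```python
def volume_per_probe_ul(total_pmol_per_ml, lysate_ml, n_probes, stock_pmol_per_ul):
    """Volume (uL) of each probe stock when the total probe amount is split equally."""
    total_pmol = total_pmol_per_ml * lysate_ml      # total probe needed for this lysate
    pmol_per_probe = total_pmol / n_probes          # equal share per probe
    return pmol_per_probe / stock_pmol_per_ul       # pmol / (pmol per uL) = uL


# 100 pmol per mL chromatin, 1 mL lysate, 6 probes.
# ASSUMPTION: each probe stock is 10 pmol/uL (not stated in the source).
probe_ul = volume_per_probe_ul(100.0, 1.0, 6, 10.0)
print(round(probe_ul, 2))
```
]]></code></p>
<p>Under the assumed stock concentration, the result agrees with the 1.67 &#181;L per probe given in the text; a different stock concentration would scale the volume inversely.</p></div>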
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RNA Isolation</head><p>RNA isolation was performed using a modified version of a previously published protocol <ref type="bibr">(Desvoyes et al., 2018)</ref>. RNA samples were centrifuged at 3000 RPM for 5 min and resuspended in 400 &#181;L RNA proteinase K buffer (PK buffer; 100 mM NaCl, 10 mM Tris-HCl pH = 7.5, 1 mM EDTA, 0.5% SDS); 390 &#181;L PK buffer was added to the RNA input samples. To reverse crosslinks, NaCl was added to a final concentration of 200 mM (8 &#181;L of 5 M NaCl) and the samples were incubated at 65 &#176;C overnight. The following day, 16 &#181;L 1 M Tris-HCl pH = 6.8, 8 &#181;L 0.5 M EDTA, and 2 &#181;L proteinase K (Denville Scientific; Metuchen, NJ, United States) were added, and the samples were incubated at 37 &#176;C for 2 h with rotation to remove proteins. Samples were then added to 700 &#181;L Qiazol (Qiagen; Valencia, CA, United States), and RNA was isolated as described above. All primers are listed in Supplementary Data Set 2.</p></div>
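<div xmlns="http://www.tei-c.org/ns/1.0"><p>The NaCl adjustment for crosslink reversal can be verified as a simple mixing calculation. The Python sketch below uses only the volumes and concentrations given above (400 &#181;L PK buffer at 100 mM NaCl plus 8 &#181;L of 5 M NaCl); the function name is hypothetical, and any liquid carried over with the beads is ignored, which is a simplifying assumption.</p>
<p><code lang="python"><![CDATA[
```python
def final_concentration_mm(base_mm, base_ul, stock_mm, stock_ul):
    """Concentration (mM) after mixing two solutions of the same solute."""
    total_nmol = base_mm * base_ul + stock_mm * stock_ul  # mM * uL = nmol
    return total_nmol / (base_ul + stock_ul)              # nmol / uL = mM


# 400 uL PK buffer (100 mM NaCl) plus 8 uL of 5 M (5000 mM) NaCl.
nacl_mm = final_concentration_mm(100.0, 400.0, 5000.0, 8.0)
print(round(nacl_mm, 1))
```
]]></code></p>
<p>The computed value is about 196 mM, consistent with the nominal 200 mM final concentration stated in the protocol.</p></div>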
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ChIRP-MS qPCR Validation</head><p>qPCR was performed as described above, with the following exceptions. A standard curve for each primer set was generated using serial dilutions of genomic DNA. The "copy number" of each transcript was calculated from this curve and normalized by the average CT value of three technical replicates of U6 for each sample. The normalized values were then used to calculate fold enrichment relative to Col-0 input.</p></div>
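<div xmlns="http://www.tei-c.org/ns/1.0"><p>The standard-curve quantification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: all function names and all numeric values are hypothetical, the curve is assumed to be linear in log10 of template amount, and the U6 normalization follows a literal reading of the text (copy number divided by the mean U6 CT), which is an assumption about how the normalization was applied.</p>
<p><code lang="python"><![CDATA[
```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copy_number(ct, slope, intercept):
    """Invert the standard curve to estimate template quantity from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def normalized_copy_number(ct, u6_cts, slope, intercept):
    """ASSUMPTION (literal reading): copy number divided by the mean U6 Ct."""
    return copy_number(ct, slope, intercept) / (sum(u6_cts) / len(u6_cts))


def fold_enrichment(sample_value, input_value):
    """Normalized pulldown value relative to the normalized input value."""
    return sample_value / input_value


# Hypothetical dilution series (relative template amount) and Ct values.
slope, intercept = fit_standard_curve(
    [100000, 10000, 1000, 100], [20.0, 23.3, 26.6, 29.9]
)
pulldown = normalized_copy_number(22.0, [18.0, 18.1, 17.9], slope, intercept)
input_sample = normalized_copy_number(25.3, [18.0, 18.0, 18.0], slope, intercept)
enrichment = fold_enrichment(pulldown, input_sample)
```
]]></code></p>
<p>With these made-up numbers the pulldown Ct is one slope-unit (3.3 cycles) lower than the input Ct at the same U6 level, so the sketch yields roughly a tenfold enrichment.</p></div>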
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Mass Spectrometry Sample Preparation and Acquisition</head><p>Protein samples were centrifuged at 3000 RPM for 5 min, the supernatant was removed, and the beads were washed three times with 100 mM NH4HCO3 and finally resuspended in 400 &#181;L 100 mM NH4HCO3 supplemented with 200 mM NaCl and incubated overnight at 65 &#176;C to reverse crosslinks. The next day, the samples were flash frozen in liquid nitrogen and stored at -80 &#176;C until processing. Samples were thawed on ice and resuspended in an appropriate volume of the resuspension buffer [50 mM SDS and 50 mM triethylammonium bicarbonate (TEAB) final concentrations], reduced with 10 mM DTT (final concentration; US Biological, Salem, MA, United States) for 30 min at 30 &#176;C, and then alkylated with 50 mM iodoacetamide (final concentration; Sigma, St Louis, MO, United States) for 30 min at 30 &#176;C. The samples were processed using an S-Trap&#8482; column according to the protocol recommended by the supplier (Protifi; Farmingdale, NY, United States; C02-mini): they were loaded onto the column and digested with trypsin (Thermo Fisher Scientific; Carlsbad, CA, United States) at a 1:10 (w/w) enzyme/protein ratio for 1 h at 47 &#176;C.</p><p>Peptides eluted from this column were vacuum-dried and resuspended in LC-MS grade water containing 0.1% (v/v) TFA for mass spectrometry analysis. Each sample was analyzed on a Q-Exactive HF mass spectrometer (Thermo Fisher Scientific; Carlsbad, CA, United States) coupled to a Dionex Ultimate 3000 UHPLC system (Thermo Fisher Scientific; Carlsbad, CA, United States) equipped with an in-house made 15 cm long fused silica capillary column (75 &#181;m ID) packed with reversed-phase Repro-Sil Pur C18-AQ 2.4 &#181;m resin (Dr. Maisch GmbH; Ammerbuch, Germany).
Elution was performed using a gradient from 5 to 35% B (50 min), followed by 90% B (10 min), and re-equilibration from 90 to 5% B (5 min) with a flow rate of 400 nL/min (mobile phase A: water with 0.1% formic acid; mobile phase B: 80% acetonitrile with 0.1% formic acid). Data were acquired in data-dependent MS/MS mode. Full scan MS settings were: mass range 200-1500 m/z; resolution 120,000; MS1 AGC target 1E6; MS1 Maximum IT 100 ms. MS/MS settings were: resolution 30,000; AGC target 5E4; MS2 Maximum IT 200 ms; fragmentation by higher-energy collisional dissociation with stepped collision energy of 25, 27, 30; loop count top 15; isolation window 1.4; fixed first mass 120; MS2 Minimum AGC target 2E3; charge exclusion: unassigned, 1, 7, 8, and &gt;8; peptide match preferred; exclude isotope on; dynamic exclusion 45 s.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Mass Spectrometry Data Analysis</head><p>The acquired data were processed in Proteome Discoverer 2.4 using the default QExactive Precursor Quant and LFQ MPS with SequestHT and Percolator processing template and the Comprehensive Enhanced Annotation LFQ and Precursor Quant consensus template, with the following parameters. Peptide-spectrum matching was performed with SequestHT against contaminants.fasta from MaxQuant and the Arabidopsis thaliana FASTA (2019.04 release), with full tryptic digestion, a maximum of 3 missed cleavages, peptide length between 6 and 144, MS1 mass tolerance 10 ppm, MS2 mass tolerance 0.02 Da, dynamic modifications of oxidation on methionine and acetylation and methionine loss on the protein N-terminus, and a static modification of carbamidomethyl on cysteine. Protein inference and identification validation were performed with Percolator with a 1% false discovery rate (FDR) cutoff. Normalization was performed by total peptide amount and the scaling mode was set to "on all average". Protein abundance was calculated by summing the top three most abundant peptides for each protein.</p><p>Proteins were first filtered such that only proteins with abundance scores in at least two biological replicates of the pulldown with the lncCOBRA1 probes in the Col-0 background were considered. Protein abundance in the lncCOBRA1 pulldown was then normalized by the average protein abundance identified using the scrambled probes (lncCOBRA1/scrambled), and only proteins that were enriched with the lncCOBRA1 probes compared to the scrambled sequence probes were examined further (lncCOBRA1/scrambled &gt; 1). Enrichment of Col-0 over the lnccobra1-2 background was then calculated as log2[Col-0/lnccobra1-2], and proteins enriched over 1-fold were classified as lncCOBRA1-interacting and used for subsequent analyses.
We also examined proteins that were present in at least 2 biological replicates of the lncCOBRA1 pulldown in Col-0 tissue, but absent from the scrambled pulldown and present in 0 or 1 biological replicate of the lncCOBRA1 pulldown in the lnccobra1-2 background. Because these proteins had no abundance values in the scrambled pulldown, a fold enrichment could not be calculated for them.</p></div>
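<div xmlns="http://www.tei-c.org/ns/1.0"><p>The filtering logic described above can be summarized as a small decision function. The Python sketch below is a simplified illustration rather than the authors' pipeline: the function names and labels are hypothetical, missing values are represented as None, replicate abundances are averaged before ratios are taken, and the log2 cutoff is left as an explicit parameter because "enriched over 1-fold" can be read in more than one way. The case in which a protein is quantified in the scrambled pulldown but never detected in lnccobra1-2 is not described in the text and is simply filtered here.</p>
<p><code lang="python"><![CDATA[
```python
import math


def mean_detected(values):
    """Average the replicate abundances that were detected (None = missing)."""
    detected = [v for v in values if v is not None]
    return sum(detected) / len(detected) if detected else None


def classify(col0, scrambled, mutant, min_log2):
    """Label one protein from replicate abundances of the three pulldowns."""
    # Require abundance values in at least two Col-0 replicates.
    if sum(v is not None for v in col0) < 2:
        return "filtered", None
    col0_mean = mean_detected(col0)
    scr_mean = mean_detected(scrambled)
    mut_mean = mean_detected(mutant)
    if scr_mean is None:
        # Absent from scrambled: no ratio can be formed, so use presence only.
        n_mut = sum(v is not None for v in mutant)
        return ("presence_only" if n_mut <= 1 else "filtered"), None
    # Keep only proteins enriched over scrambled (ratio above 1).
    if col0_mean / scr_mean <= 1 or mut_mean is None:
        return "filtered", None
    log2_enrichment = math.log2(col0_mean / mut_mean)
    label = "interacting" if log2_enrichment > min_log2 else "filtered"
    return label, log2_enrichment


# Hypothetical abundances for three replicates per pulldown.
example = classify([8.0, 8.0, None], [2.0, 2.0, 2.0], [2.0, 2.0, 2.0], 1.0)
```
]]></code></p>
<p>In this made-up example the protein is fourfold more abundant in Col-0 than in lnccobra1-2, so its log2 enrichment is 2 and it passes an assumed cutoff of 1.</p></div>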
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Protein-Protein Interaction Network</head><p>STRING<ref type="foot">foot_6</ref> was used to generate the protein-protein interaction network at medium stringency, and the network was partitioned into five clusters using the k-means clustering algorithm provided by STRING. PPI enrichment was calculated by the STRING program <ref type="bibr">(Szklarczyk et al., 2019)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>SIGNIFICANCE STATEMENT</head><p>Long non-coding RNAs (lncRNAs) are an important yet understudied class of molecules in all eukaryotic organisms. While thousands of lncRNAs have been identified, only a handful have described functions. In this work, the authors describe a previously uncharacterized and highly conserved lncRNA that contains two snoRNA domains and affects Arabidopsis germination and growth. Overall, the authors describe sno-lncRNAs for the first time in a non-mammalian system and demonstrate novel mechanisms of lncRNA function in Arabidopsis.</p><p>The repository name and accession number can be found below: the ProteomeXchange Consortium via the PRIDE partner repository, with the dataset identifier PXD033707.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>May 2022 | Volume 13 | Article 906603</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1"><p>Frontiers in Plant Science | www.frontiersin.org</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_2"><p>https://www.addgene.org</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_3"><p>https://crispr.dbcls.jp</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_4"><p>https://genomevolution.org/CoGe/CoGeBlast.pl</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_5"><p>https://www.biosearchtech.com/support/tools/design-software/chirp-probedesigner</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_6"><p>https://string-db.org/cgi/input.pl?sessionId=QFOfEIHPOaYj&amp;input_page_show_search=on</p></note>
		</body>
		</text>
</TEI>
