<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Single-cell transcriptome atlases of soybean root and mature nodule reveal new regulatory programs that control the nodulation process</title></titleStmt>
			<publicationStmt>
				<publisher>CellPress</publisher>
				<date>08/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10553943</idno>
					<idno type="doi">10.1016/j.xplc.2024.100984</idno>
					<title level='j'>Plant Communications</title>
<idno>2590-3462</idno>
<biblScope unit="volume">5</biblScope>
<biblScope unit="issue">8</biblScope>					

					<author>Sergio Alan Cervantes-Pérez</author><author>Prince Zogli</author><author>Sahand Amini</author><author>Sandra Thibivilliers</author><author>Sutton Tennant</author><author>Md Sabbir Hossain</author><author>Hengping Xu</author><author>Ian Meyer</author><author>Akash Nooka</author><author>Pengchong Ma</author><author>Qiuming Yao</author><author>Michael J Naldrett</author><author>Andrew Farmer</author><author>Olivier Martin</author><author>Samik Bhattacharya</author><author>Jasper Kläver</author><author>Marc Libault</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[The soybean root system is complex. In addition to being composed of various cell types, the soybean root system includes the primary root, the lateral roots, and the nodule, an organ in which mutualistic symbiosis with N-fixing rhizobia occurs. A mature soybean root nodule is characterized by a central infection zone where atmospheric nitrogen is fixed and assimilated by the symbiont, resulting from the close cooperation between the plant cell and the bacteria. To date, the transcriptome of individual cells isolated from developing soybean nodules has been established, but the transcriptomic signatures of cells from the mature soybean nodule have not yet been characterized. Using single-nucleus RNA-seq and Molecular Cartography technologies, we precisely characterized the transcriptomic signature of soybean root and mature nodule cell types and revealed the co-existence of different sub-populations of B. diazoefficiensinfected cells in the mature soybean nodule, including those actively involved in nitrogen fixation and those engaged in senescence. Mining of the single-cell-resolution nodule transcriptome atlas and the associated gene co-expression network confirmed the role of known nodulation-related genes and identified new genes that control the nodulation process. For instance, we functionally characterized the role of GmFWL3, a plasma membrane microdomain-associated protein that controls rhizobial infection. Our study reveals the unique cellular complexity of the mature soybean nodule and helps redefine the concept of cell types when considering the infection zone of the soybean nodule.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>INTRODUCTION</head><p>Legumes engage in a mutualistic symbiotic interaction with rhizobia, a group of nitrogen-fixing soil bacteria (e.g., Bradyrhizobium diazoefficiens for soybean) <ref type="bibr">(Li et al., 2020)</ref>. This biological process, termed nodulation, has economic and ecological impacts on agriculture, as it helps to mitigate the application of nitrogen fertilizers, reduces environmental pollution, and supports sustainable agricultural practices. For most legume species (e.g., Medicago truncatula, Lotus japonicus, Glycine max, Phaseolus vulgaris, Pisum sativum), the nodulation process is initiated by rhizobial infection of the root hair cells. Concomitant with this infection, the root inner cortical cells engage in de novo cell divisions, leading to the formation of nodule primordia. Ultimately, a new root organ, the nodule, emerges, in which differentiated bacteria called bacteroids fix and assimilate atmospheric nitrogen for the plant <ref type="bibr">(Udvardi and Poole, 2013;</ref><ref type="bibr">De La Pen &#732;a et al., 2018)</ref>. On the basis of their organogenesis and cellular organization, mature legume nodules are generally classified into two types: indeterminate and determinate <ref type="bibr">(Brewin, 1991;</ref><ref type="bibr">Ferguson et al., 2010)</ref>. Mature indeterminate nodules (e.g., M. truncatula and P. sativum) can be divided into four biologically and microscopically distinct zones. In zone I, at the tip of the nodule, a permanent nodule meristem persists. In zone II, rhizobia infect the plant cells. Zone III is the nitrogen fixation zone, where bacteroids fix and assimilate atmospheric nitrogen for the plant. In zone IV, the nodule cells senesce. Unlike indeterminate nodules, mature determinate nodules (e.g., G. max, L. japonicus, and P. vulgaris) are not organized into visually distinct zones associated with different stages of interaction between plant cells and rhizobia. As a result, all the plant cells colonized by rhizobia are located in the center of the nodule, and, over its lifetime, the determinate nodule will senesce outwards from the center <ref type="bibr">(Puppo et al., 2005)</ref>. At the molecular level, senescing nodule cells are characterized by induction of the expression of a NAC/ CYP regulatory module and decreases in leghemoglobin content and nitrogenase activity <ref type="bibr">(Buono et al., 2019;</ref><ref type="bibr">Doll, 2023;</ref><ref type="bibr">Yu et al., 2023)</ref>.</p><p>During the past 20 years, numerous -omics studies of different legume species have led to the identification of many genes that control the nodulation process, including those involved in the symbiosis between infected nodule cells and bacteroids <ref type="bibr">(Libault et al., 2010a;</ref><ref type="bibr">Breakspear et al., 2014;</ref><ref type="bibr">Clarke et al., 2015;</ref><ref type="bibr">Veli ckovi c et al., 2018;</ref><ref type="bibr">Mergaert et al., 2020;</ref><ref type="bibr">Roy et al., 2020;</ref><ref type="bibr">Shimoda et al., 2020;</ref><ref type="bibr">Islam et al., 2022)</ref>. The emergence of high-throughput sequencing technologies has led to the construction of several legume transcriptomic atlases that have notably revealed differences in the transcriptomic profiles of the root and the nodule and have enabled the identification of hundreds of nodule-specific genes <ref type="bibr">(Benedito et al., 2008;</ref><ref type="bibr">Libault et al., 2010c;</ref><ref type="bibr">Severin et al., 2010;</ref><ref type="bibr">Verdier et al., 2013)</ref>. For example, the transcriptome of each of the four zones of the M. truncatula nodule was established using laser microdissection <ref type="bibr">(Roux et al., 2014)</ref>. This technology has recently been superseded by emerging single-cell and singlenucleus transcriptomic technologies (sc-and sNucRNA-seq) <ref type="bibr">(Denyer et al., 2019;</ref><ref type="bibr">Jean-Baptiste et al., 2019;</ref><ref type="bibr">Lee et al., 2019;</ref><ref type="bibr">Ryu et al., 2019;</ref><ref type="bibr">Wendrich et al., 2020;</ref><ref type="bibr">Liu et al., 2021;</ref><ref type="bibr">Xu et al., 2021;</ref><ref type="bibr">Shahan et al., 2022)</ref>. For instance, single-cell RNA-seq was recently used to capture the transcriptomic profile of indeterminate M. truncatula nodule cells and confirmed the zone-specific transcriptomic programs of infected nodule cells <ref type="bibr">(Ye et al., 2022)</ref>. A similar approach was recently applied to developing and maturing soybean nodules (i.e., 12-, 14-, and 21-days post rhizobial inoculation <ref type="bibr">[dpi]</ref>; <ref type="bibr">Liu et al., 2023;</ref><ref type="bibr">Sun et al., 2023)</ref>. Here, to complement these studies, we report the use of single-nucleus RNA-seq technology on soybean roots and 28-dpi nodules, which are transitioning to their senescence phase as reported by <ref type="bibr">Yu et al. (2023)</ref>.</p><p>Our study provides a new perspective on the cellular and molecular complexity of the infection zone of the soybean nodule. Specifically, we report that different transcriptomic programs are specifically activated in three different sub-populations of rhizobial-infected soybean cells: those not actively fixing atmospheric dinitrogen, those fixing and assimilating atmospheric nitrogen, and those already engaged in senescence. Single-cellresolution gene co-expression networks not only reveal interactions between known nodulation-related genes but also support the identification of new candidate genes that control the symbiosis between soybean cells and rhizobia. For instance, we provide functional evidence for the role of GmFWL3, a homolog of the plasma membrane microdomain-encoding gene GmFWL1 <ref type="bibr">(Thibivilliers et al., 2020a)</ref>, in control of soybean cell infection by B. diazoefficiens. Our study reveals the unique cellular complexity of the mature soybean nodule, enabling a deeper understanding of the molecular processes that govern nodulation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RESULTS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Single-cell-resolution transcriptome atlas of the soybean root</head><p>To establish the transcriptomic profile of each cell type in the soybean root, we applied sNucRNA-seq technology to three independent soybean root replicates (Supplemental Figure <ref type="figure">1</ref>). After independent processing to eliminate doublets and background contamination (see Supplemental Figure <ref type="figure">1A</ref>-1C), we found that the transcriptomes of the three root replicates were strongly correlated (Supplemental Figure <ref type="figure">1D</ref>). Subsequently, the replicates were integrated, followed by dimensional reduction (see methods). The single-nucleus root transcriptome atlas is composed of 14 369 high-quality nuclei, captures an average of 1949 unique molecular identifiers (UMIs) and 1363 expressed genes per nucleus, and covers the expression of 75.8% of the predicted soybean protein-coding genes (42 390 out of 55 897; Supplemental Table <ref type="table">1</ref>). This root atlas nicely overlaps with but also better covers the previously reported bulked transcriptomes of the soybean root and root tip, which identified 39 709 and 36 354 expressed genes, respectively <ref type="bibr">(Libault et al., 2010c) (i.e., 88.1% and 81</ref>.6% of the expressed genes identified using sNucRNA-seq were also identified in the root and root-tip bulk transcriptomes, respectively; Supplemental Figure <ref type="figure">2A</ref>). After application of the Uniform Manifold Approximation and Projection (UMAP) dimensional reduction technique, the root nuclei were distributed in 16 distinct cell clusters according to their transcriptomic profiles (Figure <ref type="figure">1A</ref>). Except for root hair cell cluster 3, which is characterized by slightly higher numbers of UMIs and expressed genes, the remaining 15 clusters have similar transcriptional activities (Figure <ref type="figure">1B</ref>). From the saturation of our sequencing, we estimated that the transcriptome coverage of each root cluster varies from 57.5% (i.e., cluster 15) to 99.1% (i.e., clusters 2, 8, and 9). Not surprisingly, this coverage depends on the size of the population of nuclei per cluster. Except for clusters 11 (123 nuclei) and 15 (89 nuclei), the transcriptomic coverage of the root clusters is greater than 85% (Supplemental Table <ref type="table">2</ref>). By examining the expression of the 55 897 protein-coding genes across the 16 root clusters, we identified 14 088 ubiquitously expressed genes (i.e., genes found expressed in all clusters). Among them, 2753 are constitutively expressed across the 16 clusters (i.e., less than a 4-fold change in activity between the clusters in which the gene shows the highest and lowest expression) (Supplemental Table <ref type="table">1</ref>). Applying very stringent criteria (i.e., fold-change &gt;20 between the two clusters in which the gene is most highly expressed; minimum expression of 0.1 UMIs for the gene considered/10 000 sequenced UMIs, expressed in at least 20% of nuclei in the cluster where the gene was found to be specifically expressed), we also identified 424 root cell-type marker genes (Supplemental Table <ref type="table">1</ref>). Inner/outer cortex Vascular endodermis Vascular bundle Infection cells Uninfected cells Inner/outer cortex Vascular endodermis Sclereid layer Vascular bundle Infection cells Uninfected cells Sclereid layer Senescing cells Inner/outer cortex Vascular Endodermis Sclereid layer Vascular bundle Infection cells A B C E F G H Transcript density (number of transcript per pixel) 7.63e-06 1.22e-04 1.95e-03 3.12e-02 10 bacterial genes Glyma.17g195900 I J Uninfected Cells D A B C D E F G H I J K A B C D E F G H I J K A K B C D E F G H I J A K B C D E F G H I J A B C D E F G H I J K Figure 2. Establishment of a single-nucleus transcriptome atlas of the soybean nodule. (A) UMAP plot of 7830 soybean nodule nuclei based on their transcriptomic profiles. The nuclei were clustered into 11 different groups (clusters A to K). (B) Distribution of the number of UMIs and expressed genes per nodule cluster (Tukey's test with p &lt; 0.05 reported to highlight differences between clusters).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>(legend continued on next page)</head><p>To functionally annotate the root clusters, we first used Molecular Cartography (M.C.) technology developed by Resolve Biosciences, a multiplexed and high-resolution RNA in situ hybridization technique, on a cross-section of the soybean root. Specifically, we analyzed the transcriptional activity of 52 soybean cluster-specific genes selected from the 16 root clusters in the morphological context of the soybean root. The transcriptional patterns of these genes led to annotation of the epidermal, cortical, endodermal, pericycle, cambial, xylem, and phloem clusters (Figure <ref type="figure">1C</ref> and 1E and Supplemental Figure <ref type="figure">3</ref>). We assume that the small population of nuclei in cluster 15 and its lower transcriptomic coverage compared with other root clusters (Supplemental Table <ref type="table">2</ref>) result from the enucleation of phloem cells during their maturation. To accurately annotate the root hair cell cluster, which is a difficult cell type to assess from our M.C. experiments because of its unique morphology and peripheral localization in the root, we also examined the activity of previously and newly functionally validated root-hairspecific genes (Figure <ref type="figure">1D</ref> and Supplemental Figure <ref type="figure">4</ref>). To annotate the soybean ''root cap'' and ''stem cell niche'' clusters, cell types not represented on the root M.C. crosssection, we analyzed the activity of soybean genes orthologous to M. truncatula and Arabidopsis thaliana marker genes (Figure <ref type="figure">1D</ref> and Supplemental Figure <ref type="figure">4</ref>; Supplemental Table <ref type="table">3</ref>). Using the same strategy, we also confirmed the identity of clusters 10 and 11 as ''endodermal'' clusters and refined the identity of cluster 13 as the ''pericycle'' cluster. We thus functionally annotated the 16 soybean root clusters and experimentally validated the expression of a large collection of new marker genes of soybean root cell types. By examining the distribution of the 424 single-cell-type marker genes across the 16 annotated soybean root clusters, we found that these genes were restricted to 9 clusters (i.e., root hair cluster 3, root cap 4, dividing cells 6, cortex 7, endodermis 10 and 11, xylem 14, phloem fiber 15, and phloem 16; Supplemental Table <ref type="table">1</ref>), likely reflecting the biological specialization of these cell types.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Single-cell-resolution transcriptome atlas of the soybean nodule</head><p>The soybean nodule and the physiology of its cells change over time, a consequence of its continuous development from initiation of the nodule primordium to senescence of the mature nodule. To complement recent single-cell transcriptomic studies performed on the developing stages of the soybean nodule (i.e., 12, 14, and 21 dpi) <ref type="bibr">(Liu et al., 2023;</ref><ref type="bibr">Sun et al., 2023)</ref>, we established a single-cell transcriptome atlas of 28-dpi mature soybean nodules. The 28-dpi nodule atlas was generated from two independent replicates (Supplemental Figure <ref type="figure">5</ref>). As we did for the ''root'' sample, we independently processed the ''nodule'' replicates to eliminate doublets and background contamination (see Supplemental Figure <ref type="figure">5A</ref> and 5B), confirmed the correlation between the two replicates (Supplemental Figure <ref type="figure">5C</ref>), and performed dimensional reduction upon integration (see methods). The nodule atlas is composed of 7830 nuclei with an average of 1058 UMIs and 647 expressed genes per cell and a total of 37 119 expressed genes ($66.4% of the soybean protein-coding genes) (Supplemental Table <ref type="table">1</ref>). Like the root datasets, the sNucRNA-seq datasets from the soybean nodules overlap well with a previously published soybean nodule bulk transcriptome <ref type="bibr">(Libault et al., 2010c)</ref> (i.e., 87.3% of the expressed genes identified using sNucRNA-seq technology were also identified in the nodule bulk transcriptome; Supplemental Figure <ref type="figure">2C</ref>). The UMAP of the nodule nuclei revealed 11 different cell clusters named A to K (Figure <ref type="figure">2A</ref>). Clusters F and G are characterized by significantly higher numbers of UMIs and expressed genes per nucleus compared with the other clusters, suggesting the higher transcriptomic activity of cells in these two clusters (Figure <ref type="figure">2B</ref>). The transcriptomic coverage of the nodule clusters varies from 34.4% (i.e., cluster D, which contains only 21 nuclei) to 98. <ref type="bibr">9% (i.e., clusters F and G)</ref>. Except for clusters D and C (79.1% coverage, 115 nuclei), the transcriptomic coverage of the remaining nodule clusters is greater than 85% (Supplemental Table <ref type="table">2</ref>). Using the same parameters described above, we identified 250 nodule-cluster marker genes and 950 ubiquitously expressed genes across the 11 nodule clusters, but only 16 constitutively expressed genes (Supplemental Table <ref type="table">1</ref>). Among the 16 constitutively expressed genes of the nodule, 8 genes encoding proteins with fundamental biological functions were also identified as constitutively expressed in the soybean root system (i.e., Glyma.17G073300 [SRPR protein that supports protein translation at the endoplasmic reticulum], Glyma.19G248000 [ankyrin repeat-containing protein], Glyma.09G156600 [small subunit ribosomal protein S3e], Glyma.07G091800, Glyma.18G222400 [spastin, a microtubulesevering protein], Glyma.12G056500 [AN1-type zinc finger protein], Glyma.13G212500 [proline-rich nuclear receptor coactivator], and Glyma.08G262900 [UV excision repair protein RAD23]). The low number of ubiquitously and constitutively expressed genes in the nodule results from the lower transcriptomic depth of cluster D, a consequence of the very small population of (C) Dotplot representation of the expression of 34 cell-type-specific marker genes of the soybean nodule validated using M.C. technology (Supplemental Figures <ref type="figure">6</ref> and <ref type="figure">7</ref>). (D) Dotplot representation of the expression of six nodule cluster I-specific marker genes. The dot sizes in (C) and (D) represent the percentage of cells in which each gene is expressed. For these two dotplot figures, the percentage of nuclei expressing the gene of interest (circle size) and the mean expression (circle color) of the gene are shown. (E-H) Integrated analysis of the expression of several soybean nodule marker genes using M.C. technology on a soybean nodule cross-section. (E) Detection of transcripts from the inner/outer cortical cells (blue) and the sclereid layer (pink). (F) Detection of transcripts from the vascular endodermis (light pink) and vascular bundle (orange). (G) Detection of B. diazoefficiens transcripts from nine different genes in infected nodule cells (yellow). (H) Detection of plant transcripts in B. diazoefficiens-infected (red) and uninfected cells (green; green arrows) (see methods for details). (I) Identification of the population of 968 rhizobia-infected (yellow circles) and 1769 uninfected cells (black stars) of the nodule through a principalcomponent analysis (PCA) plot of nodule cells analyzed by M.C. technology. To generate these plots, the transcript numbers of the 10 B. diazoefficiens genes were taken into consideration. (J) Violin plots of the density of the number of 10 different bacterial (left) and Glyma.17G195900 (right) transcripts in the population of 968 infected (yellow) and 1769 uninfected (gray) cells of the soybean nodule. A two-tailed Student's t-test of 0 supports the significant difference in expression of 10 bacterial genes and Glyma.17G195900 between B. diazoefficiens-infected and uninfected cells. (A) UMAP projection and integration of soybean nodule transcriptomes at single-cell resolution for nodules at 12 dpi <ref type="bibr">(Liu et al., 2023)</ref>, 14 dpi <ref type="bibr">(Sun et al., 2023)</ref>, 21 dpi <ref type="bibr">(Liu et al., 2023)</ref>, and 28 dpi (this study). The 17 clusters of this integrated UMAP (I to XVII) were functionally annotated on the basis of the (legend continued on next page) nuclei in this cluster (i.e., 21 nuclei). Therefore, we estimate that the 16 constitutively expressed nodule genes identified here are highly and stably expressed in all cell types of the nodule.</p><p>To annotate the 11 clusters, we applied M.C. technology to a cross-section of a mature nodule. After analysis of the spatial activity of 28 soybean genes (Figure <ref type="figure">2C</ref> and Supplemental Figure <ref type="figure">6</ref>), we annotated clusters A and B as the ''inner/outer cortical cell'' cluster (Figure <ref type="figure">2E</ref>, blue), cluster C as the ''vascular endodermis'' cluster (Figure <ref type="figure">2F</ref>, light pink), cluster D as the ''sclereid layer'' cluster (Figure <ref type="figure">2E</ref>, pink), and cluster E as the ''vascular bundle <ref type="bibr">'' cluster (Figure 2F,</ref><ref type="bibr">orange)</ref>. As with the root phloem cells, we assume that the very limited number of nuclei in cluster D and its low transcriptomic coverage result from enucleation of the cells in the sclereid layer of the nodule. Using M.C. technology with probes designed against nine B. diazoefficiens genes, we identified both infected cells (Figure <ref type="figure">2G</ref> and Supplemental Figure <ref type="figure">7</ref>) and uninfected cells in the infection zone of the nodule (Figure <ref type="figure">2G</ref>, white arrow). Our M.C. experiments also revealed the spatial activity of two soybean genes expressed in the uninfected cells (Glyma.06G235500 [tyrosine aminotransferase] and Glyma.06G002000 [encoding an MLO protein, an inhibitor of plant defense responses; <ref type="bibr">(Colebatch et al., 2004)</ref>]) (Figure <ref type="figure">2H</ref>, green arrows, and Supplemental Figure <ref type="figure">8</ref>) and four soybean genes expressed in the rhizobia-infected nodule cells (Glyma.17G195900 [casein kinase], Glyma.01G164600 [L-ascorbate peroxidase], Glyma.15G210100 [trehalose-6-phosphate synthase, which controls the biosynthesis and accumulation of trehalose, a carbon source for the symbiont], and Glyma.05G216000) (Figure <ref type="figure">2H</ref>, red, and Supplemental Figure <ref type="figure">8</ref>). Owing to its high transcriptional activity, we found that Glyma.17G195900 is an excellent marker for rhizobia-infected nodule cells <ref type="bibr">(Figure 2I and 2J)</ref>. Therefore, we named this gene RIM (i.e., Rhizobia-Infected Marker gene). The preferential activity of Glyma.06G235500 and Glyma.06G002000 in clusters J and K, in conjunction with the preferential expression of RIM, Glyma.01G164600, and Glyma.15G210100 in clusters F and G and Glyma.05G216000 in cluster H (Figure <ref type="figure">2C</ref>), led to the annotation of clusters J and K as ''uninfected cells'' and clusters F, G, and H as ''infected cells'' of the nodule. To further support these annotations, we analyzed the expression profiles of soybean genes that control the biosynthesis of purines and ureides (Supplemental Figure <ref type="figure">9A</ref>). We observed that rhizobiainfected cells of clusters F and G, and to a lesser extent cluster H, strongly express genes that control purine biosynthesis, whereas a ureide permease is most highly expressed in uninfected cells of the infection zone in clusters J and K (i.e., Glyma.02g116300). Our results nicely support the known compartmentation of the ureide biosynthesis pathway between infected and uninfected nodule cells <ref type="bibr">(Bergersen, 1965;</ref><ref type="bibr">Ohyama and Kumazawa, 1978;</ref><ref type="bibr">1979;</ref><ref type="bibr">Kouchi et al., 1988</ref><ref type="bibr">Kouchi et al., , 1990;;</ref><ref type="bibr">Tajima et al., 2004)</ref>. The functional annotation of clusters F, G, and H was further confirmed on the basis of expression of soybean genes orthologous to M. truncatula transcription factor (TF) genes with regulatory roles in the nodulation process (Supplemental Figure <ref type="figure">9B</ref>; Supplemental Table <ref type="table">4</ref>).</p><p>To annotate cluster I, we analyzed the biological functions of genes specifically and preferentially expressed in this cluster. Interestingly, we found that expression of NAC039 (Glyma.06G157400) and NAC018 (Glyma.04G208300) genes, marker genes of nodule senescence <ref type="bibr">(Yu et al., 2023)</ref>, was induced in cluster I (Supplemental Figure <ref type="figure">10A</ref>). We also identified several genes strongly and almost exclusively expressed in cluster I that are involved in oxidative stress response and nodule senescence. These included genes encoding a metallothionein (Fonseca-Garc&#305; &#180;a et al., 2022) (Glyma.18G180800), two CAP (Cysteine-rich secretory proteins Antigen 5 and Pathogenesis-related 1) proteins, proteins associated with soybean nodulation-related traits <ref type="bibr">(Zhu et al., 2019)</ref> (Glyma.15G062300 and Glyma.13G252600), a g-thionin antimicrobial protein <ref type="bibr">(Zasloff, 2002)</ref> (Glyma.16G100400), and two TCTPs (translationally controlled tumor proteins), which hamper programmed cell death <ref type="bibr">(Kiirika et al., 2014)</ref> (Glyma.09G044200 and Glyma.15G148900) (Figure <ref type="figure">2D</ref> and Supplemental Figure <ref type="figure">10B</ref>). Taken together, our data suggest that cluster I is composed of nuclei isolated from senescing nodule cells. This assumption is further supported by the decreased activity of leghemoglobin genes (Figure <ref type="figure">4D</ref>). Considering the nodule UMAP, we functionally annotated the 11 cell clusters (Figure <ref type="figure">2A</ref>), including clusters F, G, H, and I that are associated with infected nodule cells actively fixing atmospheric dinitrogen and those engaged in senescence.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Single-cell-type transcriptomic changes during nodule development</head><p>Recent studies have reported the transcriptome profiles of developing soybean nodules (i.e., 12, 14, and 21 dpi) at the single-cell level <ref type="bibr">(Liu et al., 2023;</ref><ref type="bibr">Sun et al., 2023)</ref>. To examine changes in the cellular complexity and transcriptomic profiles of soybean cells during nodule development, we processed (Supplemental Figure <ref type="figure">11</ref>) and integrated into a single UMAP the transcriptomes of single nuclei isolated from each of the four developmental time points (Figure <ref type="figure">3A</ref>). The 9248, 22 332, 14 000, and 7830 nuclei isolated from nodules at 12, 14, 21, and 28 dpi, respectively, were clustered into 17 groups in an integrated nodule UMAP (Figure <ref type="figure">3A</ref> and <ref type="figure">3B</ref>).</p><p>Hypothesizing that the expression of nodule cell-type marker genes should be conserved at least to some extent during nodule development, we looked at the expression patterns of 51 marker genes isolated from each of the eleven 28-dpi nodule clusters expression of 51 cell-type marker genes identified from the 28-dpi sNucRNA-seq datasets (i.e., p % 0.01; expression in R25% of the nuclei in the cluster under consideration; Supplemental Figure <ref type="figure">12</ref>). (B) Split UMAPs and distribution of the number of nuclei per cluster in percentages for each developmental stage of the nodule (i.e., 12, 14, 21, and 28 dpi). (C) Dotplot representation of the expression of the 51 nodule cell-type marker genes (Supplemental Figure <ref type="figure">12</ref>) at 4 developmental stages of the soybean nodule. The percentage of nuclei in the cluster expressing the gene of interest (circle diameter) and the mean of gene expression (circle color) are both displayed. (D) Principal-component analysis of the transcriptomes of the 17 clusters of the integrated soybean nodule UMAP and for each nodule developmental stage (12-, 14-, 21-, and 28-dpi). The transcriptomes of the cells infected by rhizobia (clusters XIII and XIV) are specifically labeled.</p><p>Single-cell RNA-seq atlas of soybean root organs (legend continued on next page) <ref type="bibr">(Figure 2A and Supplemental Figure 12)</ref> to functionally annotate the 17 clusters of the integrated nodule UMAP (Figure <ref type="figure">3C</ref>). This approach revealed co-expression of nodule-cell-type marker genes for many clusters (i.e., for the vascular endodermis [cluster IX], the sclereid layer [cluster X], the vascular bundle [cluster XII], the infected cells [cluster XIII, which is enriched in non-nitrogenfixing cells, and cluster XIV which is enriched in nitrogen-fixing and senescing cells] [Supplemental Figure <ref type="figure">13</ref>], and the uninfected cells in the infection zone of the nodule [i.e., clusters XVI and XVII, clusters J and K in the 28-dpi UMAP]) (Figure <ref type="figure">3C</ref>). On the other hand, clusters II to VIII share expression of marker genes of the inner and outer cortical nodule cells (Figure <ref type="figure">3C</ref>), suggesting more plasticity in the transcriptomic profile of the nodule cortical cells in and between the developmental stages of the nodule.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Plant Communications</head><p>To estimate the levels of conservation of the transcriptomes for each cell type during nodule development, we performed a principal-component analysis. Interestingly, this analysis revealed limited changes in the transcriptomes of cells infected by rhizobia (i.e., cluster XIII and XIV) during nodule development at 12-, 14-, and 21-dpi (Figure <ref type="figure">3D</ref>). Whereas PCA1 variance was driven by transcriptomic changes during nodule development, notably from 28-dpi nodules, PCA2 variance reflected transcriptomic differences between different cell types of the nodule. Interestingly, when exclusively considering the PCA1 dimension, we noted the absence of significant transcriptomic variance between the clusters representing the 12-, 14-, and 21-dpi nodules compared with the 28-dpi nodule clusters (Figure <ref type="figure">3D</ref>). This result suggests that only minor transcriptomic changes occurred in the nodule cell clusters from 12 to 21 dpi. However, at 28 dpi, we observed significant transcriptomic changes for each cell type of the soybean nodule, especially in the nitrogen-fixing and rhizobiainfected cells of cluster XIV, likely reflecting the physiological changes that occur at this developmental stage. Taken together, our analyses reveal similar transcriptomic profiles of 12-, 14-, and 21-dpi nodule cell types but drastic and global transcriptomic changes in 28-dpi nodules, supporting the focus of our study on 28-dpi soybean nodules to better capture the transcriptomic changes that occur in different populations of rhizobia-infected cells. This conclusion is further strengthened by the fact that 71% and 59% of the 1985 and 1359 nuclei of rhizobia-infected cell clusters XIII and XIV were identified from the 28-dpi nodule, respectively, and that over 51% of the cells expressing at least one of the five soybean leghemoglobin genes, and, a fortiori, actively fixing atmospheric nitrogen, were isolated from the 28dpi nodule. Accordingly, we performed a more comprehensive analysis of the transcriptome of infected nodule cells isolated from 28-dpi nodules.</p><p>The infection zone of the soybean nodule is composed of different populations of rhizobia-infected cells Our single-cell transcriptomic analysis revealed differences and similarities in the transcriptional profiles of different cell types in the root and nodule organs (Figures <ref type="figure">1A</ref> and <ref type="figure">2A</ref>). Analysis of a multidimensional scaling (MDS) plot revealed that most of the 28-dpi nodule cell clusters differ transcriptionally from root cell clusters (Figure <ref type="figure">4A</ref>), confirming the unique functions of the different cell types that comprise the nodule organ. The infected cell clusters F and G, as well as cells of the sclereid layer (cluster D), have the most unique transcriptomic profiles. Uninfected nodule cells (clusters J and K), another subpopulation of infected cells (cluster H), and senescing nodule cells (i.e., cluster I) are also characterized by specific transcriptomic profiles. This MDS analysis also demonstrated that the infected cells of clusters F and G differ transcriptionally from the infected cells of cluster H (Figure <ref type="figure">4A</ref>).</p><p>To further characterize these differences between infected nodule cells, we performed a comparative transcriptomic analysis of clusters F, G, and H (Figure <ref type="figure">4B</ref>). We identified only 253 differentially expressed genes (DEGs) between clusters F and G (Figure <ref type="figure">4B</ref>; Supplemental Table <ref type="table">5</ref>), confirming their transcriptional similarity. Among these DEGs, we identified several upregulated genes in cluster G that control the catabolism and metabolism of cyclic bglucans, bacterial carbohydrates that function in suppression of host defense, regulation of osmotic potential, and interaction with host membranes <ref type="bibr">(Bhagwat et al., 1999;</ref><ref type="bibr">Poole and Ledermann, 2022</ref>) (Supplemental Table <ref type="table">6</ref>). When comparing the activity of soybean genes between clusters F and G vs. H, we identified 3731 upregulated genes in clusters F and G (Supplemental Table <ref type="table">5</ref>) associated with ''RNA processing,'' ''vesicle-mediated transport,'' and ''chromatin modification'' functions (Supplemental Table <ref type="table">6</ref>), including genes orthologous to MtCCS52A, which controls the endoreduplication rate of infected cells during rhizobial differentiation <ref type="bibr">(Cebolla et al., 1999;</ref><ref type="bibr">Vinardell et al., 2003)</ref> (Figure <ref type="figure">4C</ref> and Supplemental Figure <ref type="figure">14</ref> for details; Supplemental Table <ref type="table">7</ref>). In cluster H, we identified 331 upregulated genes (Supplemental Table <ref type="table">5</ref>) associated with the ''nodulation'' process and ''biosynthesis of (C-G) Ridge plot distributions of the expression of soybean CCS52A (C) and leghemoglobin genes (D), as well as genes that control host-range restriction (E), nodule defense (F), and bacterial maturation (G) (x axis) as defined by <ref type="bibr">Roy et al. (2020)</ref> (Supplemental Table <ref type="table">4</ref>). The number of cells expressing the gene(s) in each cluster is represented on the y axis. (H) UMAP plot of 4368 Medicago nodule nuclei based on their transcriptomic profiles. The raw single-cell RNA-seq datasets were obtained from <ref type="bibr">Ye et al. (2022)</ref> and reprocessed before generating the UMAP (see methods). These nuclei were clustered into 8 different groups.</p><p>(I-M) Ridge plot distributions of the expression of Medicago CCS52A (I) and leghemoglobin genes (J), as well as genes that control host-range restriction (K), nodule defense (L), and bacterial maturation (M) (x axis) as defined by <ref type="bibr">Roy et al. (2020)</ref> (Supplemental Table <ref type="table">4</ref>). The number of cells expressing the gene(s) in each cluster is represented on the y axis.</p><p>(N) M.C. images of the expression of Glyma.05G203100 (green) and Glyma.17G195900 (red) genes used as markers of rhizobia-infected nodule cells. Glyma.05G203100 transcripts were specifically detected in a subset of infected nodule cells (right panel), in the cells of the vascular bundle (top-left panel), and in a sub-population of cortical cells (bottom-left panel).</p><p>(O) Dotplot representations of the expression of Glyma.05G203100. The percentage of nuclei in the cluster expressing the gene of interest (circle diameter) and the mean of gene expression (circle color) are both displayed. (P) Distribution of the nuclear area of rhizobia-infected cells of clusters F and G and cluster H and uninfected nodule cells. A two-tailed Student's t-test was used to estimate the significance of differences in nuclear area among these clusters. organonitrogen compounds'' (Supplemental Table <ref type="table">6</ref>). This result suggests that cluster H cells are involved in an active nitrogen fixation process. This conclusion is further confirmed by the transcriptional activity of four soybean leghemoglobin genes in cluster H (Figure <ref type="figure">4D</ref> and Supplemental Figure <ref type="figure">14</ref> for details; Supplemental Table <ref type="table">7</ref>) and the differential activity of soybean orthologs of legume genes that control various aspects of the legume-rhizobia symbiosis. For instance, clusters F and G are characterized by the preferential expression of genes that control host-range restriction and the plant defense system as defined previously <ref type="bibr">(Roy et al., 2020)</ref> (Figure <ref type="figure">4E</ref> and 4F and Supplemental Figure <ref type="figure">14</ref> for details; Supplemental Table <ref type="table">S7</ref>). Interestingly, genes that control bacterial maturation were most highly expressed in cluster G and, to a lesser extent, in cluster H (Figure <ref type="figure">4G</ref> and Supplemental Figure <ref type="figure">14</ref> for details; Supplemental Table <ref type="table">S7</ref>).</p><p>We hypothesized that the rhizobia-infected cells of clusters F and G and cluster H in the soybean nodule are physiologically and biologically similar to those of the infection zone and nitrogen fixation zone in the M. truncatula nodule (i.e., zones II and III, respectively). To verify this hypothesis, we analyzed the expression patterns of known nodulation-related genes in M. truncatula (Supplemental Table <ref type="table">S8</ref>). Accordingly, we first reanalyzed published single-cell RNA-seq datasets from the M. truncatula nodule (see methods for details; Figure <ref type="figure">4H</ref> and Supplemental Figure <ref type="figure">15A</ref>). We annotated cluster 4 and clusters 6 and 7 as the infection zone (i.e., zone II) and the nitrogen fixation zone (zone III) of the M. truncatula nodule on the basis of the expression of selected marker genes <ref type="bibr">(Ye et al., 2022)</ref> (Figure <ref type="figure">4H</ref> and Supplemental Figure <ref type="figure">15A</ref>; Supplemental Table <ref type="table">8</ref>). Cluster 5 seems to be composed of cells in a transitional stage between cluster 4 and clusters 6 and 7, as indicated by the expression of nodulation marker genes (Supplemental Figure <ref type="figure">15A</ref>). We examined the transcriptional activity of M. truncatula CCS52A, leghemoglobin genes, and genes characterized as regulators of host range, defense, and bacterial maturation responses <ref type="bibr">(Roy et al., 2020)</ref> and found that MtCCS52A was most broadly and highly expressed in cluster 4 and was expressed at a lower level in cluster 5 (Figure <ref type="figure">4I</ref> and Supplemental Figure <ref type="figure">15B</ref> for details; Supplemental Table <ref type="table">S7</ref>). On the other hand, leghemoglobin genes were most highly expressed in nitrogen-fixing clusters 6 and 7 (Figure <ref type="figure">4J</ref> and Supplemental Figure <ref type="figure">15B</ref> for details; Supplemental Table <ref type="table">S7</ref>). We also observed the preferential activity of host-range restriction and nodule defense genes in cluster 5 and found that genes that control bacterial maturation were preferentially expressed in cluster 6 and, to a lesser extent, in clusters 5 and 7 (Figure <ref type="figure">4K</ref>-4M and Supplemental Figure <ref type="figure">15B</ref> for details; Supplemental Table <ref type="table">7</ref>). The similar single-cell-type expression profiles of nodulation-related genes in nodule clusters of soybean (Figure <ref type="figure">4C-4G</ref>) and M. truncatula (Figure <ref type="figure">4I-4M</ref>) suggests that the sub-populations of clusters F and G and cluster H in soybean have similar biological functions to those described previously for the infection zone II (clusters 4 and 5) and the nitrogen fixation zone III of the M. truncatula nodule (clusters 6 and 7), respectively. Our single-cell transcriptomic analysis thus redefines current knowledge of cellular diversity in the population of infected cells in the mature soybean nodule.</p><p>Our results suggest that at least two distinct populations of cells are present in the infected zone of the mature soybean nodule: the infected but non-nitrogen-fixing cells of clusters F and G and the infected nitrogen-fixing cells of cluster H. However, our single-cell transcriptomic datasets do not enable us to determine whether these populations co-exist in the same nodule. To test this possibility, we looked for molecular markers of subpopulations of rhizobia-infected cells. Upon careful review of the spatial activity of soybean genes using M.C. technology, we determined that Glyma.05G203100, which encodes a DNA-repair DEK protein, is expressed in a subset of rhizobia-infected cells (Figure <ref type="figure">4N</ref>). According to the 28-dpi UMAP, Glyma. 05G203100 is most highly expressed in the nitrogen-fixing cluster H and, to a lesser extent, in the cortical and vascular bundle cells of the nodule (i.e., clusters A and D, respectively; Figure <ref type="figure">4O</ref>). The sNucRNA-seq transcriptional profile of Glyma.05G203100 was perfectly confirmed by M.C. (Figure <ref type="figure">4N</ref>). Confirmation of the expression pattern of Glyma.05G203100, a gene that shows relatively low expression in soybean nodule cells, by two independent technologies further supports the biological relevance of the soybean root and nodule sNucRNA-seq transcriptomes. Our results also support the existence of different sub-populations of rhizobia-infected soybean cells that differ in the differential expression of thousands of genes (Figure <ref type="figure">4B</ref>), including Glyma.05G203100 (Figure <ref type="figure">4N</ref> and 4O), leading to active endoreduplication and transcriptional activity in the cells of clusters F and G and active fixation of atmospheric nitrogen in the cells of cluster H. To gain further insight into their biology and confirm that the cells of clusters F and G are endoreduplicated, as suggested by GmCCS52a expression (Figure <ref type="figure">4C</ref>), we analyzed the nuclear area of the non-nitrogenfixing cells of clusters F and G, the infected nitrogen-fixing cells of cluster H, and the uninfected nodule cells by taking advantage of the DAPI staining of the nuclei used in M.C. Our results revealed that the nuclei of cells from clusters F and G and cluster H were significantly larger than those of uninfected nodule cells, reflecting their endoreduplication (Figure <ref type="figure">4P</ref>). We assume that the cells of clusters F and G must enter cycles of endoreduplication prior to becoming active nitrogen-fixing cluster-H cells. Taken together, our sNucRNA-seq and M.C. results support previously published spatial metabolomics observations that revealed the biochemical heterogeneity of the nodule infection zone <ref type="bibr">(Veli ckovi c et al., 2018;</ref><ref type="bibr">Agtuca et al., 2020)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Gene regulatory network analysis reveals intra-cluster heterogeneity within rhizobia-infected nodule cells</head><p>To further characterize the molecular heterogeneity between cell populations of the nodule infection zone and reveal the principal regulatory pathways that govern the biology of these cells, we inferred gene regulatory networks (GRNs) for clusters F and G and cluster H. Leveraging sNucRNA-seq data from 28-dpi nodules, we used the 2000 most differentially expressed genes between clusters F and G and cluster H, known SNF genes, and predicted soybean TF genes to infer GRNs (Figure <ref type="figure">5</ref>). We then identified the TFs with the largest number of targets (21 for the cluster H network, as hubs 20 and 21 had the same number of targets; Supplemental Table <ref type="table">S9</ref>) as the top 20 regulators for each network and visualized them alongside known SNFrelated genes (Figure <ref type="figure">5A</ref> and 5B and Supplemental Figure <ref type="figure">16</ref>) <ref type="bibr">(Roy et al., 2020)</ref>. Comparison of these two lists of regulators Plant Communications 5, 100984, August 12 2024 &#170; 2024 The Author(s).</p><p>revealed only three shared TFs (Supplemental Table <ref type="table">S9</ref>, green cells), further supporting the functional divergence between cells of clusters F and G and cells of cluster H in infected nodules (Figure <ref type="figure">4B</ref>). Notably, a larger number of target genes were under the control of the top 20 TF regulators in clusters F and G (median of 234.5 targets) than in cluster H (median of 147 targets), further demonstrating the higher transcriptional activity occurring in the infected cells of clusters F and G (Figure <ref type="figure">2B</ref>), likely through an increased involvement of TFs or transcriptional-level control. This inference drawn from contrasting target numbers is also supported by the most significantly enriched gene ontology term for clusters F and G compared with cluster H, namely ''RNA processing'' (Figure <ref type="figure">4B</ref>).</p><p>In the cluster F and G network, the top 20 regulators form a prominent central cluster (Figure <ref type="figure">5A</ref>, orange and blue nodes), in contrast to the cluster H network, in which the 21 regulators are split into two major and distinct groups (Figure <ref type="figure">5B</ref>, orange and blue nodes). This observation implies functional homogeneity among cells of clusters F and G but within-cluster heterogeneity among cells of cluster H. Notably, the F and G network exhibits an overrepresentation of NIN TFs, with 6 NIN genes identified among the top 100 regulators (Figure <ref type="figure">5A</ref>, large nodes), three of whichincluding an LjNIN ortholog (Figure <ref type="figure">5A</ref>, large blue node)-are among the top 20 regulators (Figure <ref type="figure">5A</ref>, large orange and blue nodes). Conversely, no NIN TFs are found among the top 100 regulators in the cluster H network (Figure <ref type="figure">5B</ref>). Prior investigations have revealed the role of the LjNIN protein as a primary regulator of LjNF-YA1 <ref type="bibr">(Soyano et al., 2013;</ref><ref type="bibr">Laffont et al., 2020)</ref>. Besides the three NINs among the top 20 regulators in clusters F and G, our network analysis also identified NF-YA10 (Glyma.10G082800) in this network (Figure <ref type="figure">5A</ref>). Consistent with its role during the nodulation process, NF-YA10 shows the highest expression in nodules, followed by soybean root hairs <ref type="bibr">(Yu et al., 2020)</ref>. Taken together, the presence of GmNF-YA10 and three NIN TFs among the top regulators of the F and G network suggests potential regulatory interactions among these TFs in cluster F and G cells. Although NF-YA10 is also among the top regulators of the cluster H network (Figure <ref type="figure">5B</ref>), lack of a central role for NIN TFs in network H likely implies a different regulatory pathway for NF-YA10 in F and G vs. H.</p><p>Another unique feature in the F and G network is the significant disparity between the number of target genes of the first (Glyma.07G128700; 1409 targets) and second regulators (Glyma.10G081700, 437 targets). This gap suggests a predominant and pivotal regulatory role for Glyma.07G128700 in cluster F and G cells. This gene encodes an Effector of Transcription 2 (ET2) protein that has been reported to regulate the expression of several KNAT TFs to control cell differentiation through regulation of the cell cycle <ref type="bibr">(Ivanov et al., 2008)</ref>. ET2 is also associated with DNA methylation and repair processes <ref type="bibr">(Tedeschi et al., 2019)</ref>, aligning well with another enriched gene ontology term, ''chromatin modification,'' in clusters F and G (Figure <ref type="figure">4B</ref>). Interestingly, we identified the KNOX family member Glyma.17G104800 among the top 20 regulators in the cluster F and G network. Glyma.17G104800 is orthologous to MtKNAT9, a known SNF-related TF in Medicago <ref type="bibr">(Di Giacomo et al., 2017)</ref>. Although Glyma.17G104800 also features among the top regulators in the cluster H network, it has a significantly lower number of targets (Supplemental Table <ref type="table">9</ref>), underscoring its more central role in clusters F and G than in cluster H. Supporting the central role of Glyma.17G104800 in clusters F and G, we identified three BEL1-like homeodomain 1/2 TF genes, which are homologous to Arabidopsis BLH1 or BLH2. In Arabidopsis, BLH1 works with AtKNAT3 to control organ development <ref type="bibr">(Kim et al., 2013)</ref>, and BLH2 affects the expression of multiple KNOX TFs <ref type="bibr">(Kumar et al., 2007;</ref><ref type="bibr">Jeon and Byrne, 2020)</ref>. Together, our data suggest that the identified NIN/NF-YA and BLH/KNAT regulons play a central role in the control of rhizobial infection processes in the F and G cell clusters of the soybean nodule. We hypothesize that, among these TF genes, Glyma.07G128700 plays a central role in regulating the NIN/NF-YA/KNAT/BLH network, either directly or indirectly, and potentially through chromatin modification and DNA methylation.</p><p>The topography of the cluster H network differs from that of the F and G network, in that the top regulators are split into two groups (Figure <ref type="figure">5B</ref>). Upon examining the functional categorization of targets in the major ''NIN'' regulatory group of the F and G network (Figure <ref type="figure">5A</ref>) and the two distinct regulatory groups of the H network (Figure <ref type="figure">5B</ref>), we observed that the ''autoregulation of nodule number'' category (i.e., orthologs of LjCLE-RS2, LjLSK1, MtCPK3, and MtRDN1, according to <ref type="bibr">Roy et al., 2020)</ref> was overrepresented in one group of the H network (highlighted in gray in Figure <ref type="figure">5B</ref>). In this same highlighted group, 12 genes were associated with plant stress response and the cell death program. These included GmNAC012/ NAC021/NAC036, which were annotated as Arabidopsis NTL9 (NAC Transcription factor-Like 9), a calmodulin-regulated transcriptional repressor linked with regulation of leaf senescence and defense response <ref type="bibr">(Yoon et al., 2008;</ref><ref type="bibr">Block et al., 2014)</ref>; Glyma.15G192000, a homolog of AtLSD, which controls the plant cell death pathway and immune response <ref type="bibr">(M&#8364; uhlenbock et al., 2008)</ref>; and GmNAC114, a homolog of AtSND2, which regulates cell wall organization <ref type="bibr">(Hussey et al., 2011)</ref> and is intricately linked with cell growth and death in various organs <ref type="bibr">(Dauphin et al., 2022)</ref>. Besides the four NAC TFs mentioned above (i.e., GmNAC012/021/036/114), GmNAC144 and GmNAC025 are homologous to AtSOG1, a TF that controls endocycling and the crosstalk between immune response and DNA damage <ref type="bibr">(Adachi et al., 2011;</ref><ref type="bibr">Yoshiyama et al., 2020)</ref>. This overrepresentation of NAC genes in one group of the H network (highlighted in gray in Figure <ref type="figure">5B</ref>) is supported by previous studies showing the roles of several legume NAC genes in nodule senescence. For instance, MtNAC969, which is phylogenetically close to Arabidopsis ANAC092 <ref type="bibr">(Wang et al., 2023b)</ref>, implicated in leaf senescence <ref type="bibr">(Guo et al., 2021)</ref>, is a regulator of nodule senescence in Medicago (de Ze &#180;licourt et al., 2012). Recently, LjNAC094 was also reported to regulate nodule senescence in (B) Simplified visualization of the inferred gene regulatory network for nodule cell cluster H. This network shows only interactions that involve the top 21 regulators (orange nodes, or blue if they are also SNF related) of this network and/or known SNF-related genes (green nodes, or blue if they are also among the top 21 regulators). In this network, a cluster of 12 regulators at the bottom of the network, including the top eight hubs of the list (see Supplemental Table <ref type="table">9</ref>), is highlighted in gray. Genes referenced in the main text and blue nodes are labeled by gene names and the remaining nodes by their soybean gene IDs. L. japonicus <ref type="bibr">(Wang et al., 2023b)</ref>. Therefore, these six NAC TFs are potential candidates for the regulation of nodule senescence in soybean. Finally, a few other TFs in this highlighted group have Arabidopsis orthologs that are associated with defense/immunity response (AtHSFB1 and AtWRKY40) or the cell cycle (AtALY3). These findings, together with the close transcriptional relationship between clusters H and I (senescing cells; Figure <ref type="figure">2A</ref>) support the hypothesis that cluster H cells, at least to some extent, are involved in autoregulation of nodule number via defense-response signaling, as well as regulatory pathways associated with the cell wall, the cell cycle/endocycling, and cell death.</p><p>Overall, this GRN analysis underscores the diversity within and between clusters of infected nodule cells, shedding light on novel candidates potentially implicated in various signaling/regulatory processes in soybean nodules, particularly in cells of the infection zone.</p><p>GmFWL3, like GmFWL1, controls infection of soybean nodule cells by B. diazoefficiens</p><p>GmFWL1, a member of the soybean plasma membrane microdomain-associated FWL family, has been identified as a microdomain-associated protein that controls chromatin accessibility and the infection of nodule cells by B. diazoefficiens <ref type="bibr">(Libault et al., 2010b;</ref><ref type="bibr">Qiao et al., 2017a)</ref>. Our single-cell RNAseq approach enabled us to revisit the expression pattern of GmFWL1 in mature soybean nodules, revealing its preferential expression in cells of rhizobia-infected clusters F, G, and H (Figure <ref type="figure">6A</ref> and <ref type="figure">6B</ref>). Given the over-representation of GO terms related to ''RNA processing'' and ''chromatin condensation'' in the pool of cluster F and G DEGs (Figure <ref type="figure">4C</ref>), our results suggest that the microdomain fraction of the plant plasma membrane controls the infection of nodule cells by rhizobia, potentially through changes in chromatin condensation <ref type="bibr">(Libault et al., 2010b)</ref>.</p><p>GmFWL3 is the only other member of the FWL family that shares a pattern of expression with GmFWL1 (i.e., GmFWL3 is not ex-pressed in the root [Figure <ref type="figure">6A</ref>] but is preferentially expressed in infected nodule cells) (Figure <ref type="figure">6B</ref>; clusters F, G, and H). Although MtFWL7, the M. truncatula ortholog of GmFWL1, was not expressed in infected nodule cells, MtFWL2, the M. truncatula ortholog of GmFWL3, was most highly expressed in Medicago nodule clusters 4 and 5, which were annotated as S. meliloti-infected and nitrogen-fixing cells (Figure <ref type="figure">6C</ref>). On the basis of these findings, we hypothesize that GmFWL3 encodes another microdomain-associated protein that regulates rhizobial infection of soybean cells. This hypothesis is supported by previous studies in which GmFWL1 and GmFWL3 proteins were localized to the symbiosome membrane <ref type="bibr">(Clarke et al., 2015;</ref><ref type="bibr">Qiao et al., 2017a)</ref>. To verify this hypothesis, we deleted the conserved PLAC8 domain of GmFWL3 by expressing two guide RNAs (i.e., GmFWL3T1 and T2) using CRISPR-Cas-mediated genome-editing technology (Supplemental Figure <ref type="figure">17</ref>). Nodule number was significantly lower on GFP-positive CAS9/ GmFWL3T1-T2 transgenic roots than on GFP-positive Cas9/ pAH595 control transgenic roots (Figure <ref type="figure">6D</ref> and 6E; Student's t-test, p = 0.009). At the cellular level, staining of bacteroids with green-fluorescent SYTO13 dye revealed significantly lower intensity of the fluorescent signal in CAS9/GmFWL3T1-T2 transgenic roots than in control roots (Figure <ref type="figure">6F</ref> and 6G; Student's ttest, p = 0.009). We concluded that microbial infection of nodule cells was impaired upon mutagenesis of GmFWL3.</p><p>To support the role of GmFWL3 as a microdomain-associated protein, we analyzed its cellular and subcellular localization by expressing N-and C-terminal translational fusions between GFP and GmFWL3 in tobacco leaf cells. Both N-and C-terminal GFP-GmFWL3 chimeric proteins were localized in puncta at the plasma membrane (Figure <ref type="figure">6H</ref>-6J, gray arrows, and Supplemental Figure <ref type="figure">18A-18I</ref>). Plasmolysis assays (Supplemental Figure <ref type="figure">18J-18R</ref>) and protoplast isolation experiments (Supplemental Figure <ref type="figure">18S</ref>-18U) provided further evidence of the punctate localization of GmFWL3-GFP at the plasma membrane. Given its punctate localization similar to that of GmFWL1 <ref type="bibr">(Libault et al., 2010b)</ref>, GmFWL3 is likely a plasma membrane microdomain-associated protein. However, in contrast to GmFWL1, which was exclusively localized at the plasma membrane, GmFWL3-GFP was also associated with the nucleus and the nuclear membrane (Figure <ref type="figure">6H</ref>, white arrows). This nuclear membrane localization was confirmed by colocalization of GmFWL3-GFP with the nuclear envelope marker CFP-AtSUN1 <ref type="bibr">(Graumann et al., 2010)</ref> in tobacco epidermal leaf cells  and by epifluorescent confocal microscopy in soybean root epidermal cells and root hairs . To further assess the subcellular localization of GmFWL3 in infected cells of the soybean nodule, cell types in which GmFWL3 is most highly expressed, we used high-resolution transmission electron microscopy (TEM) combined with immunogold labeling to examine transgenic soybean nodules expressing N-terminal and C-terminal myc-tagged GmFWL3 proteins under the control of the pFMV promoter. We observed localization of GmFWL3 at the plasma, symbiosome, vacuolar, and vesicular membranes and in the nuclei of infected nodule cells (white arrows, Figure <ref type="figure">6K-6V</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>GmFWL3 and GmFWL1 belong to the same large membrane microdomain protein complexes</head><p>We hypothesized that GmFWL3 is a microdomain-associated protein.</p><p>To verify this hypothesis, we performed coimmunoprecipitation assays on 30-dpi GFP-positive transgenic soybean nodules expressing N-and C-terminal myc-tagged GmFWL3 proteins to identify GmFWL3 interaction partners (Supplemental Figure <ref type="figure">20A</ref>). Across three independent biological replicates, a total of 321 proteins co-immunoprecipitated with N-terminal or C-terminal myc-tagged GmFWL3 proteins in at least two replicates but not with the myc-tag alone (Supplemental Table <ref type="table">10</ref>). Among these GmFWL3 interactors were five FWL/PLAC8 proteins including GmFWL1, six vacuolar ATPases, nine SPFH-domain microdomain-associated proteins (i.e., prohibitin/flotollin/remorin), 12 aquaporins, and 21 proteins with functions related to vesicle trafficking (Figure <ref type="figure">7A</ref>; Supplemental Table <ref type="table">10</ref>). We examined the transcriptional activity of the 321 genes encoding these proteins (Figures 7B) and found that 10 were preferentially expressed in clusters F, G, and H (fold-change R4 between the expression level in the most highly expressed F, G, and H cluster vs. the most highly expressed cluster in the remaining nodule clusters and the root clusters; Figure <ref type="figure">7C</ref>). These genes encode three FWL/PLAC8 proteins (including GmFWL1 and GmFWL3), three prohibitin/flotollin/ remorin proteins, which are well-characterized microdomainassociated proteins, one CASP-like protein, one sulfate transporter, one vesicle-associated membrane protein, and one receptor-like kinase 1. Among these 10 genes, 9 were co-expressed in clusters F and G, and, to a lesser extent, in cluster H (Figure <ref type="figure">7C</ref>, red characters, and Supplemental Figure <ref type="figure">20B</ref>). The co-expression of these genes in infected nodule cells and the interactions of their proteins with GmFWL3 suggest the formation of a quaternary protein structure localized in the microdomain fraction of biological membranes to control the symbiosis between soybean nodule cells and bacteroids. GmFWL1 and GmFWL3 proteins play a central and likely redundant role in formation of this protein complex, as revealed by integration of the lists of proteins co-immunoprecipitated with GmFWL3 and GmFWL1 <ref type="bibr">(Qiao et al., 2017a)</ref>. When comparing the proteins coimmunoprecipitated by the two FWL proteins, we found that 63 proteins that interacted with GmFWL3 also interacted with GmFWL1 <ref type="bibr">(Graumann et al., 2010</ref>) (Supplemental Table <ref type="table">10</ref>).</p><p>These included seven SPFH-domain microdomain-associated proteins, four vacuolar ATPases, 15 proton-ATPase/GTPases, four aquaporins, four receptor kinases, four integral membrane proteins, three proteins with transport activity, and three proteins associated with vesicle trafficking. Interestingly, among these shared binding partners was GmFLOT2/4 (Glyma.06G065600), the soybean ortholog of M. truncatula MtFLOT2 and MtFLOT4. This result shows that GmFWL1 and GmFWL3 proteins cooperate to support a network of microdomain-associated proteins to control infection of soybean nodule cells by B. diazoefficiens.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>DISCUSSION</head><p>The nodule is a root organ specialized for the fixation and assimilation of atmospheric nitrogen. This biological process is the product of the symbiotic interaction between nitrogen-fixing rhizobia and the plant. Microscopy observations have revealed differences in cellular organization between determinate and indeterminate nodules. Indeterminate nodules (e.g., M. truncatula) are organized into different zones that reflect differences in the developmental stages of the plant cells and their relationship with the symbiont. These zones include the active meristematic zone at the tip of the indeterminate nodule (zone I), the zone of infection by the bacteria (zone II), the nitrogen fixation zone (zone III), and the plant cell senescence zone (zone IV).</p><p>Recent molecular studies have revealed differences in transcriptomic and epigenomic profiles among the different zones of the indeterminate M. truncatula nodule <ref type="bibr">(Mergaert et al., 2020;</ref><ref type="bibr">Pecrix et al., 2022)</ref>. Such zonation does not exist in determinate nodules (e.g., soybean). Except for the central location of the senescing zone that emerges in 4-week-old nodules <ref type="bibr">(Yu et al., 2023)</ref> and the classification into infected and uninfected cells, there have been no reports of biologically different populations of infected cells, leading to the assumption that B. diazoefficiens-infected soybean cells are biologically homogenous <ref type="bibr">(Dupont et al., 2012)</ref>. This assumption has recently been challenged through the spatial profiling of over 100 metabolites using matrix-assisted laser desorption/ionization Fourier transform ion cyclotron resonance mass spectrometry (MS) imaging <ref type="bibr">(Stopka et al., 2017)</ref>. In our study, single-cellresolution transcriptomic analyses (Figure <ref type="figure">4</ref>) and M.C. experiments (Figure <ref type="figure">2</ref>) revealed cellular heterogeneity among rhizobia-infected cells in the infection zone of the soybean nodule. In addition to distinguishing uninfected (clusters J and K) and senescing cells (cluster I) on the basis of their transcriptomic profiles, we also observed different transcriptomic signatures between co-existing sub-populations of cells comprising the rhizobia-infected clusters F and G and cluster H (i.e., endoreduplication and strong transcriptomic activity in clusters F and G; strong induction of leghemoglobin gene expression in cluster H). This result suggests that only cells of cluster H are actively fixing atmospheric dinitrogen. Previous studies on 12-, 14-, and 21-dpi developing nodules did not reveal this cellular heterogeneity among B. diazoefficiens-infected cells. Specifically, our comparative analysis of gene expression in clusters at different nodule developmental stages revealed that the transcriptomic profiles of 12-, 14-, and 21-dpi nodule cell types were similar (Figure <ref type="figure">3D</ref>). However, drastic and global transcriptomic changes occurred in 28-dpi nodules, especially in B. diazoefficiens-infected cells (Figure <ref type="figure">3D</ref>). Therefore, we conclude that unique transcriptomic programs are activated in different sub-populations of B. diazoefficiens-infected cells later in nodule development.</p><p>By identifying different sub-populations of rhizobia-infected soybean cells, our study enables us to reconsider the definition of a plant cell type based not only on morphological or physiological differences but also on molecular attributes (i.e., gene expression) and the nature of environmental interactions (in this case, with rhizobia). From a functional perspective, our data suggest that a subset of rhizobia-infected cells are actively engaged in fixing atmospheric nitrogen (cluster H cells), whereas others are engaged in events of endoreduplication (cluster F and G cells). Our M.C. data suggest that these two populations of cells co-exist in the same nodule. For instance, the differential expression of Glyma.05G203100 between cluster H and clusters F and G (Figure <ref type="figure">4O</ref>) was nicely confirmed by M.C. with the detection of Glyma.05G203100 transcripts in a small population of the infected nodule cells (Figure <ref type="figure">4N</ref>), suggesting the coexistence of two populations of rhizobia-infected cells in the same nodule. These two populations share similar endoreduplication rates (Figure <ref type="figure">4P</ref>), supporting the fact that 4C endoreduplication of the soybean nodule cells is a prerequisite for rhizobial infection, as suggested previously <ref type="bibr">(Fan et al., 2022)</ref>. However, the physiology of these two populations (A) Donut chart of the distribution of putative GmFWL3 interaction partners according to their biological functions. (B) Heatmap representation of the expression of genes encoding the 321 proteins proposed to interact with GmFWL3. Gene expression is displayed for each of the 16 soybean root clusters (1-16; Figure <ref type="figure">1A</ref>) and 11 28-dpi soybean nodules (A-K; Figure <ref type="figure">2A</ref>). The set of genes highlighted in the red dashed square are preferentially expressed in clusters F, G, and H. (C) Dotplot representation of the expression of 10 soybean genes that interact with GmFWL3, including GmFWL3 itself. Nine are preferentially expressed in clusters F, G, and H (Supplemental Figure <ref type="figure">20</ref>). The percentage of nuclei expressing the gene of interest (circle size) and the mean expression (circle color) of the genes are shown. differs. We found that cells of the F and G clusters are characterized by active endoreduplication (Figure <ref type="figure">4B</ref> and <ref type="figure">4C</ref>) and higher transcriptional activity (Figure <ref type="figure">2B</ref>) but do not actively express leghemoglobin genes, suggesting that they are not actively fixing atmospheric nitrogen (Figure <ref type="figure">4D</ref>). The putative predominant role of the ET2 TF (Glyma.07G128700) in cells of clusters F and G, as reflected by the large number of predicted targets in our network (Figure <ref type="figure">5A</ref>), supports the presence of intense and broad transcriptional activity in cells of the F and G clusters. By contrast, and despite their endoreduplicated nature, cells of the H cluster are characterized by lower transcriptional activity and high nitrogen-fixing activity (Figure <ref type="figure">4B</ref> and <ref type="figure">4D</ref>). We hypothesize that the major function of F and G cells is to create the conditions for successful nitrogen fixation by generating a large pool of transcripts. Then, when entering active nitrogen fixation (cluster H), the infected cells will capitalize on this large pool of transcripts to maximize protein translation and metabolism, a requirement for active nitrogen fixation. The level of stress induced by the symbiosis (e.g., superoxide radicals and reactive oxygen species <ref type="bibr">[Dalton et al., 1991;</ref><ref type="bibr">Puppo et al., 1991;</ref><ref type="bibr">Davies and Puppo, 1992;</ref><ref type="bibr">Moreau et al., 1996;</ref><ref type="bibr">Puppo et al., 2005]</ref>), as reflected in the identification of multiple NAC TFs in the gene network of cluster H (Figure <ref type="figure">5B</ref>), ultimately leads to senescence of the infected cells (cluster I). We assume that these senescing cells belong to the central senescing zone of the soybean nodule that emerges as early as 4 weeks after bacterial inoculation <ref type="bibr">(Yu et al., 2023)</ref>. Thus, based on our knowledge, our study establishes for the first time the transcriptomes of different co-existing populations of rhizobiainfected soybean cells, including those engaged in senescence. Such cellular complexity of a determinate nodule is similar to that reported for the indeterminate nodule, but the latter is divided into various zones of infected cells (i.e., the rhizobia-infected zone II, the nitrogen fixation zone III, and the senescing zone IV).</p><p>Although soybean and Medicago nodules differ in morphology, we hypothesize that the soybean nodule is, from a cellular composition point of view, similar to the M. truncatula nodule. It is composed of infected but non-nitrogen-fixing cells (clusters F and G of the soybean nodule; zone II of the Medicago nodule), nitrogen-fixing cells (cluster H of the soybean nodule; zone III of the Medicago nodule), and senescing cells (cluster I of the soybean nodule; zone IV of the Medicago nodule). Our data (Figure <ref type="figure">3</ref>) and those of <ref type="bibr">Yu et al. (2023)</ref> support the coexistence of these cells. We hypothesize that the purpose of the co-existence of these different populations is to ensure a steady supply of nitrogen to the plant by balancing the energy cost associated with nitrogen fixation with the impact of nitrogen fixation on plant cell viability (i.e., nitrogen fixation triggers the formation of various oxidizing species).</p><p>The single-nucleus transcriptome atlases of the root and nodule also help us to refine the identification of genes that control biological processes. For instance, in addition to their critical roles in the infection of plant cells by different types of symbiotic and pathogenic microbes, including bacteria, fungi, and viruses, microdomain-associated proteins are also central to control of the nodulation process (e.g., flotillins, remorins, and FWL protein-coding genes have previously been reported to control legume nodulation) <ref type="bibr">(Haney and Long, 2010;</ref><ref type="bibr">Thibivilliers et al., 2020a;</ref><ref type="bibr">Yu, 2020)</ref>. Here, we identified a new microdomainassociated protein-coding gene, GmFWL3, that is preferentially expressed in infected nodule cells <ref type="bibr">(i.e., clusters F, G, and H)</ref>. Although the expression profiles of GmFWL1 and GmFWL3 overlap (Figure <ref type="figure">6B</ref>), the subcellular localization of the GmFWL3 protein in infected nodule cells is broader than that reported previously for GmFWL1 <ref type="bibr">(Libault et al., 2010b)</ref>. Specifically, in addition to its localization at the plasma membrane of infected nodule cells, GmFWL3 was also found in the symbiosome membrane, as reported previously <ref type="bibr">(Clarke et al., 2015)</ref>. This observation, together with the significant decrease in bacterial infection of nodule cells upon GmFWL3 knockout, suggests that the microdomain fraction of the symbiosome membrane plays a critical role in the communication between plant cells and bacteroids. In addition to this first role, we assume that GmFWL3 might also have another symbiotic function that requires its high expression in infected cells of the soybean nodule. Specifically, considering previous reports that microdomain-associated proteins regulate the transport of different biochemical compounds such as sugars and auxin <ref type="bibr">(Teale et al., 2006;</ref><ref type="bibr">K re cek et al., 2009)</ref>, we hypothesize that GmFWL3 contributes to nutrient transport between bacteroids and plant cells throughout the lifetime of the symbiosis. The identification of other microdomain-associated proteins that interact with GmFWL3, including GmFWL1, and their coexpression in the same cell type of the nodule, support their interaction and the formation of a protein network in the microdomain fraction of biological membranes of the rhizobiainfected cells. Our findings highlight the ability of single-cell genomics to dissect complex biological processes, such as refining the biological concept of ''cell type'' by including highresolution molecular attributes in this definition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>METHODS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Bacterial culture</head><p>Escherichia coli, Agrobacterium rhizogenes (K599), and Agrobacterium tumefaciens (GV3101) strains were grown in LB medium supplemented with appropriate antibiotics at 37 C for E. coli and 30 C for Agrobacterium strains. B. diazoefficiens USDA110 was grown in HM medium supplemented with 50 mg/mL chloramphenicol. B. diazoefficiens cultures were grown for 3 days and pelleted at 4000 g for 10 min, then washed and diluted to an OD 600nm of 0.1 in nutritive NPNS solution for inoculation <ref type="bibr">(Broughton and Dilworth, 1971)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Plant material</head><p>Soybean (Glycine max Williams 82) seeds were sterilized as described previously <ref type="bibr">(Pingault et al., 2018)</ref>. The seeds were then placed on agar B&amp;D medium in the absence of nitrogen and germinated in a growth chamber. The roots of 6-day-old plants (i.e., the meristematic, elongation, and maturation zones of the root, which include emerging and fully elongated root hair cells) were collected and processed to generate the sNucRNA-seq libraries. To isolate mature nodules, 3-dayold seedlings were germinated on B&amp;D agar medium without nitrogen and inoculated with a suspension of B. diazoefficiens USDA 110 (OD 600nm = 0.1). After 72 h of incubation in the dark, the seedlings were transferred to a vermiculite:perlite mix (3:1) and grown in the growth chamber (16 h light/8 h dark) at 20 C-26 C. Mature nodules were collected 28 days after bacterial inoculation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Nucleus isolation, sNucRNA-seq library preparation, and sequencing</head><p>For nucleus isolation, roots and nodules were chopped and passed through 30-and 40-mm cell strainers as described previously <ref type="bibr">(Thibivilliers et al., 2020b)</ref>. The filtered nuclei were purified by cell sorting using a FACS Aria II 603 cell sorter (BD Biosciences). The three sNucRNA-seq root libraries and two nodule libraries were constructed following the protocol of the Chromium Single Cell 3 0 Library &amp; Gel Bead Kit v3.1 (10X Genomics). Sequencing of single-indexed, paired-end libraries was performed on the Illumina NovaSeq 6000 platform according to the 10X Genomics recommendations (see Supplemental Table <ref type="table">11</ref> for detailed information).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Cloning and molecular constructs</head><p>Conventional cloning procedures All constructs were generated using either classical restriction-enzyme ligation or Gateway cloning strategies (<ref type="url">www.lifetechnologies.com</ref>). For Gateway cloning, GmFWL3 cDNA and FWL3-GFP were cloned into the pDONR-Zeo vector by the BP clonase reaction. Constructs were sequenced to confirm the integrity of the cloned genes. The GmFWL3-pDONR plasmid was used with the pMDC43 and pMDC83 destination vectors in an LR reaction to generate p35S::GFP-FWL3 and p35S::FWL3-GFP translational fusion constructs. The p35S::CFP-AtSUN1 construct was described previously <ref type="bibr">(Graumann et al., 2010)</ref>.</p><p>To generate myc-tagged chimeric proteins, AttR1-CmR-ccdb-AttR2-10xMyc and 10xMyc-AttR1-CmR-ccdB-AttR2 cassettes were amplified from pGWB20 and pGWB21 <ref type="bibr">(Nakagawa et al., 2007)</ref>, respectively, and cloned in place of the HA tag in the CGT3304 plasmid using BamHI and Eco53kI restriction sites. After DNA sequence confirmation, the resulting promoter FMV::AttR1-CmR-ccdb-AttR2-10xmyc-tnos and pFMV:10xmyc-AttR1-CmR-ccdB-AttR2-tnos cassettes were excised by SbfI restriction enzyme digestion and ligated into AKK1467B at the SbfI restriction site, thereby creating AKK1467B-10myc-GW and AKK1467B-GW-10myc Gateway-compatible destination plasmids (see Supplemental Figure <ref type="figure">21</ref>). The GmFWL3-pDoNR-Zeo plasmid was used in an LR reaction to clone GmFWL3 cDNA into the modified AKK1467B-10myc-GW and AKK1467B-GW-10myc Gateway destination plasmids, generating pFMV:10myc-GmFWL3 and pFMV::GmFWL3-10myc constructs. The pFMV:10myc-FWL3, pFMV::FWL3-10myc, p35S::FWL3-GFP, and p35S::GFP-FWL3 constructs and their respective controls, pFMV:10Myc and p35S::GFP, were transformed into A. rhizogenes (strain K599) for hairy root transformation. CRISPR-Cas9 design and screening for mutation by band shift and sequencing The guide RNAs used to knock out selected genes via CRISPR-Cas9 technology were designed using the guide RNA designer website <ref type="bibr">(Doench et al., 2014)</ref>. Two GmFWL3 target sequences (referred to hereafter as GmFWL3-T1 and T2) were independently cloned into the Esp3I and BsaI sites of the pAH595 guide RNA entry vector under the control of the AtU6 and At7SL promoters, respectively, using the Golden Gate method <ref type="bibr">(Curtin et al., 2018)</ref> to create the pAH595-GmFWL3-T1-T2 donor plasmid. The pAH595-GmFWL3-T1-T2 entry vector was then used with the pNJB184-CAS9 entry vector in a two-fragment multi-site Gateway LR clonase reaction. The cassettes were recombined into the pUB-GW-GFP binary vector <ref type="bibr">(Maekawa et al., 2008)</ref>, which carries a GFP selectable marker for screening transgenic FWL3-CRISPR-Cas9 hairy roots <ref type="bibr">(Xie and Yang, 2013)</ref>. The empty pAH595 donor plasmid combined with pNJB184-CAS9 in an LR reaction was used as the control. The resulting pLjUB::Cas9-pU6/At7SL::GmFWL3-T1-T2 binary plasmid (hereafter CAS9/GmFWL3-T1-T2) was transformed into A. rhizogenes (K599) for hairy root transformation and nodule phenotyping.</p><p>To characterize the nature of the mutations induced by CRISPR-Cas9 in the FWL3 gene, the genomic DNA of transgenic soybean roots expressing GFP was extracted. The regions spanning the target sites were amplified by PCR using the GFWL3-5 0 utr forward (5 0 -CCAAGTCCAATAAC-TATGCTTGAG-3 0 ) and reverse primers (5 0 -TCAACGGCTCATGCCC-3 0 ) and then sequenced for analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Plant transformation and confocal laser scanning microscopy</head><p>Tobacco leaf infiltration and protoplast isolation Nicotiana benthamiana leaves were co-infiltrated with A. tumefaciens (GV3101) expressing the virus RNA-silencing suppressor protein HC-Pro to enhance expression of the transgene, together with the following constructs: p35S::GFP-GmFWL3, p35S::GmFWL3-GFP, and p35S::GFP. Three days after infiltration, the tobacco leaf cells were imaged using a Nikon A1 confocal microscope. To produce transgenic tobacco leaf protoplasts, infiltrated epidermal leaf cells were incubated in MKM medium (9% mannitol, 0.037% KCl, 0.2 M MOPS [pH 6.0] supplemented with 0.05% driselase, 0.02% macerozyme R10, and 0.1% onozuka R10 [all Sigma, <ref type="url">http://www.sigmaaldrich.com/</ref>]) for 3 h in the dark at room temperature. Tobacco leaf epidermal protoplasts expressing the GFP-FWL3 and FWL3-GFP translational fusions, as well as the GFP control, were imaged using a Nikon A1 confocal microscope. Plasma membrane staining was performed by infiltrating 5 mM of FM4-64 (SynaptoRed C2, Biotium no. 70020) prior to microscopy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Soybean hairy root transformation</head><p>Eleven-day-old soybean Williams 82 plants were used for hairy root transformation <ref type="bibr">(Pingault et al., 2018)</ref>. Three days after transformation, shoot explants on rockwool cubes were watered with nutritive NPNS solution and allowed to grow for an additional 10 days. The plants were then transferred to a 3:1 autoclaved mixture of vermiculite and perlite and grown for an additional 14 days to allow plants to develop roots. Plants were then inoculated with USDA110. Thirty-day-old transgenic nodules, characterized by the expression of the GFP reporter, were collected under a Nikon SMZ25 epifluorescence stereoscope.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Co-immunoprecipitation assay</head><p>Co-immunoprecipitation experiments were performed as described previously <ref type="bibr">(Qiao et al., 2017a)</ref> using transgenic nodules isolated from GFPpositive transgenic roots (see above). Total protein extracts for coimmunoprecipitation assays were obtained by grinding transgenic nodules in protein extraction buffer (50 mM Tris-MES [pH 7.5], 300 mM sucrose, 150 mM NaCl, 10 mM potassium acetate, 5 mM EDTA, Sigma plant protease inhibitor cocktail, and 1% Triton X-100). After a 30-min incubation in protein extraction buffer, the proteins were filtered through a 40-mm filter  and centrifuged (15 000 g, 10 min, 4 C). Prior to the co-immunoprecipitation assay, western blot assays were performed to detect the tagged proteins using anti-Myc-HRP antibodies (Fisher no. R951-25) diluted 1:1000 in TBS-0.05% Tween20-2% skimmed milk. Coimmunoprecipitated proteins were isolated by applying total protein extracts to 50 mL of anti-myc Tag MicroBeads (Miltenyi Biotech, no. 130-091-284) on ice for 30 min and separated through a mColumn (Miltenyi Biotech no. 130-042-701) in the magnetic field of the mMACS Separator system <ref type="bibr">(Miltenyi Biotech,</ref> according to the manufacturer's protocol. Samples were eluted with SDS-PAGE sample buffer for further analysis as described below. Three replicates each for both N-and C-terminal c-myc fusions to GmFWL3 were compared with three replicates of c-myc alone as controls.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>MS and identification of GmFWL3-binding protein partners</head><p>Co-immunoprecipitated proteins were denatured at 95 C for 5 min and loaded onto a 10% Bolt Bis-Tris Plus gel. After brief electrophoresis to concentrate the proteins at the top of the gel, the proteins were fixed and stained with colloidal Coomassie blue (Sigma). The areas of gel that contained the co-immunoprecipitated proteins were excised and subjected to reduction and alkylation with DTT and iodoacetamide, respectively, then washed with ammonium bicarbonate/acetonitrile to remove stain and SDS.</p><p>Trypsin digestion was carried out overnight at 37 C. Peptides were extracted from the gel pieces, dried down, and resuspended in 0.1% trifluoroacetic acid. Samples were desalted using an Oasis HLB mElution solid-phase extraction plate (Waters, Milford, MA). Eluates were dried down and resuspended in 2.5% acetonitrile and 0.1% formic acid. Peptides were then run by nanoLCMS/MS using a 2-h gradient on a 0.075 3 250 mm CSH C18 column (Waters) feeding into a Q-Exactive HF mass spectrometer.</p><p>All MS/MS samples were analyzed using Mascot (Matrix Science, London, UK; version 2.6.1). Mascot was set up to search the Glycine max protein database (Wm82.a4.v1l; 92 226 records) for tryptic peptides. Mascot was searched with a fragment ion mass tolerance of 0.060 Da and a parent ion tolerance of 10.0 PPM. Deamidated asparagine and glutamine, oxidized methionine, and carbamidomethylated cysteine were specified as variable modifications in Mascot. Scaffold (version 4.8.9, Proteome Software, Portland, OR) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted with a probability of 80% or greater and &lt;1% FDR by the Peptide Prophet algorithm <ref type="bibr">(Keller et al., 2002)</ref> with Scaffold delta-mass correction. Protein identifications were accepted with a probability greater than 99.0%, &lt;1% FDR, and at least two peptides per protein.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Immunogold labeling TEM</head><p>Thirty-dpi transgenic nodules expressing the pFMV:myc-GmFWL3, pFMV::GmFWL3-myc, and pFMV:myc transgenes were isolated on the basis of GFP observation under a Nikon SMZ25 fluorescence stereoscope. The nodules were immediately fixed in 4% glutaraldehyde, 1% paraformaldehyde, and 0.2 M sodium cacodylate buffer (pH 7.2) for 1 h at room temperature. Fixed nodules were cut in half and stored at 4 C overnight in the buffer. For immunogold labeling, samples were embedded in LR-White resin and then sectioned to generate 100-nm cross-sections. The sections were collected onto nickel grids, blocked in 13 PBS-Tween-BSA (0.05% [v/v] Tween 20 and 3% BSA]) for 30 min, and rinsed in PBS-0.05% Tween (PBST). The grids were then labeled with anti-myc antibodies (R950-25, ThermoFisher) diluted 1:50 in PBS-0.05% Tween-1% BSA for 1 h. After three washes in PBST, the labeled samples were incubated for 1 h with the secondary antibody conjugated with 10 nm colloidal gold (A-31561, ThermoFisher) and diluted 1:100 in PBS-0.05% Tween-1% BSA. The sections were rinsed 3 times in PBS-0.05% Tween 20, then once in deionized water. Electron microscopy images were collected with a Hitachi H7500 TEM operated at 80 kV, focusing on rhizobia-infected cells of the soybean nodule. When necessary, sections were stained with uranyl acetate and lead citrate. For each transgene, three independent biological replicates were processed and observed under the transmission electron microscope.</p><p>SNucRNA-seq data pre-processing, integration, and clustering Each sNucRNA-seq library was processed individually using 10X Genomics Cell Ranger software v6.1.1.0 for demultiplexing and alignment to the soybean reference genome from the Ensembl Plants database (Glycine_max_v2.1.52; <ref type="url">http://ftp.ensemblgenomes.org/pub/  plants/release-52/fasta/glycine_max/</ref>). Background contamination was subtracted using SoupX after read alignment <ref type="bibr">(Young and Behjati, 2020)</ref>, and doublets were filtered out using the DoubletDetection prediction method <ref type="bibr">(Gayoso and Shor, 2022)</ref>. Finally, we applied a minimum threshold of 500 UMIs to remove nuclei with lower transcript content. Upon normalization, integration anchors were defined for the combined set of three sNucRNA-seq root datasets and two sNucRNA-seq nodule datasets using Seurat V4 <ref type="bibr">(Hao et al., 2021)</ref>. The dimensional reduction was performed using the UMAP method with the first 40 principal components, selecting the top 2000 variable genes for clustering using the FindClusters method in Seurat V4. We used the Seurat object for the soybean nodule UMAP to generate expression distribution plots of the gene set using the RidgePlot function in Seurat V4. The DotPlot function in Seurat V4 was used to generate the DotPlot expression figures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Cell type annotation using M.C.</head><p>To annotate the soybean root and nodule cell clusters, we identified the most highly expressed genes with specific expression for each cluster using the FindAllMarkers function in Seurat V4. Probes were designed by Resolve Biosciences (Monheim am Rhein, Germany) and hybridized against 10-mm cross-sections of fixed and paraffin-embedded roots and nodules. Upon hybridization, the cross-sections were stained with DAPI and calcofluor white, and microscopy observations were performed to reveal the positions of nuclei and cells, respectively. The microscopy images with transcript locations were analyzed using the Molecular Cartography plugin for ImageJ analysis software provided by Resolve Biosciences.</p><p>For the root, the epidermal marker genes were Glyma.19G255500, CYP93A1, Glyma.06G259400, G4DT, Glyma.09G099900, Glyma.10G07 0200, Glyma.01G156200, Glyma.07G130800, Glyma.20G061300, Glyma. 02G149100, Glyma.04G010600, and Glyma.17G133100; the cortical marker genes were Glyma.06G235500, Glyma.11G221200, Glyma. 09G216800, and Glyma.15G169100; the endodermal marker genes were Glyma.14G218700 and Glyma.16G106800; the pericycle marker genes were Glyma.02G003700, Glyma.05G023700, Glyma.11G078300, Glyma.08G125800, and Glyma.09G127000; the xylem marker genes were Glyma.15G245800, Glyma.04G063800, Glyma.06G065000, Glyma. 13G334500, Glyma.15G040000, Glyma.18G197400, and Glyma.15G179 500; and the phloem marker genes were Glyma.07G006500, Glyma. 05G216000, Glyma.15G274200, Glyma.11G243100, and Glyma.12G154 300 (see Supplemental Figures 2 and 3 for details).</p><p>For the nodule, the inner/outer cortical marker genes were Glyma.16G039800, Glyma.19G255500, CYP93A1, Glyma.06G259400, G4DT, Glyma.01G156200, Glyma.20G061300, and Glyma.03G079500; the sclereid layer marker genes were Glyma.04G063800, Glyma.06G0 65000, Glyma.13G334500, and Glyma.15G040000; the vascular endodermis marker genes were Glyma.04G218700, Glyma.14G227200, Glyma.16G106800, and Glyma.20G151700; the vascular bundle marker genes were Glyma.06G256000, Glyma.10G139200, Glyma.02G003700, Glyma.07G231500, Glyma.18G062100, Glyma.08G125800, Glyma.11G0 78300, Glyma.09G127000, Glyma.13G334500, and Glyma.15G040000; the infected cell marker genes were Glyma.17G195900 (RIM), Glyma.01G164600, Glyma.15G2100100, and Glyma.05G216000; and the uninfected cell marker genes were Glyma.06G235500 and Glyma.06G002000 (see Supplemental Figure <ref type="figure">6</ref> for details). We also used B. diazoefficiens probes against BAC45727, BAC46169, BAC47034, BAC48395, BAC51072, BAC51722, BAC52602, BAC52793, and BAC52805 to annotate the infected nodule cells (see Supplemental Figure <ref type="figure">7</ref> for details).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Cell type annotation using orthologous gene markers</head><p>To support the annotations of the soybean root cell clusters, we identified soybean orthologs of functionally validated cell-type-specific gene markers from roots of A. thaliana <ref type="bibr">(Farmer et al., 2021)</ref> and M. truncatula (Cervantes-Pe &#180;rez et al., 2022) (Supplemental Table <ref type="table">3</ref>). Gene orthology was based on identification of syntenic regions between the genome of G. max and those of A. thaliana and M. truncatula using CoGe (<ref type="url">https://  genomevolution.org/coge/</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>UMAP visualization</head><p>For visualization, all sNucRNA-seq libraries for the root, the nodule, and the integration of both were combined using the Cell Ranger aggr function from 10X Genomics, and Loupe software from 10X Genomics was used to visualize the integrations.</p><p>Single-cell RNA-seq atlas of soybean root organs</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Plant Communications Differential gene expression and gene ontology analyses</head><p>To identify DEGs between clusters, we used DEsingle software <ref type="bibr">(Miao et al., 2018)</ref>, a zero-inflated negative binomial distribution method <ref type="bibr">(Wang et al., 2019)</ref>, with the raw read counts and thresholds of p &lt; 0.05 and fold-change greater than 1.5. Gene ontology enrichment analyses were performed on the DEGs using the PlantRegMap GO Enrichment tool with a threshold of p % 0.01 <ref type="url">http://plantregmap.gao-lab.org/go.php</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Comparison of soybean sNucRNA-seq and bulk RNA-seq</head><p>To evaluate the depth and sensitivity of the soybean root and nodule single-nucleus transcriptome atlases, we compared our pseudo-bulk sNucRNA-seq datasets with previously published root and nodule bulk RNA-seq datasets <ref type="bibr">(Libault et al., 2010c)</ref>. Using the legume information system database, we extracted bulk expression datasets (2022/11/14; <ref type="url">https://data.legumeinfo.org/Glycine/max/expression/Wm82.gnm2.ann1.  expr.Wm82.Libault_Farmer_2010/</ref>; identifiers SRR037385, SRR037386, and SRR037387 for the nodule, root tip, and entire root, respectively) and then compared the numbers of expressed genes between the bulk and pseudo-bulk RNA-seq libraries.</p><p>MDS analysis of the soybean root and nodule transcriptomes and 12-, 14-, 21-, and 28-dpi soybean nodules</p><p>To evaluate the level of similarity across the single-nucleus transcriptomes of soybean roots (16 cell clusters) and nodules (11 cell clusters) and developmental stages of the soybean nodule (i.e., 12-, 14-, 21-, and 28-dpi nodules), we first defined the integration anchors for the combined set of sNucRNA-seq datasets of the root and nodule together using Seurat V4 <ref type="bibr">(Hao et al., 2021)</ref> and then performed an MDS analysis using the ggfortify library in R (version 4.2.2). Specifically, classical multidimensional scaling was performed to calculate a distance matrix between the different objects.</p><p>Reanalysis of single-cell RNA-seq datasetsfrom M. truncatula nodules Upon mining single-cell RNA-seq datasets from M. truncatula nodules (i.e., SAMC899255 and SAMC899256 from the National Genomics Data Center), we individually processed both datasets using the 10X Genomics Cell Ranger v6.1.1.0 pipeline to map the sequencing reads against the M. truncatula reference genome (<ref type="url">https://medicago.toulouse.inra.fr/  MtrunA17r5.0-ANR/</ref>). After removing the ''MIX'' cluster 0 as described previously <ref type="bibr">(Ye et al., 2022)</ref>, we used the same analytical methods described above. For visualization purposes, we generated expression Ridge plots for selected M. truncatula and soybean genes using the RidgePlot function in Seurat V4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>GRN inference</head><p>To predict TF-target interaction pairs from gene expression data and infer the GRN, we used our 28-dpi nodule sNucRNA-seq data and the GENIE3 tool <ref type="bibr">(Huynh-Thu et al., 2010)</ref>. In this analysis, we defined cell subsets as cluster F and G cells and cluster H cells; gene subsets encompassed 2000 highly variable genes (specific to each cell subset), together with SNF-related genes <ref type="bibr">(Roy et al., 2020)</ref> and all soybean TFs <ref type="bibr">(Feng et al., 2022;</ref><ref type="bibr">Wang et al., 2023a)</ref>. Duplicate genes and those with zero expression across all cells within a subset were excluded, resulting in 3133 genes for F and G and 3276 genes for H. For downstream analyses on the inferred networks, within each network, we used the top 50 000 interactions based on the importance measure.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>Plant Communications 5, 100984, August 12 2024 &#170; 2024 The Author(s).</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1"><p>Plant Communications 5, 100984, August 12 2024 &#170; 2024 The Author(s).Plant CommunicationsSingle-cell RNA-seq atlas of soybean root organs</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_2"><p>Plant Communications 5, 100984, August 12 2024 &#170; 2024 The Author(s).</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_3"><p>Single-cell RNA-seq atlas of soybean root organs Plant Communications</p></note>
		</body>
		</text>
</TEI>
