<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Oligotrophic waters of the Northwest Atlantic support taxonomically diverse diatom communities that are distinct from coastal waters</title></titleStmt>
			<publicationStmt>
				<publisher>Wiley</publisher>
				<date>12/01/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10514272</idno>
					<idno type="doi">10.1111/jpy.13388</idno>
					<title level='j'>Journal of Phycology</title>
<idno>0022-3646</idno>
<biblScope unit="volume">59</biblScope>
<biblScope unit="issue">6</biblScope>					

					<author>Samantha P Setta</author><author>Sarah Lerch</author><author>Bethany D Jenkins</author><author>Sonya T Dyhrman</author><author>Tatiana A Rynearson</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>Abstract</title> <p>Diatoms are important components of the marine food web and one of the most species‐rich groups of phytoplankton. The diversity and composition of diatoms in eutrophic nearshore habitats have been well documented due to the outsized influence of diatoms on coastal ecosystem functioning. In contrast, patterns of both diatom diversity and community composition in offshore oligotrophic regions where diatom biomass is low have been poorly resolved. To compare the diatom diversity and community composition in oligotrophic and eutrophic waters, diatom communities were sampled along a 1,250 km transect from the oligotrophic Sargasso Sea to the coastal waters of the northeast US shelf. Diatom community composition was determined by amplifying and sequencing the 18S rDNA V4 region. Of the 301 amplicon sequence variants (ASVs) identified along the transect, the majority (70%) were sampled exclusively from oligotrophic waters of the Gulf Stream and Sargasso Sea and included the genera <italic>Bacteriastrum</italic>, <italic>Haslea</italic>, <italic>Hemiaulus</italic>, <italic>Pseudo</italic>‐<italic>nitzschia</italic>, and <italic>Nitzschia</italic>. Diatom ASV richness did not vary along the transect, indicating that the oligotrophic Sargasso Sea and Gulf Stream are occupied by a diverse diatom community. Although ASV richness was similar between oligotrophic and coastal waters, diatom community composition in these regions differed significantly and was correlated with temperature and phosphate, two environmental variables known to influence diatom metabolism and geographic distribution. In sum, oligotrophic waters of the western North Atlantic harbor diverse diatom assemblages that are distinct from coastal regions, and these open ocean diatoms warrant additional study, as they may play critical roles in oligotrophic ecosystems.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>INTRODUCTION</head><p>Diatoms are a species-rich class of eukaryotic phytoplankton that generates ~20% of global primary production <ref type="bibr">(Falkowski, 1998;</ref><ref type="bibr">Field, 1998;</ref><ref type="bibr">Tr&#233;guer et al., 2018)</ref>. The impact of diatom primary production on biogeochemical cycling and food web dynamics is strongly influenced by the taxonomic composition of diatom assemblages <ref type="bibr">(Nelson &amp; Brzezinski, 1997)</ref>, which can vary over both space and time <ref type="bibr">(Borkman &amp; Smayda, 2009;</ref><ref type="bibr">Karentz &amp; Smayda, 1998)</ref>. In eutrophic coastal waters, taxonomically diverse diatom assemblages sustain high levels of biomass that fuel coastal ecosystems <ref type="bibr">(Bracher et al., 2009;</ref><ref type="bibr">Smetacek, 2012)</ref>. In persistently oligotrophic regions of the global ocean, diatom biomass is generally low <ref type="bibr">(Nelson et al., 1995;</ref><ref type="bibr">Tr&#233;guer et al., 2018)</ref> except when they form intermittent blooms, which can contribute significantly to organic matter export in the open ocean <ref type="bibr">(Brzezinski &amp; Nelson, 1995;</ref><ref type="bibr">Karl et al., 2012;</ref><ref type="bibr">Nelson &amp; Brzezinski, 1997)</ref>.</p><p>Diatoms that survive in oligotrophic habitats have a variety of adaptations that allow them to grow despite low levels of macronutrients. 
These adaptations include the abilities to respond quickly to episodic nutrient pulses <ref type="bibr">(Alexander et al., 2015;</ref><ref type="bibr">Krause et al., 2010;</ref><ref type="bibr">McGillicuddy et al., 2007)</ref>, to modulate a diverse set of metabolic pathways <ref type="bibr">(Armbrust, 2009)</ref>, to reduce nutrient quotas <ref type="bibr">(Van Mooy et al., 2009;</ref><ref type="bibr">Whitney et al., 2011)</ref>, to utilize organic nutrient sources <ref type="bibr">(Dyhrman et al., 2012)</ref>, and to establish associations with diazotrophs <ref type="bibr">(Carpenter et al., 1999;</ref><ref type="bibr">Foster &amp; Zehr, 2019;</ref><ref type="bibr">Villareal &amp; Lipschultz, 1995)</ref>. Small diatoms have an advantage in low-nutrient waters due to their high surface-area-to-volume ratios <ref type="bibr">(Finkel et al., 2010)</ref>, and some larger diatoms vertically migrate through the water column to access nutrients at depth <ref type="bibr">(Villareal &amp; Lipschultz, 1995)</ref>. These types of metabolic and physiological adaptations allow some diatom species to persist as sparse seed populations in oligotrophic regions and to achieve high growth rates and biomass in response to sporadic nutrient inputs <ref type="bibr">(Goldman, 1993;</ref><ref type="bibr">Krause et al., 2009,</ref> <ref type="bibr">2010)</ref>.</p><p>In contrast to the well-documented metabolic strategies that diatoms use to survive in oligotrophic regions, the taxonomic composition of oligotrophic diatom communities and the environmental correlates of their distributions have been less described, particularly in the oligotrophic western North Atlantic. 
The two quantitative records of diatom species composition that exist for the Sargasso Sea noted a diverse set of species belonging to both centric (Bacteriastrum, Chaetoceros, Leptocylindrus, Rhizosolenia, Skeletonema, and Thalassiosira) and pennate (Nitzschia) genera <ref type="bibr">(Hulburt et al., 1960;</ref><ref type="bibr">Hulburt &amp; Rodman, 1963)</ref>. Those results have been supported by other studies that (a) used light microscopy, either applied opportunistically (e.g., personal communication notes in <ref type="bibr">Bidigare et al., 1990;</ref><ref type="bibr">Malone et al., 1993)</ref> or with a focus on genus-level designation <ref type="bibr">(Benitez-Nelson et al., 2007;</ref><ref type="bibr">Krause et al., 2010;</ref><ref type="bibr">McGillicuddy et al., 2007)</ref> and (b) used rRNA approaches to obtain genus-level designations <ref type="bibr">(Lampe et al., 2019)</ref>. Interestingly, <ref type="bibr">Hulburt et al. (1960)</ref> remarked on the large number of small centric diatoms that occurred during spring but which could not be identified with light microscopy. However, the composition of those diatoms has not been further explored. Models have generated contrasting predictions that climate change will either increase <ref type="bibr">(Busseni et al., 2020)</ref> or decrease <ref type="bibr">(Henson et al., 2021)</ref> diatom diversity in oligotrophic open ocean regions of the North Atlantic. 
These predictions are difficult to evaluate given our limited understanding of North Atlantic diatom taxonomic composition and how their distributions are correlated with environmental conditions.</p><p>To address the paucity of data on diatom community composition in the oligotrophic Sargasso Sea and Gulf Stream, we examined diatom 18S rDNA gene sequence variation from biomass collected along a transect in the western North Atlantic, from high-nutrient coastal waters to the low-nutrient Sargasso Sea region of the North Atlantic Subtropical Gyre, the oligotrophic gyre that is expanding most rapidly due to climate change (4.3% per year; <ref type="bibr">Polovina et al., 2008)</ref>. Using high-throughput amplicon sequencing and polymerase chain reaction primers designed to recover diatom diversity with more sensitivity than universal eukaryotic primers <ref type="bibr">(Zimmermann et al., 2011)</ref>, we examined the taxonomic composition of diatom communities along the transect and evaluated whether diatom community composition and richness corresponded to gradients in nutrients and temperature between the Sargasso Sea and waters on the US Northeast Shelf.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>METHODS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Sample collection</head><p>Samples were collected from 17 locations in the western North Atlantic aboard the R/V Atlantic Explorer from May 2 to 15, 2018 (AE1812; Figure <ref type="figure">1</ref>). At each sampling location, water was collected from the surface mixed layer at 25 m and the deep chlorophyll maximum (DCM), except when bottom depth was less than 25 m in shelf waters (Data S1 in the Supporting Information). Light and temperature data were obtained using a Sea Bird Scientific SBE 911plus CTD with rosette-mounted sensor that measured photosynthetically active radiation (CPAR). Dissolved inorganic nitrogen (DIN) and soluble reactive phosphorus (SRP) concentrations (hereafter phosphate) were obtained by filtering whole seawater through GF/F (~0.7 &#956;m) filters (Whatman) and measuring flowthrough water on a Seal Analytical AA3 HR nutrient autoanalyzer at the University of Hawaii School of Ocean and Earth Science and Technology Laboratory for Analytical Biogeochemistry. Samples with phosphate concentrations below 125 nM were also measured using the Magnesium Induced Coprecipitation (MAGIC) protocol <ref type="bibr">(Karl &amp; Tien, 1992)</ref>. Chlorophyll a concentrations were measured in triplicate by filtering whole seawater over GF/F (~0.7 &#956;m) filters (Whatman). Chlorophyll a was immediately extracted from filters using 100% denatured ethanol over 12 h <ref type="bibr">(Jespersen &amp; Christoffersen, 1987)</ref> and measured on a 10-AU Fluorometer (Turner); concentrations were then calculated using a predetermined calibration curve.</p><p>Seawater samples for community composition analyses using rDNA genes were pre-filtered over mesh netting of either 153 &#956;m (stations 1B-9A) or 200 &#956;m (stations 11A-20A) to exclude large macro-zooplankton grazers. 
The remaining biomass was collected on either 0.2 &#956;m filters (10 cm&#178; polyethersulfone, Millipore Sterivex or 25 mm polyester, Sterlitech) or 5.0 &#956;m filters (25 mm PETE, Sterlitech), in volumes of 2-10 L depending on biomass (Data S1). Filter pore sizes were chosen to pair with filtering for other environmental and molecular analyses, and we therefore tested for significant differences between size fractions before combining all samples for a more robust assessment of diatom richness (see Statistical Analyses section). By examining all diatoms in the &lt;153-200 &#956;m size fraction, we captured the full size range of diatom diversity. Filters were flash frozen in liquid nitrogen and stored at -80&#176;C until DNA extraction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Analysis of community composition using DNA</head><p>DNA was extracted from filtered biomass using the Blood &amp; Tissue Kit (Qiagen) following the manufacturer's animal tissues protocol with modifications. For samples collected using the 0.2 &#956;m pore size Sterivex filters, standard Sterivex extraction methods were followed by injecting lysis buffer and beads directly into the cartridge and incubating <ref type="bibr">(Chestnut et al., 2014;</ref><ref type="bibr">Cruaud et al., 2017)</ref>. Briefly, cell lysis was conducted by injecting 1 mL ATL buffer, 90 &#956;L of Proteinase K, and 200 &#956;L of 400 &#956;m sterilized zirconium beads with a 3-mL Luer lock sterile syringe before capping and vortexing the samples at the highest setting for 4 min. Samples were then incubated at 56&#176;C for 1 h, vortexing every 15 min. After incubation, the lysis solution was removed from filter capsules using a 3-mL Luer lock sterile syringe and placed into a sterile 5-mL conical tube with 1 mL AL buffer, vortexed, and incubated at 56&#176;C for 10 min. Finally, 1 mL of 96%-100% ethanol was added to samples, followed by vortexing for 15 s and pipetting into the DNeasy spin column (Qiagen). Remaining steps followed the animal tissues (Qiagen) protocol. For samples collected on 0.2 &#956;m and 5 &#956;m polyester and polyethersulfone filters, cells were lysed using a Mini-Beadbeater-96 (BioSpec) at top speed (36 oscillations &#8226; s&#8315;&#185; or 2,100 rpm) for 60 s with 540 &#956;L of ATL buffer and ~40 &#956;L of 400 &#956;m sterilized zirconium beads. Afterward, 60 &#956;L of Proteinase K was added to each sample, and samples were then vortexed for 15 s and incubated at 56&#176;C for 1 h, vortexing every 15 min. 
Remaining steps followed the animal tissues (Qiagen) protocol.</p><p>A 420-bp fragment of the 18S rDNA V4 hypervariable region was amplified from extracted DNA using universal diatom primers <ref type="bibr">(Zimmermann et al., 2011)</ref> with Illumina-specific adapters <ref type="bibr">(Rynearson et al., 2020)</ref>. Reaction mixtures included 0.3 &#956;M each of the primers D512for and D978rev <ref type="bibr">(Zimmermann et al., 2011)</ref>, 50% v/v HIFI Master Mix (KAPA), 3-130 ng template DNA, and DNase free water for a total reaction volume of 25 &#956;L. Sequences were amplified in triplicate using a modification of the thermocycling protocol in <ref type="bibr">Rynearson et al. (2020)</ref> and pooled for sequencing. Thermocycling conditions were aimed at reducing sequencing bias and included an initial denaturing step at 95&#176;C for 3 min, followed by 15 cycles of 30 s each at 94&#176;C, 55&#176;C, and 72&#176;C, followed by 15 cycles of 30 s each at 94&#176;C, 70&#176;C, and 72&#176;C, and 10 min at 72&#176;C. Amplicons were cleaned with Ampure XP beads (Beckman Coulter, Inc.), followed by quantification using the Qubit Broad Range DNA Assay Kit (Thermo Fisher Scientific, Inc.), then amplified for five cycles with Nextera indices and adapters (Illumina Inc.). Samples were cleaned once more with Ampure XP beads, and PCR products were quantified with the KAPA qPCR kit (Kapa Biosystems). Products were sequenced on an Illumina MiSeq platform using v3 chemistry (2 &#215; 300 bp Illumina Inc.) at the Genomics and Sequencing Center at the University of Rhode Island.</p><p>Paired end sequencing reads were first trimmed of primers and Illumina adapters using Cutadapt (version 2.7) with an error rate for primers of 0.10 (-e option; <ref type="bibr">Martin, 2011)</ref>. 
The R package dada2 (version 1.14.0) was used to filter reads from each sample with the following criteria: a max error of 2, minimum truncate length before quality filtering of 250 bp forward and 225 bp reverse, and minimum final quality trimmed read length of 210 bp forward, 210 bp reverse. Dada2 was used to generate an error model of each unique sequence using learnErrors(), then error rates were used to infer true biological sequences using the dada() command. Forward and reverse reads were merged using the dada2 command mergePairs() with a minimum overlap (minOverlap) of 10 bp. Trimmed and merged sequences were grouped into amplicon sequencing variants (ASVs) using the makeSequenceTable() command to implement the Divisive Amplicon Denoising Algorithm (DADA), which can resolve differences between reads down to one nucleotide <ref type="bibr">(Callahan et al., 2016)</ref>. Finally, chimeric sequences were removed using the removeBimeraDenovo() dada2 command with the consensus method to remove sequences that are combinations of abundant parent sequences.</p><p>Taxonomic classification of ASVs was conducted using the PR2 Database (version 4.12.0; <ref type="bibr">Guillou et al., 2012)</ref> and the na&#239;ve Bayesian classifier method with the assignTaxonomy() command implemented in dada2 <ref type="bibr">(Callahan et al., 2016)</ref>. Amplicon sequencing variants were kept separate in the statistical analyses below even if they were assigned the same species identification because the region of the 18S rRNA gene we amplified had previously been shown to predominantly recover differences among species and not strains <ref type="bibr">(Zimmermann et al., 2011)</ref>. For this reason, ASV number was always appended to species names to preserve these distinctions. 
Finally, sequences were checked for contaminants using the R package decontam (version 1.6.0) to remove any contaminated sequence features using a statistical classification procedure <ref type="bibr">(Davis et al., 2018)</ref>. To ensure that each sample contained sufficient reads for further statistical analyses, rarefaction curves were used to set a lower limit of 200 diatom reads per sample, eliminating three samples from further analysis (Figure <ref type="figure">S1</ref> in the Supporting Information). Two additional samples (SPS32 &amp; SPS33) contained sequentially filtered biomass over a 5&#956;m filter and then a 0.2&#956;m filter, and therefore, ASV counts from those samples were merged for downstream analysis (SPS32_m), yielding a total of 30 samples that were included in further statistical analyses.</p></div>
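<div xmlns="http://www.tei-c.org/ns/1.0"><p>The read-depth filter and the merging of sequentially filtered fractions described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R workflow: the function names and the ASV counts are hypothetical, and treating the 200-read limit as inclusive is an assumption.</p>
<p>
```python
from collections import Counter

MIN_DIATOM_READS = 200  # lower limit stated in the text


def merge_counts(*samples):
    """Sum ASV read counts across fractions of one water sample."""
    merged = Counter()
    for counts in samples:
        merged.update(counts)  # Counter.update adds counts per key
    return dict(merged)


def passes_depth_filter(counts, minimum=MIN_DIATOM_READS):
    """Keep a sample only if its total diatom reads reach the minimum.

    Assumption: the limit is inclusive (a sample with exactly 200 reads is kept).
    """
    return sum(counts.values()) >= minimum


# Hypothetical read counts, invented for illustration only
fraction_5um = {"ASV_1": 120, "ASV_2": 40}
fraction_02um = {"ASV_2": 35, "ASV_3": 60}
merged_sample = merge_counts(fraction_5um, fraction_02um)

print(merged_sample)
print(passes_depth_filter(merged_sample))
```
</p>
<p>Because later analyses use presence/absence only, merging matters mainly for which ASVs are recorded in the combined sample and for whether it clears the read-depth limit.</p></div>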
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Statistical analyses</head><p>All statistical analyses were conducted in R (version 4.0.4). Because amplicon sequencing yields inherently compositional data <ref type="bibr">(Gloor et al., 2017)</ref> and because rDNA gene copy number can vary by orders of magnitude among species <ref type="bibr">(Santoferrara, 2019;</ref><ref type="bibr">Zhu et al., 2005)</ref>, sequence reads were analyzed for ASV identity and presence/absence, not abundance. Relative abundances are presented only in the bar graph in Figure <ref type="figure">S2</ref> in the Supporting Information. To obtain a robust measure of ASV richness, we grouped samples from all depths and size fractions for each site. Sample sites were assigned to Sargasso Sea (SS), Gulf Stream (GS), or coastal waters (CW) based on their temperature and salinity characteristics. Regional (SS, GS, CW) differences in DIN, phosphate, and chlorophyll a concentrations were examined using ANOVA (aov function) and post-hoc Tukey's tests (TukeyHSD function) at an alpha of 0.05. Diatom ASV richness is reported as the total number of ASVs per sample. Diatom genus richness is reported as the total number of genera per sample and was used to highlight divergence in taxonomy across regions that might relate to trait differences. Differences in ASV number and genus richness by region were tested using an ANOVA (aov function). A linear regression of chlorophyll a concentration and ASV richness was calculated using the lm() function.</p><p>To test for variation in ASV richness by size fraction, an ANOVA (aov function) was used to test the influence of filter pore size on richness estimates. Because the results of the ANOVA were not significant, we then grouped samples by region (rather than filter pore size) for further diversity analysis. 
To examine variation in cell size among species identified exclusively from coastal waters compared with offshore regions, cell volumes of diatom ASVs identified to species were obtained from LeBlanc et al. (<ref type="formula">2012</ref>) and then separated into two size classes of cell volumes: small (&#8804;6,500 &#956;m&#179;) and large (&gt;6,500 &#956;m&#179;). The cell volume cutoff corresponds approximately to the volume of a cylindrical cell with both cell height and diameter of 20 &#956;m. The percent of ASVs in each volume range (small or large) was then calculated for each region. To test for differences among regions, depths, and filter pore sizes, the vegan (version 2.5-7) function ordinate() was used with the Jaccard similarity index and max iterations of 500. Differences among groups in the Jaccard similarity matrix were tested in vegan (adonis function) following homogeneity tests (betadisper function). A Mantel test <ref type="bibr">(Mantel &amp; Valand, 1970)</ref> was used to test for differences in diatom community composition with geographic distance using the Jaccard similarity index. Finally, a transformation-based redundancy analysis (tb-RDA) was used to correlate diatom presence with environmental variables across the three regions. A subset of 27 samples out of the total 30 were included in the tb-RDA; three samples were removed because two (SPS34 &amp; SPS35, CW station S20A) did not have associated nutrient measurements and one (SPS02, SS station S1B) had an outlier dissolved inorganic nitrogen (DIN) concentration (&gt;1.89 &#956;mol &#8226; L&#8315;&#185;) of 11.25 &#956;mol &#8226; L&#8315;&#185;. Prior to ordinating using a tb-RDA, the presence of ASVs was first normalized with vegan (decostand function) using a Hellinger standardization to give less weight to rare ASVs only present in one sample that might have influenced multivariate analysis <ref type="bibr">(Legendre &amp; Gallagher, 2001)</ref>. 
To achieve normality, phosphate was log transformed, and both phosphate and temperature were standardized with z-scores followed by redundancy analysis (rda function) and significance testing using ANOVA and a Holm R&#178; adjustment for multiple comparisons. Silicate (SiO&#8322;), DIN, phosphate, temperature, CPAR, and salinity were all included in exploratory analysis, but only temperature and phosphate were significant (p &#8804; 0.05), and the final redundancy analysis only included these significant environmental variables.</p></div>
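<div xmlns="http://www.tei-c.org/ns/1.0"><p>The size-class cutoff described above can be checked with a short worked example. The sketch below, with hypothetical function names, computes the volume of a cylinder 20 &#956;m in height and diameter and assigns a size class using the 6,500 &#956;m&#179; cutoff from the text. The exact cylinder volume is about 6,283 &#956;m&#179;, so the stated cutoff appears to be a rounded value; that reading is an assumption.</p>
<p>
```python
import math

SIZE_CUTOFF_UM3 = 6500.0  # cutoff stated in the text, in cubic micrometres


def cylinder_volume(diameter_um, height_um):
    """Volume of a cylinder: pi * r^2 * h, in cubic micrometres."""
    radius = diameter_um / 2.0
    return math.pi * radius ** 2 * height_um


def size_class(volume_um3, cutoff=SIZE_CUTOFF_UM3):
    """Return 'large' for volumes above the cutoff, otherwise 'small'."""
    return "large" if volume_um3 > cutoff else "small"


# Reference cell from the text: height and diameter both 20 micrometres
reference_volume = cylinder_volume(20.0, 20.0)
print(round(reference_volume))       # about 6283
print(size_class(reference_volume))  # falls in the small class
```
</p>
<p>Under this scheme a cell exactly at the cutoff is counted as small, matching the &#8804;6,500 &#956;m&#179; definition of the small class.</p></div>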
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RESULTS</head><p>Samples collected in the western North Atlantic (Figure <ref type="figure">1</ref>) during May 2018 were defined as originating from nearshore coastal water (CW, salinities of 32-33 and temperatures of 6-10&#176;C), and offshore regions of the Sargasso Sea (SS, salinity 37 and temperatures of 20-22&#176;C) and Gulf Stream (GS, high salinities of 36-37 and temperatures of 23-25&#176;C; Figure <ref type="figure">2a</ref>; Data S1). Compared with the Sargasso Sea and Gulf Stream, coastal water had significantly higher chlorophyll a (ANOVA, F(2,17) = 23.6, p &lt; 1.26e-5) and phosphate (ANOVA, F(2,27) = 70, p &lt; 2.03e-11) concentrations but not DIN concentrations (ANOVA, F(2,25) = 0.487, p = 0.62; Figure <ref type="figure">2b-d</ref>; Tables <ref type="table">S1</ref> and <ref type="table">S2</ref> in the Supporting Information). There were no significant differences between the Gulf Stream and Sargasso Sea (Tukey post-hoc HSD) in terms of phosphate, DIN, or chlorophyll a (Table <ref type="table">S2</ref>). Dissolved inorganic nitrogen: phosphate (N:P) ratios were largely above 16 in the Sargasso Sea and Gulf Stream and below 16 in coastal water (Figure <ref type="figure">2e</ref>).</p><p>Sequencing of the amplified rDNA yielded 657,709 sequences over 34 samples with an average of 19,344 reads per sample. On average, 21.4% &#177; 21.7% of sequence reads matched the class Bacillariophyceae in each sample (Figure <ref type="figure">S3</ref> in the Supporting Information). Thirty-one samples exceeded the minimum threshold of 200 diatom reads (class Bacillariophyceae) per sample established following a rarefaction analysis to prevent undersampling (see Methods section and Figures <ref type="table">S1</ref> and <ref type="table">S3</ref>). 
After merging two samples (SPS32 and SPS33, see Methods section), the final 30 samples comprised nine Sargasso Sea samples, 11 Gulf Stream samples, and 10 coastal water samples. A total of 301 ASVs were identified as Bacillariophyceae, and of those, 181 could be identified to genus level and 90 identified to species level (Data S1). Reads from the genera Thalassiosira, Minidiscus, and Skeletonema were abundant in coastal water, while Pseudo-nitzschia and Nitzschia reads were abundant in the offshore regions (Figure <ref type="figure">S2</ref>). In samples from both nearshore and offshore regions, a large proportion of reads (21%-87%) could not be assigned even to the genus level.</p><p>There were no regional differences in diatom richness, either at the ASV (ANOVA, F(2,27) = 0.45, p = 0.644) or genus levels (ANOVA, F(2,27) = 0.42, p = 0.659; Figure <ref type="figure">3a</ref>,<ref type="figure">b</ref>). On average, each water sample along the transect had an ASV richness of 30 &#177; 17 and genus richness of 10 &#177; 5. There was no significant relationship between diatom diversity (ASV richness) and total chlorophyll a (linear regression, R&#178; &lt; 0.01, F(1,26) = 0.045, p = 0.834; Figure <ref type="figure">3c</ref>). There were no significant differences in ASV diversity with either size fraction (either &gt;0.2 or &gt;5 &#956;m, ANOVA, F(1,28) = 1.779, p = 0.193) or depth (surface or deep, ANOVA, F(1,28) = 0.084, p = 0.744; Data S1). The proportion of Bacillariophyceae ASVs that could be identified to species level was similar across regions: 70% and 65% of reads from the offshore region (Gulf Stream &amp; Sargasso Sea) and coastal water, respectively.</p><p>Of Bacillariophyceae ASVs identified, just four were shared among all three regions and of those, only one could be identified to genus and species level (ASV_69, Guinardia striata). 
The Sargasso Sea and Gulf Stream shared 71 ASVs including several species of Pseudo-nitzschia, Nitzschia, and Leptocylindrus (Figure <ref type="figure">4a</ref>; Figure <ref type="figure">S4</ref> in the Supporting Information). A total of 211 ASVs were found exclusively in the offshore region (70%; Sargasso Sea and Gulf Stream), while 67 ASVs were found exclusively in the nearshore coastal water (25%; Figure <ref type="figure">4a</ref>). Coastal water samples shared 10 ASVs with the Sargasso Sea samples, including those identified as Guinardia delicatula, Guinardia flaccida, and Minutocellus polymorphus, and shared none with the Gulf Stream (Figure <ref type="figure">4a</ref>; Figure <ref type="figure">S4</ref>). Of the 211 ASVs found only in the offshore region, 168 ASVs (80%) could be identified to genus level, of which 129 ASVs (77%) were within the same genus as ASVs found only in the nearshore region (Figure <ref type="figure">4</ref>). Diatom ASVs that were found only in the nearshore or offshore regions and that were identified to species level (64 of 111 ASVs identified to species level, Data S1) had different cell volumes. Amplicon sequence variants observed only in offshore regions (45 of 64 ASVs with cell volumes) were evenly split between the small (51%) and large (49%) size classes. 
In contrast, a majority of nearshore species were large (79%), falling within the &gt;6,500 &#956;m&#179; cell volume range (~&gt;20 &#956;m cell size; Table <ref type="table">S3</ref> in the Supporting Information). More genera were found across all three regions (44%) than only in coastal waters (14%) or only in offshore regions (28%; Figure <ref type="figure">4b</ref>). Of the diatom genera found only in offshore regions, 18% were found only in the larger (&gt;5 &#956;m) size fraction. While many genera were present across all regions, the number of ASVs per genus varied by region (Figure <ref type="figure">S5</ref> in the Supporting Information). For example, Pseudo-nitzschia, Nitzschia, and Bacteriastrum had higher ASV diversity in the Sargasso Sea and Gulf Stream, while Thalassiosira and Minidiscus had higher diversity in coastal water.</p><p>[Figure 2: (a) potential temperature (&#176;C) versus salinity; (b) phosphate (&#956;mol &#8226; L&#8315;&#185;), (c) chlorophyll a (&#956;g &#8226; L&#8315;&#185;), (d) DIN (&#956;mol &#8226; L&#8315;&#185;), and (e) N:P plotted against latitude (decimal degrees) for the CW, GS, and SS regions.]</p><p>The Jaccard dissimilarity index was used to assess differences in community structure between samples. Community structure differed significantly with geographic region (ANOVA, F(2,27) = 4.01, p = 0.001; Figure <ref type="figure">5</ref>), revealing one cluster formed by both Gulf Stream and Sargasso Sea samples and a second cluster of coastal water samples. There were no significant differences among samples with depth (ANOVA, F(1,28) = 1.23, p = 0.161), and the effect of filter pore size could not be tested, as this set of samples did not have homogeneous variance (Levene's test, F(1,28) = 12.80, p = 0.001).</p><p>Environmental variables in the nearshore (coastal water) and offshore (Sargasso Sea and Gulf Stream) regions were correlated with ASV presence/absence in a tb-RDA. Here, phosphate concentration (ANOVA, F(1,24) = 6.03, p = 0.002) and temperature (ANOVA, F(1,24) = 5.42, p = 0.002) were significantly correlated with diatom community composition in nearshore versus offshore regions (Figure <ref type="figure">6a</ref>). 
The first tb-RDA axis was significant (ANOVA-like permutation for redundancy analysis, RDA1, F(1,24) = 6.05, p = 0.002), explained 19.4% of the variability in the data set, and showed phosphate concentration increasing toward the coastal water and temperature increasing toward the offshore regions (Figure <ref type="figure">6a</ref>). The second axis explained 3.5% of the variability in the dataset, was not significant (ANOVA, F(1,24) = 1.09, p = 0.311), and aligned primarily with differences between the Sargasso Sea and Gulf Stream regions. Mantel tests showed that Jaccard dissimilarity values of microbial communities were significantly correlated with geographic distance (Mantel test, r = 0.36, p = 0.0007), although this correlation had weaker predictive power than the environmental variables of temperature (Mantel test, r = 0.73, p = 1e-04) and phosphate (Mantel test, r = 0.70, p = 1e-04).</p><p>Several diatom species contributed significantly to differences in community composition, which separated nearshore and offshore regions within the RDA (redundancy analysis, p &lt; 0.05; Figure <ref type="figure">6b</ref>). Of the ASVs with the largest associations with the two major axes of variation (axis loadings &gt; 0.14), one set characterized nearshore coastal waters and included Chaetoceros sp., Minidiscus trioculatus (two ASVs), Pseudo-nitzschia delicatissima, Skeletonema marinoi, Thalassiosira rotula, and Thalassiosira sp. (three ASVs). The presence of these species was significantly, positively correlated with higher phosphate concentrations (ANOVA permutation for redundancy analysis, F(1,24) = 6.03, p = 0.002) and associated with lower temperatures (Figure <ref type="figure">6b</ref>). A second set of ASVs characterized offshore regions and included Nitzschia sp., Haslea sp., Pseudo-nitzschia spp., and Leptocylindrus convexus.</p></div>
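<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Jaccard dissimilarity that underlies the community comparisons above depends only on which ASVs are present in each sample. A minimal Python sketch follows; the function name and the two example samples are hypothetical, and this is not the vegan implementation used in the study.</p>
<p>
```python
def jaccard_dissimilarity(sample_a, sample_b):
    """Jaccard dissimilarity between two sets of ASV identifiers.

    Computed as 1 - (shared ASVs / ASVs present in either sample).
    Two empty samples are treated as identical (dissimilarity 0.0),
    which is a convention chosen here for illustration.
    """
    a, b = set(sample_a), set(sample_b)
    union = a.union(b)
    if not union:
        return 0.0
    shared = a.intersection(b)
    return 1.0 - len(shared) / len(union)


# Hypothetical presence lists, invented for illustration only
offshore = ["ASV_1", "ASV_2", "ASV_3"]
coastal = ["ASV_3", "ASV_4"]

# One shared ASV out of four in total gives 1 - 1/4
print(jaccard_dissimilarity(offshore, coastal))  # 0.75
```
</p>
<p>A value of 0 means two samples contain exactly the same ASVs, and a value of 1 means they share none, so read abundance plays no role in the comparison.</p></div>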
<div xmlns="http://www.tei-c.org/ns/1.0"><head>DISCUSSION</head><p>Here we examined diatom 18S rDNA gene sequence variation along a transect in the western North Atlantic from high-nutrient coastal waters to the low-nutrient Sargasso Sea in order to expand observations of diatom diversity in oligotrophic waters and understand variations in community composition with environmental variables. Diatom ASVs were detected at every station along the transect from high-nutrient coastal waters to oligotrophic open ocean waters. Although higher nutrient coastal waters are most often associated with flourishing diatom communities <ref type="bibr">(Bracher et al., 2009;</ref><ref type="bibr">Smetacek, 2012)</ref>, the highest number of both ASVs and genera were observed in the oligotrophic Sargasso Sea. Furthermore, there was no significant relationship between ASV richness and chlorophyll a concentration, supporting recent work that observed relatively high diatom richness even in low chlorophyll a waters <ref type="bibr">(Fontaine &amp; Rynearson, 2023)</ref>.</p><p>Along the transect, ASV richness estimates were likely reasonable but incomplete reflections of actual diatom species diversity. On the one hand, intraindividual and intra-specific variation in the 18S rDNA gene could lead to overestimations of species richness (reviewed in <ref type="bibr">Santoferrara, 2019)</ref>. However, these differences are not resolved well using the 18S rDNA V4 region we amplified <ref type="bibr">(Zimmermann et al., 2011)</ref>, and thus diatom species richness is unlikely to be strongly overestimated by this variation. On the other hand, the rDNA region we amplified distinguishes among many, but not all, diatom species <ref type="bibr">(Zimmermann et al., 2011)</ref>, leading to underestimates of diatom diversity. 
This is a known issue for species that share identical sequences at the V4 rDNA region and cannot be distinguished from each other, including species in frequently occurring and ecologically important diatom genera observed in our dataset (e.g., Pseudo-nitzschia, <ref type="bibr">Ruggiero et al., 2022;</ref><ref type="bibr">Skeletonema, Luddington et al., 2012;</ref><ref type="bibr">Thalassiosira, Rynearson et al., 2020)</ref>.</p><p>FIGURE 5 Non-metric multidimensional scaling of Jaccard dissimilarity of diatom community composition separated by region. The Jaccard index was significantly different across the three regions (F 2,27 = 4.01, p = 0.001) and met the assumption of homogeneity (F 2,27 = 0.13, p = 0.881).</p><p>Overall, diatom richness, whether defined as the total number of ASVs or total genera per sample, did not change significantly from coastal to offshore waters, supporting previous studies that observed a comparable pattern <ref type="bibr">(Caputi et al., 2019;</ref><ref type="bibr">Malviya et al., 2016)</ref>. Notably, this pattern differs from two studies that observed lower diatom diversity in oligotrophic waters <ref type="bibr">(Busseni et al., 2020;</ref><ref type="bibr">James et al., 2022)</ref>, which may be explained by a variety of underlying causes. First, patterns of diversity may differ among locations due to oceanographic conditions. For example, the decreasing nearshore-offshore gradient of Shannon diversity observed in the Southern California Current <ref type="bibr">(James et al., 2022)</ref> was related to nitracline depth that was influenced by variations in regional upwelling. Second, different diversity metrics, such as species richness and the Shannon-Wiener index, reflect different aspects of community composition and are not directly comparable <ref type="bibr">(Santini et al., 2017)</ref>. 
Finally, patterns of diversity obtained may be related to the size range of diatoms examined, which varies among studies, including those focused on diatoms &gt;20 &#956;m in diameter <ref type="bibr">(Busseni et al., 2020)</ref>. Here, 29% of the ASVs that could be identified to the species level had reported cell diameters of &lt;20 &#956;m. The prevalence of small diatoms in our study and the studies of others <ref type="bibr">(James et al., 2022;</ref><ref type="bibr">Malviya et al., 2016)</ref> suggests that in the future, diatoms should be harvested on &#8804;3 &#956;m filters in both oligotrophic and eutrophic waters to capture the full size spectrum of diatom species and allow for further investigation of diversity patterns across oceanographic regions.</p><p>FIGURE 6 Transformation-based redundancy analysis of diatom community composition highlighting environmental variables (a) and species (b) that best explain the relationship among water samples. Phosphate concentration (F 1,24 = 6.03, p = 0.002) and temperature (F 1,24 = 5.42, p = 0.002) were the only significant environmental variables (asterisk). The gray box in (a) is expanded in (b), which shows only species with axis loadings greater than 0.14. Only axis RDA1 was statistically significant (RDA1, F 1,24 = 6.05, p = 0.002). Surface samples corresponded to depths of 25 m (Sargasso Sea &amp; Gulf Stream) or 5-15 m (coastal water) above the deep chlorophyll maximum (circles), and deep samples corresponded to depths taken at the deep chlorophyll maximum or below (40-80 m in the Sargasso Sea &amp; Gulf Stream; 25 m in coastal water; triangles). Axes: RDA1 (19.4%) and RDA2 (3.5%). Species labeled in (b): ASV 14 Skeletonema marinoi; ASV 29 Minidiscus trioculatus; ASV 53 Thalassiosira sp.; ASV 63 Pseudo-nitzschia delicatissima; ASV 73 Minidiscus trioculatus; ASV 108 Thalassiosira rotula; ASV 112 Thalassiosira sp.; ASV 211 Thalassiosira sp.; ASV 612 Chaetoceros sp.; ASV 2 Pseudo-nitzschia sp.; ASV 13 Nitzschia sp.; ASV 85 Nitzschia sp.; ASV 134 Pseudo-nitzschia delicatissima; ASV 215 Pseudo-nitzschia sp.; ASV 234 Leptocylindrus convexus; ASV 245 Haslea sp.</p><p>Although diatom richness was indistinguishable across regions, there were significant differences in community composition. Dissimilarities in community composition were significantly correlated with geographic distance, although the predictive power of that relationship was only about half that of the environmental variables of phosphate concentration and temperature, supporting experimental work that has documented swift species sorting in response to nutrient availability and temperature <ref type="bibr">(Anderson et al., 2022)</ref>. Surface waters of the western Sargasso Sea are low in phosphate relative to other ecosystems <ref type="bibr">(Cavender-Bares et al., 2001;</ref><ref type="bibr">Lomas et al., 2010;</ref><ref type="bibr">Wu, 2000)</ref>, and N : P ratios in the offshore samples were well above the Redfield ratio <ref type="bibr">(Redfield, 1958)</ref>. 
Phytoplankton communities in the western Sargasso Sea frequently exhibit evidence of phosphorus stress, such as phosphorus sparing, and the utilization of organic phosphorus sources <ref type="bibr">(Lomas et al., 2004,</ref><ref type="bibr"> 2010;</ref><ref type="bibr">Van Mooy et al., 2009)</ref>. Oligotrophic diatoms may also have novel low-nutrient adaptations, such as buoyancy control and migration to the nutricline <ref type="bibr">(Arrieta et al., 2020;</ref><ref type="bibr">Falciatore &amp; Bowler, 2002)</ref>. Stress responses to phosphorus limitation can be taxon-specific <ref type="bibr">(Lomas et al., 2004)</ref>, and it may be that different diatom phosphorus stress adaptations <ref type="bibr">(Dell'Aquila et al., 2020;</ref><ref type="bibr">Dyhrman et al., 2012)</ref> play a role in the significant correlation observed between phosphate concentrations and diatom community composition in this study. Community composition also varied significantly with temperature, and temperature gradients, such as the 18&#176;C range observed in this study, are expected to select for taxa with differing thermal tolerances <ref type="bibr">(Anderson et al., 2021;</ref><ref type="bibr">Anderson &amp; Rynearson, 2020;</ref><ref type="bibr">Thomas et al., 2012)</ref>. Diatoms surviving in relatively warmer, oligotrophic waters likely have a higher thermal maximum than coastal taxa of the colder northeast shelf, a factor known to contribute to diatom community structure and distribution <ref type="bibr">(Anderson et al., 2021;</ref><ref type="bibr">Thomas et al., 2012)</ref>. Several ASVs were responsible for explaining variation between offshore and nearshore community structure. For example, ASVs in the genera Pseudo-nitzschia, Nitzschia, and Leptocylindrus were positively correlated with both the lower phosphate concentrations and higher temperatures of the offshore region. 
In contrast, ASVs in the genera Thalassiosira, Chaetoceros, and Skeletonema were positively correlated with higher phosphate concentrations and lower temperatures of the nearshore region.</p><p>Although individual ASVs were associated with particular environments, over half of the genera in this dataset were present in both oligotrophic and eutrophic environments, suggesting that closely related species within the same genus can have distinct adaptations to either environment. For example, some species within the cosmopolitan genera Rhizosolenia and Chaetoceros are known to form large blooms in eutrophic coastal waters (Chaetoceros socialis, <ref type="bibr">Booth et al., 2002;</ref><ref type="bibr">and Rhizosolenia delicatula, Sournia et al., 1987)</ref>, while other species in these same genera survive in oligotrophic waters by fulfilling their nitrogen requirements through symbioses with nitrogen-fixing cyanobacteria (Rhizosolenia clevei, <ref type="bibr">Villareal, 1989;</ref><ref type="bibr">and Chaetoceros compressus, G&#243;mez et al., 2005)</ref>. Furthermore, up to 11% of all ASVs found in this study could potentially be associated with nitrogen-fixing cyanobacteria <ref type="bibr">(Foster &amp; Zehr, 2019)</ref>, and many of those are known to have congeners that have no known associations with diazotrophs. A number of adaptations to both low N and P have been identified in diatoms from nutrient-rich coastal waters, including luxury nitrate uptake <ref type="bibr">(Dortch, 1982;</ref><ref type="bibr">Dortch et al., 1984)</ref> and phospholipid substitution <ref type="bibr">(Dyhrman et al., 2012;</ref><ref type="bibr">Martin et al., 2011;</ref><ref type="bibr">Zhang et al., 2022)</ref>. However, studies of diatom nutrient physiology are rarely done with isolates from oligotrophic regions like the Sargasso Sea. 
Our data suggest that many of the oligotrophic species we identified are members of the same genera as well-studied eutrophic species, highlighting the value of evaluating adaptive responses in diatom isolates from warm, oligotrophic regions as they may carry different traits than isolates from colder, more nutrient-rich waters.</p><p>This dataset provided the opportunity to investigate whether shifts in community composition between oligotrophic and eutrophic waters were associated with shifts in species with different cell sizes. Cell volumes for species found in our study suggest that less than a quarter of the diatom community in coastal waters was composed of small cells, compared to half of the diatom community in oligotrophic regions. Due to their high surface-area-to-volume ratios, nutrient uptake kinetics in small-celled species are advantageous in oligotrophic regions <ref type="bibr">(Finkel et al., 2010)</ref>. Recent work has pointed to the ecological and biogeochemical importance of small diatoms <ref type="bibr">(Leblanc et al., 2018)</ref>, particularly in the offshore waters of the western North Atlantic during the spring bloom <ref type="bibr">(Bola&#241;os et al., 2020)</ref>. Although they are small, diatoms like Minidiscus can achieve rapid sinking rates primarily via cell aggregation, and thus can contribute significantly to export production <ref type="bibr">(Leblanc et al., 2018;</ref><ref type="bibr">Richardson &amp; Jackson, 2007)</ref>. Notably, small diatoms are also being increasingly recognized as ecologically important inhabitants in high-nutrient waters <ref type="bibr">(Arsenieff et al., 2020;</ref><ref type="bibr">Rynearson et al., 2020)</ref>, waters in which we also identified numerous small-celled species, including those in the genus Minidiscus as well as P. delicatissima and Thalassiosira minima. 
Although analysis of cell volumes was limited to those ASVs identified to species level and whose cell size measurements were included in the database compiled by <ref type="bibr">Leblanc et al. (2018)</ref>, patterns of cell sizes emerged between offshore and nearshore regions and provided an initial examination of cell size differences across nutrient gradients in the western North Atlantic.</p><p>Diatom abundance in the Sargasso Sea changes throughout an annual cycle <ref type="bibr">(Hulburt et al., 1960;</ref><ref type="bibr">Steinberg et al., 2001)</ref>, and the diatom community we identified represents a snapshot in time. Although taxonomic investigations of species composition in the Sargasso Sea are rare, the most species-rich time of year appears to be mid-April, just 2 weeks before our samples were collected, and studies have observed five species found in our study: Bacteriastrum hyalinum, Chaetoceros decipiens, Guinardia flaccida, Leptocylindrus danicus, and Thalassionema frauenfeldii <ref type="bibr">(Hulburt et al., 1960;</ref><ref type="bibr">Hulburt &amp; Rodman, 1963)</ref>. The persistence of these diatom species over decades suggests they are long-term inhabitants of the oligotrophic Sargasso Sea. More broadly, we recovered 36 genera, including 13 genera identified in <ref type="bibr">Hulburt et al. (1960)</ref> and <ref type="bibr">Hulburt and Rodman (1963)</ref>. Importantly, these historical studies reported that small centric diatoms were the most abundant cells at depth during the April bloom, but they were too small to be identified to species level. Our work suggests these may have included the species Minidiscus trioculatus, Minutocellus polymorphus, or Thalassiosira minima (for full species list see Table <ref type="table">S4</ref> in the Supporting Information). 
Following the April bloom, species diversity and abundance in surface waters were low throughout the remainder of the year <ref type="bibr">(Hulburt et al., 1960)</ref>, an observation supported by pigment data from the Bermuda Atlantic Time Series <ref type="bibr">(Steinberg et al., 2001)</ref>. Intermittent diatom-dominated blooms have been observed at other times of year in response to eddy-driven nutrient inputs <ref type="bibr">(Allen et al., 2005;</ref><ref type="bibr">Krause et al., 2010;</ref><ref type="bibr">McGillicuddy et al., 2007)</ref>. Our work suggests that potential seed populations for such blooms contain high diversity, with the majority of all ASVs identified here (70%) originating exclusively from oligotrophic waters.</p><p>Only a small percentage of ASVs (5%) were found in both nearshore and offshore regions and included several unidentified Bacillariophyceae, three species of Guinardia (G. striata, G. flaccida, and G. delicatula), Minutocellus polymorphus, Cerataulina pelagica, and Chaetoceros costatus, among others. Species in the genera Minutocellus and Chaetoceros are cosmopolitan <ref type="bibr">(De Luca et al., 2019;</ref><ref type="bibr">Leblanc et al., 2018;</ref><ref type="bibr">Malviya et al., 2016)</ref> and able to survive across different ocean environments. In addition, C. pelagica has been found in a variety of environmental conditions, including both high-nutrient coastal and oligotrophic waters <ref type="bibr">(Brun et al., 2015;</ref><ref type="bibr">Carstensen et al., 2015)</ref>. The ability of certain diatom species to survive across disparate environments may relate to their metabolic plasticity and the flexibility of gene regulation reflecting a generalist strategy <ref type="bibr">(Glibert, 2016;</ref><ref type="bibr">Margalef, 1978,</ref><ref type="bibr"> 1979)</ref>. 
Alternatively, there are also well-known examples of ecologically distinct diatom species with identical 18S V4 rDNA regions <ref type="bibr">(Cerino et al., 2005;</ref><ref type="bibr">Luddington et al., 2012)</ref>, highlighting that some of the observed cosmopolitan ASVs may in fact represent different species along the nutrient and temperature gradients we sampled. The percentage of reads that could not be identified even to the genus level was 40%, consistent with previous work examining diatom diversity <ref type="bibr">(Malviya et al., 2016)</ref> and highlighting large gaps in our ability to connect ASVs and their patterns of occurrence with species-specific or even genus-specific physiology.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CONCLUSIONS</head><p>Understanding the diversity and distribution of diatoms is important in establishing their contribution to nutrient cycling and carbon export, considering that diatoms account for ~20% of global carbon cycling <ref type="bibr">(Falkowski, 1998;</ref><ref type="bibr">Field, 1998;</ref><ref type="bibr">Tr&#233;guer et al., 2018)</ref> yet are understudied in open ocean oligotrophic regions that constitute a majority of the global ocean <ref type="bibr">(Busseni et al., 2020;</ref><ref type="bibr">Malviya et al., 2016)</ref>. Across this transect, just one-third of the ASVs could be identified to the species level (30%). This speaks to the need for continued work to connect described species with their rDNA gene sequences as well as for additional taxonomic and physiological research to identify and interrogate the metabolic capabilities of new species (e.g., <ref type="bibr">Luddington et al., 2012)</ref>. Strikingly, the percentage of reads that could be identified to species level was similar across the transect from nearshore to offshore, indicating no substantial difference in the ability to assign taxonomy to diatoms from the two habitats.</p><p>Our study determined that oligotrophic regions of the western North Atlantic supported taxonomically diverse diatom communities that were distinct from those in coastal waters. Resource partitioning through niche differentiation and competition between species likely facilitate this diversity, especially in oligotrophic environments where there is competition for scarce resources <ref type="bibr">(Huisman &amp; Weissing, 1999;</ref><ref type="bibr">Kerr et al., 2002;</ref><ref type="bibr">Menden-Deuer &amp; Rowlett, 2014;</ref><ref type="bibr">Tilman et al., 1982)</ref>. 
This species diversity might also allow for resilience in the face of environmental change through a diversity of physiological capacity or through functional redundancy that allows one diatom species to take the place of another without driving shifts in ecosystem functions like primary production or carbon export <ref type="bibr">(Tilman et al., 2014)</ref>. Given that oligotrophic regions are predicted to expand in coming years <ref type="bibr">(Polovina et al., 2008)</ref>, understanding patterns of diversity in low-nutrient, open ocean regions is important for projecting diatom diversity and resilience in the future ocean and recognizing the implications this has for global biogeochemical cycling.</p></div></body>
		</text>
</TEI>
