<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Hydrocarbon metabolism and petroleum seepage as ecological and evolutionary drivers for Cycloclasticus</title></titleStmt>
			<publicationStmt>
				<publisher>The ISME Journal</publisher>
				<date>12/18/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10568761</idno>
					<idno type="doi">10.1093/ismejo/wrae247</idno>
					<title level='j'>The ISME Journal</title>
<idno>1751-7362</idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Eleanor C Arrington</author><author>Jonathan Tarn</author><author>Veronika Kivenson</author><author>Brook L Nunn</author><author>Rachel M Liu</author><author>Blair G Paul</author><author>David L Valentine</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Aqueous-soluble hydrocarbons dissolve into the ocean’s interior and structure deep-sea microbial populations influenced by natural oil seeps and spills. n-Pentane is a seawater-soluble, volatile compound abundant in petroleum products and reservoirs and will partially partition to the deep-water column following release from the seafloor. In this study, we explore the ecology and niche partitioning of two free-living Cycloclasticus strains recovered from seawater incubations with n-pentane and distinguish them as an open ocean variant and a seep-proximal variant, each with distinct capabilities for hydrocarbon catabolism. Comparative metagenomic analysis indicates the variant more frequently observed further from natural seeps encodes more general pathways for hydrocarbon consumption, including short-chain alkanes, aromatics, and long-chain alkanes, and also possesses redox versatility in the form of respiratory nitrate reduction and thiosulfate oxidation; in contrast, the seep variant specializes in short-chain alkanes and relies strictly on oxygen as the terminal electron acceptor. Both variants observed in our work were dominant ecotypes of Cycloclasticus observed during the Deepwater Horizon disaster, a conclusion supported by 16S rRNA gene analysis and read-recruitment of sequences collected from the submerged oil plume during active flow. A comparative genomic analysis of Cycloclasticus across various ecosystems suggests distinct strategies for hydrocarbon transformations among each clade. Our findings suggest Cycloclasticus is a versatile and opportunistic consumer of hydrocarbons and may have a greater role in the cycling of sulfur and nitrogen, thus contributing broad ecological impact to various ecosystems globally.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Introduction</head><p>Much of the petroleum entering the ocean annually is introduced near the seafloor from human-caused incidents such as pipeline ruptures, well blowouts, and leaking submerged oil tankers, alongside other deep hydrocarbon inputs originating from natural oil seepage and hydrothermal vents. Following petroleum release to the seafloor, several compounds dissolve into seawater due to their aqueous solubility, subsequently affecting the microbial community within the surrounding water column <ref type="bibr">[1]</ref><ref type="bibr">[2]</ref><ref type="bibr">[3]</ref>. These semi-aqueous-soluble compounds can be overlooked as drivers of microbial metabolism in the deep community because these compounds often evaporate from surface oil slicks exposed to the atmosphere, which receive the majority of attention from agencies and scientists responding to oil-related incidents. This work focuses on the semi-aqueous-soluble compound n-pentane, which is known to partition to the deep ocean following release from the seafloor <ref type="bibr">[2,</ref><ref type="bibr">3]</ref>.</p><p>Petroleum exposure to seawater substantially decreases prokaryotic diversity due to a strong selection for hydrocarbon-degrading microorganisms and toxic effects on other taxa <ref type="bibr">[4,</ref><ref type="bibr">5]</ref>. Models of in-situ hydrocarbon biodegradation indicate that as a water parcel encounters a hydrocarbon source, a seed population of hydrocarbon degraders grows abundantly <ref type="bibr">[6]</ref><ref type="bibr">[7]</ref><ref type="bibr">[8]</ref><ref type="bibr">[9]</ref><ref type="bibr">[10]</ref><ref type="bibr">[11]</ref>. The origin and ecology of these seed populations are primarily hypothetical. 
Many studies have suggested seed populations are prolonged and sustained by hydrocarbon substrates originating from various sources, including hydrothermal vents <ref type="bibr">[12]</ref>, cyanobacteria, and eukaryotic phytoplankton <ref type="bibr">[12]</ref><ref type="bibr">[13]</ref><ref type="bibr">[14]</ref>, as well as natural gas seepage <ref type="bibr">[15,</ref><ref type="bibr">16]</ref>. As an example, the ubiquitous alkane degrader Alcanivorax exhibits basal cell populations that range from 10 to 5,000 cells per mL in uncontaminated seawater <ref type="bibr">[6,</ref><ref type="bibr">17]</ref>, with recent work suggesting high native abundance is subsidized by widespread biosynthesis of long-chain n-alkanes by marine phytoplankton <ref type="bibr">[13,</ref><ref type="bibr">18]</ref>. Other recent evidence has shown that methanotrophs can be physically transported on bubbles from a gas seep <ref type="bibr">[19,</ref><ref type="bibr">20]</ref>, pointing to seeps as a physical mechanism to seed the water column with hydrocarbon degraders. Alternatively, facultative hydrocarbon degraders could be present that rely on other metabolic inputs such as amino acids, carbohydrates, lipids, or other organic acids and switch to hydrocarbons under appropriate conditions <ref type="bibr">[21]</ref><ref type="bibr">[22]</ref><ref type="bibr">[23]</ref>. Very few studies have focused on how these factors control the development of a petroleum-degrading community during oil spills in previously uncontaminated waters.</p><p>Our investigation <ref type="bibr">[18]</ref> of the ocean's biological hydrocarbon cycle revealed the microbial response to n-pentane is structured by proximity to seepage (Fig. <ref type="figure">1</ref>). This previous work and our current study focus on sea-going incubations conducted with water collected from the deep ocean (1,000 m) along a transect spanning the Gulf of Mexico (GOM) and the North Atlantic. 
n-Pentane metabolism was observed through a closed-system optical oxygen technique. Blooms were designated as present when three consecutive time points exhibited oxygen loss greater than 0.21 &#181;M h&#8315;&#185; after normalization to unamended controls. This definition was based on the finding that each incubation that met this threshold continued to see oxygen decline at this rate or greater until reaching near-hypoxic conditions, suggesting a bloom-like state. We observed distinct bloom response times to n-pentane in relation to natural seepage, whereby bloom onset on pentane is ~9&#215; faster in the seep-ridden Northwest Gulf of Mexico compared to the water underlying the North Atlantic subtropical gyre <ref type="bibr">[18,</ref><ref type="bibr">24]</ref>. Median bloom times varied from 72.9 days furthest from natural seepage to 8.3 days closest to natural seepage. The fraction of samples that exhibited a bloom response also coincided with proximity to natural seepage, with 100% of incubations blooming within 30 days near seepage, 33% blooming further from seepage in the northeastern Gulf of Mexico, and 0% of incubations blooming outside the Gulf within that same timeframe (Fig. <ref type="figure">1b</ref>). While this previous work focused on the timing and occurrence of respiratory blooms on n-pentane, the study did not explore which microorganisms were responsible for n-pentane consumption or how natural seepage influences which microorganism responds in each of these settings.</p><p>Here, we investigate the influence of biogeography on microbial hydrocarbon metabolism by analyzing the genomic content of organisms with contrasting bloom responses and source water origins within the Gulf of Mexico (GOM). We extract and analyze blooms' 16S rRNA gene content, assemble high-quality draft metagenome assembled genomes (MAGs) from bloom experiments, and perform a complementary proteomic analysis to evaluate metabolism. 
The predominant microbial member of the n-pentane-enriched community across the Gulf of Mexico belongs to the Cycloclasticus genus, initially named for their metabolic capability to consume polycyclic aromatic hydrocarbons (PAHs). Cycloclasticus belongs to the class Gammaproteobacteria and is often detected to be abundant in oil-rich oceanic regions <ref type="bibr">[25]</ref><ref type="bibr">[26]</ref><ref type="bibr">[27]</ref>. Cycloclasticus has also been found as a symbiont in the tissues of mussels and sponges at deep-sea oil seeps, where they likely metabolize short-chain alkanes <ref type="bibr">[15,</ref><ref type="bibr">[26]</ref><ref type="bibr">[27]</ref><ref type="bibr">[28]</ref><ref type="bibr">[29]</ref>. All cultures of Cycloclasticus have been isolated on aromatic substrates and are closely related according to 16S rRNA gene phylogeny <ref type="bibr">[15,</ref><ref type="bibr">25,</ref><ref type="bibr">[30]</ref><ref type="bibr">[31]</ref><ref type="bibr">[32]</ref>. We observe two strains of Cycloclasticus bloom in response to short-chain alkanes in the Gulf of Mexico, with one strain favoring the seep-influenced region of the northwestern GOM (hereafter referred to as the seep variant, SV). In contrast, the other strain favors the open ocean region far from natural seepage (hereafter referred to as the open ocean variant, OOV).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Materials and Methods</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Incubation design and sample collection</head><p>Seawater samples were collected on two research cruises aboard the RV Atlantis in June 2015 and the RV Neil Armstrong in May 2017. n-Pentane incubations were conducted at stations 1 (40&#176; 9.14&#697; N, 68&#176; 19.889&#697; W), 2 (33&#176; 58.21&#697; N, 69&#176; 43.38&#697; W), 3 (27&#176; 30.41&#697; N, 87&#176; 12.41&#697; W), 4 (27&#176; 15.00&#697; N, 89&#176; 05.05&#697; W), 5 (27&#176; 11.60&#697; N, 90&#176; 41.75&#697; W) and 6 (27&#176; 38.40&#697; N, 90&#176; 54.98&#697; W) with seawater collected from 1,000 m. Respiration data and methods are available from <ref type="bibr">[18]</ref>, with sample numbers re-named for this study to exclude irrelevant data. Seawater collected from the CTD Niskin bottles was transferred to 250 mL glass serum vials using a small length of Tygon tubing. Vials were overflowed with at least three volumes of water, and no air bubbles were present before sealing with a polytetrafluoroethylene (PTFE) coated chlorobutyl rubber stopper and crimp cap seal. All bottles, except for unamended blank controls, immediately received 10 &#181;L of n-pentane using a gas-tight syringe (Hamilton) and were maintained in the dark at in-situ temperature (4 &#176;C). Before filling, a contactless optical oxygen sensor (Pyroscience) was fixed to the inner side of each serum bottle with silicone glue; the bottles were then cleaned of organic contaminants with triple rinses of ethanol, 3% hydrogen peroxide, 10% hydrochloric acid, and MilliQ water and sterilized by autoclaving. Oxygen concentration was monitored approximately every 8 hours with a fiber optic oxygen meter (Pyroscience). 
Observed changes in oxygen content were normalized to unamended controls to correct for oxygen loss from background respiration processes and variability due to temperature changes.</p><p>After 27-30 days, each incubation was sacrificially harvested, samples were collected for nutrient analysis and cell count analysis, and the remaining seawater was filtered on a 0.22 &#181;m polyethersulfone filter and stored at -80&#176;C until further analysis.</p></div>
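<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration only, the control normalization described above and the bloom criterion stated in the Introduction (three consecutive time points with oxygen loss greater than 0.21 &#181;M per hour) could be sketched as follows. The function names, the per-interval rate calculation, and the choice to report the first time point of the qualifying run as bloom onset are assumptions made for this sketch and are not taken from the original analysis.</p>
<p><![CDATA[
```python
from typing import List, Optional, Sequence

BLOOM_RATE_UM_PER_H = 0.21  # threshold from the text, in µM O2 per hour
CONSECUTIVE_POINTS = 3      # run length from the text


def oxygen_loss_rates(
    hours: Sequence[float],
    amended_um: Sequence[float],
    control_um: Sequence[float],
) -> List[float]:
    """Control-normalized oxygen loss rate (µM/h) for each sampling interval.

    Assumption: the rate at a time point is the drop in (amended - control)
    oxygen since the previous time point, divided by the elapsed hours.
    """
    net = [a - c for a, c in zip(amended_um, control_um)]
    rates = []
    for i in range(1, len(hours)):
        elapsed = hours[i] - hours[i - 1]
        rates.append((net[i - 1] - net[i]) / elapsed)  # positive means loss
    return rates


def bloom_onset_hours(
    hours: Sequence[float],
    amended_um: Sequence[float],
    control_um: Sequence[float],
    threshold: float = BLOOM_RATE_UM_PER_H,
    run: int = CONSECUTIVE_POINTS,
) -> Optional[float]:
    """Return the first time point of the first qualifying run, or None."""
    rates = oxygen_loss_rates(hours, amended_um, control_um)
    streak = 0
    for i, rate in enumerate(rates):
        streak = streak + 1 if rate > threshold else 0
        if streak == run:
            # rates[i] belongs to hours[i + 1]; step back to the run's start.
            return hours[i + 2 - run]
    return None


if __name__ == "__main__":
    # Hypothetical readings every 8 hours; not data from the study.
    t = [0, 8, 16, 24, 32, 40]
    control = [200.0] * 6
    amended = [200.0, 199.5, 197.0, 194.0, 191.0, 188.0]
    print(bloom_onset_hours(t, amended, control))
```
]]></p>
<p>Subtracting the control before differencing removes background respiration and shared temperature drift in one step, which mirrors the normalization described in this section.</p></div>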
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Nutrient and cell enumeration</head><p>Before filtration, seawater was collected from incubations for cell enumeration via flow cytometry and nutrient analysis. 2 mL subsamples for prokaryotic cell abundance were fixed with 0.2% paraformaldehyde and quantified using the Millipore Guava EasyCyte 5HT flow cytometer as in <ref type="bibr">[33]</ref>. The dissolved nutrient (nitrate, phosphate, and ammonia) sample collection was conducted following the requirements of the University of California, Santa Barbara Marine Science Institute Analytical Lab. Seawater incubation samples were filtered through a 0.2 &#956;m polyvinylidene fluoride (PVDF) filter into triple-rinsed plastic HDPE 20 mL vials. Nutrient samples (&#8764;17 mL of water) were stored frozen at -20&#176;C until analysis. Dissolved nutrient concentrations were analyzed by flow injection analysis (FIA) using the QuikChem 8500 Series 2 (Lachat Instruments).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Deepwater Horizon archival sample</head><p>We extracted and analyzed two replicate archive environmental DNA samples collected from the Deepwater Horizon event on May 30th, 2010 from a depth of 1,090 m while the wellhead was still leaking into the Gulf of Mexico. Microbial biomass was filtered onto 0.2-micron Sterivex filters (Millipore) and stored at -80&#176;C until further analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>DNA extraction, PCR amplification, and 16S rRNA gene analysis</head><p>DNA was analyzed from stations within the GOM and the DWH archival samples.</p><p>DNA extraction was performed from &#188; of each filter using the PowerSoil DNA extraction kit with the following modifications: 200 &#181;L of bead beating solution was removed at the initial step and replaced with phenol-chloroform, the C4 bead binding solution was supplemented with 600 &#181;L of 100% ethanol, and we added an additional column washing step with 650 &#181;L of 100% ethanol. Extracts were purified and concentrated by ethanol precipitation, then stored at -80 &#176;C. The V4 region of the 16S rRNA gene was amplified, and each sample was barcoded as previously described <ref type="bibr">[34]</ref> with the 515F-Y and 806RB primers as previously published <ref type="bibr">[35]</ref><ref type="bibr">[36]</ref><ref type="bibr">[37]</ref>. Amplicon PCRs contained 1 &#181;L of template DNA, 2 &#181;L of forward primer, 2 &#181;L of reverse primer, and 17 &#181;L of AccuPrime Pfx SuperMix. Thermocycling conditions consisted of 95&#176;C for 2 min, 30 cycles of 95&#176;C for 20 s, 55&#176;C for 15 s, and 72&#176;C for 5 min, and a final elongation at 72&#176;C for 10 min. Sample DNA concentrations were normalized using the SequelPrep Normalization Kit, cleaned using the DNA Clean and Concentrator kit, visualized on an Agilent Tapestation, and quantified using a Qubit Fluorometer. Samples were sequenced at the UC Davis Genome Center on the MiSeq platform (Illumina) with 250 nucleotide paired-end reads. 
A PCR-grade water sample was included in extraction, amplification, and sequencing as a negative control to assess for DNA contamination.</p><p>Trimmed fastq files were quality filtered using the fastqPairedFilter command within the dada2 R package, version 1.9.3 <ref type="bibr">[38]</ref>, with the following parameters: truncLen=c(190,190), maxN=0, maxEE=c(2,2), truncQ=2, rm.phix=TRUE, compress=TRUE, multithread=TRUE.</p><p>Quality-filtered reads were dereplicated using the derepFastq command. Paired dereplicated fastq files were joined using the mergePairs function with the default parameters. An amplicon sequence variant (ASV) table was constructed with the makeSequenceTable command, and potential chimeras were removed de novo using removeBimeraDenovo.</p><p>Taxonomic assignment of the sequences was done with the assignTaxonomy command using the Silva taxonomic training dataset formatted for DADA2 v132 <ref type="bibr">[39,</ref><ref type="bibr">40]</ref>. If sequences were not assigned to the Silva database, they were left as NA.</p></div>
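<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the quality-filtering parameters concrete, the following sketch re-expresses in Python what truncQ=2, truncLen=190, maxN=0, and maxEE=2 do to a single read and to a read pair. It is a simplified illustration of the filtering logic as documented for DADA2, not the R implementation used in this study; the function names are hypothetical, and the rm.phix, compress, and multithread options are omitted.</p>
<p><![CDATA[
```python
from typing import Sequence


def expected_errors(phred_scores: Sequence[int]) -> float:
    """Expected number of errors in a read: EE = sum(10 ** (-Q / 10))."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_filter(
    seq: str,
    quals: Sequence[int],
    trunc_len: int = 190,
    max_n: int = 0,
    max_ee: float = 2.0,
    trunc_q: int = 2,
) -> bool:
    """Return True if one read survives the filters (simplified sketch)."""
    # truncQ: cut at the first base with quality at or below trunc_q.
    for i, q in enumerate(quals):
        if q <= trunc_q:
            seq, quals = seq[:i], quals[:i]
            break
    # truncLen: reads shorter than trunc_len are discarded, others truncated.
    if len(seq) < trunc_len:
        return False
    seq, quals = seq[:trunc_len], quals[:trunc_len]
    # maxN: discard reads with more than max_n ambiguous bases.
    if seq.count("N") > max_n:
        return False
    # maxEE: discard reads whose expected errors exceed max_ee.
    return expected_errors(quals) <= max_ee


def pair_passes(fwd_seq, fwd_quals, rev_seq, rev_quals) -> bool:
    """A read pair is kept only if both mates pass."""
    return passes_filter(fwd_seq, fwd_quals) and passes_filter(rev_seq, rev_quals)


if __name__ == "__main__":
    read = "ACGT" * 50  # hypothetical 200-nt read
    print(passes_filter(read, [30] * 200))  # Q30 throughout: EE of about 0.19
    print(passes_filter(read, [10] * 200))  # Q10 throughout: EE of about 19
```
]]></p>
<p>Because the expected-error sum is dominated by low-quality bases, truncating to 190 nucleotides before applying maxEE removes much of the error-prone 3' tail from 250-nucleotide reads.</p></div>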
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Metagenomic sequencing and reconstruction</head><p>Metagenomic library preparation and high-throughput sequencing were conducted at the University of California Davis DNA Technologies Core. DNA was sequenced on the HiSeq4000 (Illumina) platform, producing 150-base pair (bp) paired-end reads with a targeted insert size of 400 bp. Quality control and adaptor removal were performed with Trimmomatic <ref type="bibr">[41]</ref> (v.0.36; parameters: leading 10, trailing 10, sliding window of 4, quality score of 25, minimum length 151 bp) and Sickle <ref type="bibr">[42]</ref> (v.1.33 with paired-end and Sanger parameters).</p><p>10-70% of the trimmed high-quality reads were randomly subsampled to deconvolute assembly and downstream binning in samples with very high coverage, as in <ref type="bibr">[43]</ref>. Subsamples of each metagenomic dataset were tested in increments of 10% to determine which percentage produced the highest quality Cycloclasticus MAG based on completion, redundancy, and number of scaffolds in the genome. The exception is sample "Cycloclasticus_sp_3_C5_1", which was tested in increments of 5% subsampled reads to reduce the number of scaffolds in the MAG to 1. The final subsampled percentage for each sample is noted in Supplementary Dataset S4. The program dRep was used to dereplicate all MAGs created from each subsampled dataset using default parameters, which group genomes based on initial 90% MASH (MinHash distance) clustering and the 95% average nucleotide identity <ref type="bibr">[44]</ref>. Only one dereplicated MAG was recovered from each sample except the DWH sample.</p><p>The subsampled high-quality reads were assembled using metaSPAdes The Genome Taxonomy Database <ref type="bibr">[53]</ref> (<ref type="url">https://data.ace.uq.edu.au/public/gtdb/data/releases/release89/89.0/</ref>, v.r89). 
The average nucleotide identity of each genome was determined with the ANI Matrix via the Enveomics tool collection <ref type="bibr">[54]</ref>.</p><p>We reconstructed high-quality metagenome assembled genomes (MAGs) from five pentane bloom samples, with completeness &gt;97% and redundancy &lt;2% (black stars in Fig. <ref type="figure">2</ref>). Three MAGs, named "6_C5_1", "6_C5_2", and "6_C5_3", originated from station 6 (natural seep region), and two MAGs, named "3_C5_1" and "3_C5_2", originated from station 3 (open ocean region). Based on the 16S rRNA gene analysis of the two DWH samples from this study, two variants of Cycloclasticus were present in the sample sequenced for metagenomics. For the DWH metagenome, the second variant related to OOV could not be recovered with metagenomics due to issues with assembly fragmentation and binning of the two closely related strains. To obtain a high-quality draft MAG of "MAG_DWH_1", we subsampled our reads by 50% until the OOV-related sequences were a small fraction of the assembled data.</p><p>The program dRep was used to dereplicate all MAGs reconstructed in this study using default parameters, which group genomes based on initial 90% MASH (MinHash distance) clustering and the 95% average nucleotide identity <ref type="bibr">[44]</ref>, which created two clusters of essentially identical genomes with "6_C5_1", "6_C5_2", and "6_C5_3" in one cluster and "3_C5_1" and "3_C5_2" in the other cluster. We also tested whether automated binning with MetaBAT2 <ref type="bibr">[55]</ref> would alter the resulting MAGs but found that dRep also clustered MAGs derived from the two different binning methods as essentially identical. 
We further note each MAG was recovered from biologically independent incubations, yet every component of metabolism and taxonomic marker analyzed was nearly identical within each ecotype variant; therefore, we will refer to "SV-MAG" as the three MAGs from station 6 and "OOV-MAG" as the two MAGs from station 3. We also downloaded single amplified genomes (SAGs) from the Joint Genome Institute that originated from the Deepwater Horizon event under the accession numbers 2599185270, 2599185276, 2599185294.</p></div>
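<div xmlns="http://www.tei-c.org/ns/1.0"><p>The subsampling procedure above selects, for each sample, the read fraction that yields the highest quality Cycloclasticus MAG based on completion, redundancy, and number of scaffolds. A minimal sketch of such a selection is given below. The ordering of the three criteria (completeness first, then redundancy, then scaffold count) is an assumption, as the text does not state how they were weighed, and all class names, function names, and values are hypothetical.</p>
<p><![CDATA[
```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MagStats:
    """Quality summary of the MAG recovered from one subsampled read set."""
    fraction: float      # fraction of trimmed reads used (0.10 to 0.70)
    completeness: float  # percent
    redundancy: float    # percent
    scaffolds: int


def best_subsample(candidates: Iterable[MagStats]) -> MagStats:
    """Pick the candidate with the best MAG.

    Assumed ranking: highest completeness, then lowest redundancy,
    then fewest scaffolds.
    """
    return max(
        candidates,
        key=lambda m: (m.completeness, -m.redundancy, -m.scaffolds),
    )


if __name__ == "__main__":
    # Hypothetical values for illustration; not results from the study.
    tested = [
        MagStats(0.10, 95.2, 1.1, 40),
        MagStats(0.20, 98.4, 0.9, 12),
        MagStats(0.30, 98.4, 0.9, 5),
        MagStats(0.40, 98.4, 1.6, 3),
    ]
    print(best_subsample(tested).fraction)
```
]]></p>
<p>A lexicographic key keeps the choice transparent; a weighted score would be an equally reasonable alternative if the criteria were meant to trade off against each other.</p></div>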
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Other metagenomic reconstructions</head><p>Using a 16S rRNA gene search tool through the Joint Genome Institute-Integrated Microbial Genomes and Microbiomes (JGI-IMG) portal, we identified public environmental metagenomic datasets with Cycloclasticus representation. These datasets were downloaded, and metagenomic reconstruction was performed according to the above protocol with the following modification: binning was performed using the automated binning software MetaBAT2 <ref type="bibr">[55]</ref>. Each Cycloclasticus MAG recovered was manually refined with Anvi'o (v.5) <ref type="bibr">[50,</ref><ref type="bibr">56]</ref> based on coverage uniformity and GC content. See the acknowledgments section regarding the origin of the Groves Creek Salt Marsh MAGs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Annotation</head><p>Open reading frames were predicted for MAGs using Prodigal <ref type="bibr">[57]</ref> (v.2.6.3; default parameters). Functional annotation was determined using HMMER3 <ref type="bibr">[58]</ref> (v. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Phylogenetics</head><p>To define genome phylogenomic relationships of MAGs, 16 universal ribosomal proteins (RPs) were used: L2-L6, L14-L16, L18, L22, L24, S3, S8, S10, S17, and S19. This dataset was not dereplicated using dRep to show variability in metabolism among closely related Cycloclasticus MAGs/genomes. For phylogenies of metabolic genes and ribosomal proteins, all representative sequences and concatenated alignments containing &lt;25% informative sites were excluded in tree construction. For phylogenetic trees of the PQQ-dependent alcohol dehydrogenase protein family, 16S rRNA gene, DMSO protein superfamily, and the copper-bound membrane monooxygenase (CuMMO) protein superfamily, all sequences used are in Supplementary Dataset S6 and Supplementary</p><p>. Genomes under "genome accession" were downloaded from NCBI or JGI and annotated according to the above "Annotation" section. The 16S rRNA genes were detected from the Cycloclasticus genome/MAG collection with RNAmmer <ref type="bibr">[66]</ref>. Among the collection of genomes used in this study, accession "TIGR03080" was used to find CuMMO/particulate hydrocarbon monooxygenase proteins, "TIGR01580" was used to find narG proteins, and the accession "PF01011" was used to find PQQ-dependent alcohol dehydrogenase proteins. In all phylogenetic trees, each protein was aligned using MUSCLE (v.3.8.425) <ref type="bibr">[62]</ref>. All columns with &gt;95% gaps were removed using TrimAL <ref type="bibr">[63]</ref>. Maximum-likelihood phylogenetic analysis of concatenated alignment was inferred by RAxML <ref type="bibr">[64]</ref> (v.8.9; parameters: raxmlHPC -T 4 -s input -N autoMRE -n result -f a -p 12345 -x 12345 -m PROTCATLG). The resulting trees were visualized using FigTree <ref type="bibr">[65]</ref> (v.1.4.3).</p></div>
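<div xmlns="http://www.tei-c.org/ns/1.0"><p>The gap-trimming step above (removal of all alignment columns with &gt;95% gaps) can be illustrated with a short stand-alone sketch. This is not the TrimAl implementation; it only reproduces the column rule stated in the text, and the function name, the use of "-" as the gap character, and the toy alignments are assumptions for illustration.</p>
<p><![CDATA[
```python
from typing import List, Sequence


def drop_gappy_columns(
    alignment: Sequence[str], max_gap_fraction: float = 0.95
) -> List[str]:
    """Remove columns whose fraction of gap characters exceeds the cutoff."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(alignment)
    keep = [
        col
        for col in range(length)
        if sum(row[col] == "-" for row in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


if __name__ == "__main__":
    # Toy alignment: the middle column is all gaps and is removed.
    print(drop_gappy_columns(["M-K", "M-R", "M-K"]))
```
]]></p>
<p>With a 95% cutoff, a column is dropped only when nearly every sequence lacks a residue there, so the rule removes alignment padding while keeping most informative sites.</p></div>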
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Metaproteomics</head><p>We analyzed metaproteomes from two of the OOV (open ocean variant) samples with corresponding MAGs ("3_C5_1" and "3_C5_2") as the reference databases. Proteins from each sample were extracted and prepared from &#188; filter (equivalent to ~60 mL of filtered water and ~1.3 &#215; 10&#8312; bacteria) for liquid chromatography and tandem mass spectrometry (LC-MS/MS) using a protocol adapted from <ref type="bibr">[67]</ref>. Briefly, filters were cut into 2 mm pieces and submerged in 100 &#181;L of 6 M urea and 600 &#181;L of 50 mM NH&#8324;HCO&#8323; and sonicated (Branson 250 Sonifier; 20 kHz, 5 x 20 sec on ice) to lyse cells. Protein concentrations for each sample were quantified in triplicate using a Bicinchoninic Acid protein assay kit (Pierce Thermo Scientific) using a microplate reader. Proteins within the lysate were reduced and alkylated using dithiothreitol and iodoacetamide, respectively, digested with Trypsin (12 h; 1:20 enzyme to protein) and desalted with C18 centrifugal spin columns. Peptides were dried down and resuspended in 2% ACN, 0.1% formic acid before analysis with a nanoAcquity UPLC (Waters Corp., Milford, MA) in line with a Q-Exactive-HF (Thermo Fisher Scientific, Waltham, MA). Reverse phase chromatography was achieved</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Read-recruitment analysis</head><p>To better understand the biogeography of OOV-MAG and SV-MAG, we searched a representative genome from each variant against the Branchwater web interface to learn which datasets among millions of global metagenomes indexed by Branchwater contain matches filtered to 0.97 cANI to either genome <ref type="bibr">[71]</ref>. All projects matching OOV-MAG or SV-MAG were downloaded and trimmed (according to the above parameters). To evaluate the prevalence of Cycloclasticus MAGs and single amplified genomes (SAGs) across these metagenomic datasets (including the Deepwater Horizon event), we first dereplicated the genomes/MAGs with stringency of 95% average nucleotide identity using dRep <ref type="bibr">[44]</ref>, then used Bowtie2 for read mapping <ref type="bibr">[47]</ref><ref type="bibr">[48]</ref> (v.2.3.4.1; default parameters) and analyzed the mapped reads using stringent parameters from InStrain <ref type="bibr">[72]</ref>; namely, we filtered reads based on 92% similarity and only noted a genome's presence when it achieved &gt;20% breadth (meaning at least 20% of the genome was detected) and the expected breadth was within 20% of the observed breadth (indicating the reads were mapped randomly across the genome) <ref type="bibr">[65]</ref>.</p></div>
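<div xmlns="http://www.tei-c.org/ns/1.0"><p>The presence criteria above can be written as a small decision rule, sketched below under stated assumptions. The expected-breadth curve, 1 - exp(-0.883 x coverage), is the empirical relationship we recall from the inStrain documentation and should be checked against the version used; reading "within 20%" as a relative difference with respect to the observed breadth is also an assumption, and the function names are hypothetical.</p>
<p><![CDATA[
```python
import math


def expected_breadth(coverage: float) -> float:
    """Breadth expected if reads fall evenly across a genome.

    Assumption: uses the empirical curve 1 - exp(-0.883 * coverage)
    recalled from the inStrain documentation.
    """
    return 1.0 - math.exp(-0.883 * coverage)


def genome_detected(
    observed_breadth: float,
    coverage: float,
    min_breadth: float = 0.20,
    tolerance: float = 0.20,
) -> bool:
    """Apply the two presence criteria described in the text."""
    # Criterion 1: more than 20% of the genome is covered by reads.
    if observed_breadth <= min_breadth:
        return False
    # Criterion 2: expected breadth lies within 20% of the observed breadth
    # (assumed here to be relative to the observed value).
    difference = abs(expected_breadth(coverage) - observed_breadth)
    return difference <= tolerance * observed_breadth


if __name__ == "__main__":
    # Hypothetical values for illustration; not results from the study.
    print(genome_detected(observed_breadth=0.55, coverage=1.0))
    print(genome_detected(observed_breadth=0.30, coverage=5.0))
```
]]></p>
<p>The second call illustrates the purpose of the expected-breadth check: high mean coverage concentrated on a small part of the genome, as happens when reads recruit only to conserved regions, fails the test even though the breadth exceeds 20%.</p></div>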
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Results and Discussion</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Variant biogeography</head><p>Incubations conducted with seawater from 1,000 m depth containing ambient</p><p>Among the blooms, we identified two dominant Cycloclasticus variants. One, called the "seep variant" (SV), was the primary blooming population at Station 6, located within the northwestern GOM seep field. The other, termed the "open-ocean variant" (OOV), was the primary blooming organism at Station 3, the more offshore petroleum-depleted region (Fig. <ref type="figure">1a</ref>, Fig. <ref type="figure">2a</ref>, Supplementary Data S3). The distribution of SV and OOV was more varied at Stations 4 and 5, with both variants present at Station 4, but only OOV blooming at Station 5 (Fig. <ref type="figure">2a</ref>). In incubations enriched with n-pentane and sequenced for 16S rRNA gene analysis-regardless of bloom status-OOV was found to be more than 10% abundant in five of eight incubations from Stations 3-5 (Supplementary Information, Fig. <ref type="figure">S3</ref>), indicating its numerical dominance in areas with patchy bloom patterns. In contrast, SV was detected in only two of the eight incubations. While we refer to these variants as 'SV, seep variant' and 'OOV, open-ocean variant,' it is important to note that each may occur across these environments, though we found them to be numerically dominant in their respective settings. A read-recruitment analysis showing the distribution of both SV-MAG and OOV-MAG across the Gulf of Mexico further supports this interpretation (Supplementary Dataset S5 and Supplementary Information Fig. <ref type="figure">S9</ref>).</p><p>Cell-specific respiration is higher for SV than the OOV (Fig. <ref type="figure">2b</ref>). The respiration profile (Fig. <ref type="figure">1c</ref>) of these variants also showed distinct patterns, where OOV presents more gradual oxygen consumption with time. 
The OOV was also observed in small relative abundances (&lt;1%) at Station 6 (Supplementary Dataset S2). In each n-pentane incubation where the two variants co-occurred, the SV numerically dominated by ~3 orders of magnitude except for the Colwellia/Cellvibrionaceae bloom at Station 6, where both variants were detected at &lt;1% abundance. This suggests that SV is better adapted to conditions associated with natural seepage, as it outcompeted OOV, which only bloomed at stations that are distant from seeps.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Pentane metabolism</head><p>Within both SV- and OOV-MAGs, we found genomic potential for n-pentane utilization for catabolism and anabolism (Fig. <ref type="figure">3</ref>). Our analyses were further supported by proteomic analysis of the OOV-MAG from Station 3 (gray stars in Fig. <ref type="figure">2a</ref>; Fig. <ref type="figure">4b</ref>). The first step in the consumption of n-pentane is the oxidation to pentanol, and we hypothesize that this step is catalyzed by a copper-containing membrane-associated monooxygenase, termed particulate hydrocarbon monooxygenase (phmo). The most well-characterized phmo is the particulate methane monooxygenase, which oxidizes methane to methanol <ref type="bibr">[73]</ref>. phmo has never been demonstrated to act on n-pentane, though it has shown activity on n-butane in previous work <ref type="bibr">[74]</ref>. We found multiple copies of genes encoding phmo in both Cycloclasticus MAG variants (Fig. <ref type="figure">3</ref>, Fig. <ref type="figure">4</ref>, Supplementary Information Fig. <ref type="figure">S8</ref>).</p><p>Each copy of phmo varies phylogenetically from the other copies within the same MAG, suggesting each operon may have different substrate specificities or capitalize on alkanes of varying substrate concentrations <ref type="bibr">[75]</ref> (Fig. <ref type="figure">4</ref>). Both SV-MAG and OOV-MAG have phmo sequences that form monophyletic clades with reference sequences with demonstrated affinity for ethane and butane. Both variants also contain a sequence that forms a monophyletic clade that is only distantly related to a pmmo (particulate methane monooxygenase); however, this clade contains no currently validated reference sequences, and we refer to its function as "unknown." Proteomics confirmed the expression of phmo, specifically subunits a and b (Fig. <ref type="figure">3</ref>, Fig. <ref type="figure">4</ref>) in the presence of pentane. 
The two phmo genes for which peptides were detected in OOV belong to a sequence from OOV-MAG in the "unknown" clade of phmos and one clade containing reference sequences with a demonstrated affinity for ethane and butane. The only detected hydrocarbon monooxygenase in SV-MAG is the phmo, supporting the hypothesis that this enzyme functions on n-pentane. AlkB, a gene known to function on medium to long-chain alkanes, was found encoded in the OOV-MAG; however, no peptides were observed in the proteomics analyses (Fig. <ref type="figure">3</ref>). Still, given the minimal sample size analyzed for proteomics and the potential for false negatives due to, e.g., ionization and extraction efficiencies, we do not exclude the possibility that alkB could also be active in these samples and used to consume n-pentane by the OOV.</p><p>The second step in the consumption of pentane is the conversion of pentanol to an aldehyde. In many bacteria that oxidize alcohols, this reaction is catalyzed by pyrroloquinoline quinone-dependent alcohol dehydrogenases (pqq-adh). We found genes encoding pqq-adh in both Cycloclasticus MAG variants and proteomic expression of PQQ-ADH in OOV-MAG samples (Fig. <ref type="figure">3</ref>, Supplementary Information Fig. <ref type="figure">S2</ref>). None of the pqq-adh genes formed a monophyletic clade with reference sequences known to act on methanol, providing evidence against methane metabolism in SV and OOV. In the third step of n-pentane consumption, the aldehydes are oxidized to carboxylic acids, which could be achieved via a tungsten-containing aldehyde ferredoxin oxidoreductase (aor), known to use short-chain alkane-derived aldehydes as their substrate <ref type="bibr">[15,</ref><ref type="bibr">76]</ref>. 
This conversion can also be performed by pqq-adh, as activity on aldehydes has been confirmed with reference sequences related to those encoded by SV and OOV (Supplementary Information Fig. <ref type="figure">S2</ref>). The resulting pentanoate is then likely beta-oxidized using acyl-CoA dehydrogenase and enoyl-CoA hydratase and shunted into central carbon metabolism via the citric acid cycle (Fig. <ref type="figure">3</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Differences in variant metabolic potential</head><p>The metabolic capabilities of the SV-MAG and OOV-MAG differ substantially (Fig. <ref type="figure">3</ref>, Fig. <ref type="figure">4</ref>). The OOV-MAG encodes general hydrocarbon metabolism, including a nearly complete pathway for toluene consumption: toluene monooxygenase-mediated conversion of toluene to benzoate (7 of 8 genes), benzoate conversion to catechol (3 of 4 genes), and catechol meta-cleavage to acetyl-CoA, which enters the tricarboxylic acid cycle (13 of 13 genes). The OOV-MAG also encodes toluene 2-monooxygenase, which converts benzene to catechol (6 of 6 genes) that can also be shunted through the same catechol meta-cleavage pathway as toluene to form acetyl-CoA (13 of 13 genes) and enter the tricarboxylic acid cycle. The OOV could also use the toluene 2-monooxygenase system to convert toluene to 3-methylcatechol (6 of 6 genes) and then convert 3-methylcatechol to acetyl-CoA and shunt it to the tricarboxylic acid cycle (3 of 5 genes). Furthermore, the OOV-MAG encodes alkB (1 of 1 gene), which is commonly used by other organisms for consumption of long-chain alkanes via beta-oxidation (OOV encodes 7 of 7 genes), resulting in propionyl-CoA and acetyl-CoA, which are also incorporated into the tricarboxylic acid cycle. Neither OOV-MAG nor SV-MAG (nor any other Cycloclasticus MAG analyzed in this study) encodes a complete canonical naphthalene degradation pathway (naphthalene 1,2-dioxygenase is missing from all genomes/MAGs), yet the strain Cycloclasticus SP-1 has been experimentally validated to use naphthalene as a sole carbon source (Fig. <ref type="figure">5b</ref>) <ref type="bibr">[28]</ref>. 
Cycloclasticus SP-1 and OOV-MAG both encode 3 of 10 genes for naphthalene degradation, which suggests that OOV can likely also metabolize naphthalene, whereas SV encodes 0 of 10 genes.</p><p>Overall, the SV-MAG lacks many metabolic pathways for longer-chain alkanes and aromatic compounds compared to the OOV-MAG, seemingly limiting its hydrocarbon metabolism potential (Fig. <ref type="figure">4</ref>, Fig. <ref type="figure">5b</ref>). These observed differences are consistent with SV specialization on short-chain, aqueous-soluble alkanes and a biogeography that includes seeding from the petroleum-rich source region in the Northern Gulf of Mexico. The genomic capacity for catabolism of multiple hydrocarbon classes in the OOV-MAG is consistent with an ability to capitalize on diffuse hydrocarbon sources that extend beyond natural seeps, including atmospheric deposition, terrestrial runoff, biogenic inputs, and oil spills. This enhanced capacity in OOV is consistent with an expanded biogeographic range relative to SV, which appears to be more highly reliant on substrate sourced from natural seepage (Supplementary Dataset S5 and Supplementary Information Fig. <ref type="figure">S9</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Anaerobic metabolism in Cycloclasticus</head><p>Anaerobic metabolism has yet to be observed in Cycloclasticus, and it remains unknown how these bacteria could contribute to hydrocarbon cycling in oxygen minimum zones or anoxic sediments. Here, we show the OOV-MAG of Cycloclasticus exhibits adaptations for life without oxygen, including the occurrence of genes for respiratory nitrate reductase (Nar), as well as a potential linkage to thiosulfate metabolism (Fig. <ref type="figure">4</ref>). In OOV-MAG, we identify a complete canonical nar operon (narGHJI) encoding: i) the &#945; subunit responsible for catalyzing NO₃⁻ reduction to NO₂⁻ (narG); ii) the iron- and sulfur-containing &#946; subunit (narH) that transfers electrons to the molybdenum cofactor of narG; iii) the narJ chaperone used in enzyme formation; and iv) the transmembrane &#947; subunit (narI) involved in electron transfer from membrane quinols to narH. Phylogenetic placement of Cycloclasticus narG sequences also confirms the relation to narG reference sequences (Supplementary Information Fig. <ref type="figure">S1</ref>).</p><p>OOV-MAG also contains the sox operon (soxCDXYZAB), which encodes periplasmic sulfur-oxidizing proteins (Fig. <ref type="figure">4</ref>). This operon can be used as a means of detoxification in some Gammaproteobacteria <ref type="bibr">[8]</ref>; however, we do not exclude the possibility that Cycloclasticus could employ a lithoheterotrophic strategy. The use of thiosulfate to supplement heterotrophy is a strategy that has been demonstrated in other Proteobacteria and could be useful in seeps and other benthic environments <ref type="bibr">[77]</ref>. It is unclear how members of Cycloclasticus may access n-pentane in the absence of oxygen. No enzymes related to alkyl succinate synthase were detected. 
Multiple putative hits for the DMSO protein superfamily were detected, and this superfamily encompasses a variety of functions, including the anaerobic alkane degradation enzyme, alkane C2 methylene hydroxylase; however, OOV-MAG sequences do not form a monophyletic clade with reference sequences of this function (data not shown).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Deepwater Horizon Cycloclasticus</head><p>The 2010 Deepwater Horizon blowout in the Gulf of Mexico induced blooms of Cycloclasticus in the deep ocean through large-scale intrusions of dissolved hydrocarbons <ref type="bibr">[78]</ref>. These DWH blooms included multiple Cycloclasticus 16S rRNA gene sequence variants, which led us to ask whether SV and OOV were among those DWH variants <ref type="bibr">[71]</ref>. We analyzed the 16S rRNA gene content and conducted high-throughput sequencing on a sample collected while active flow occurred from the wellhead into the GOM. At the depth at which the sample was collected, there was an oxygen anomaly characteristic of the respiratory response associated with the DWH subsurface intrusions <ref type="bibr">[79]</ref> (Supplementary Information Fig. <ref type="figure">S6</ref>). Upon initial analysis of the microbial community via the V4 region (252 bp) of the 16S rRNA gene, we found the SV-MAG to be identical to the dominant member of the DWH sample and OOV-MAG to be identical to the second most abundant Cycloclasticus 16S rRNA single nucleotide variant (Fig. <ref type="figure">2a</ref>).</p><p>Using read-recruitment of metagenomic sequences from the same sample, we find that DWH reads cover 98% of the SV-MAG at approximately 180X coverage and 100% of the OOV-MAG at approximately 21X coverage (Supplementary Dataset S5). We reconstructed a high-quality MAG, here named "MAG_DWH_1", which is 94% complete and 3.3% redundant. Upon expanding our analysis to the full-length 16S rRNA gene (as opposed to the V4 region in Fig. 2), we find that the SV-MAG is 99.5% identical to MAG_DWH_1. Through a phylogenomic analysis of 16 ribosomal proteins, we find MAG_DWH_1 forms a monophyletic clade with SV-MAG (Fig. <ref type="figure">5a</ref>, Supplementary Information Fig. 
<ref type="figure">S4</ref>). For comparison, we also drew from our previously published single amplified genomes (SAGs) from DWH, which are 71%, 49%, and 46% complete and herein referred to as "SAG_DWH_3", "SAG_DWH_1", and "SAG_DWH_2" <ref type="bibr">[15]</ref>. We find that "SAG_DWH_1" and "SAG_DWH_3" are closely related to OOV-MAG, whereas "SAG_DWH_2" appears to be related to SV-MAG (Supplementary Information Fig. <ref type="figure">S4</ref>). Average Nucleotide Identity analysis and the 16S rRNA gene phylogeny provide further support for these relationships of SV and OOV to the SAGs and MAG_DWH_1 (Supplementary Information Fig. <ref type="figure">S4</ref> and <ref type="figure">S5</ref>). These results indicate a previously unrecognized distinction in the microbial response to the DWH event: SV-like Cycloclasticus may have responded specifically to the highly abundant soluble n-alkanes, whereas OOV-like Cycloclasticus may have responded to soluble n-alkanes and other components, including benzene and toluene.</p><p>To further assess the ecological relevance of SV and OOV Cycloclasticus to DWH, we compared the phylogenetic placement of phmo from the SV- and OOV-MAGs with that of previously published transcripts from DWH subsurface plumes (Fig. <ref type="figure">4</ref>) <ref type="bibr">[80]</ref>. These results indicate that phmo genes most closely related to SV and OOV Cycloclasticus were expressed at high relative abundance during the DWH event, consistent with data showing the rapid microbial response by Cycloclasticus to short-chain n-alkanes (but not methane) concurrent with active discharge <ref type="bibr">[78]</ref>. The pulse of bacterial growth in the deep ocean from the DWH event has been estimated at &gt;10²³ cells, with a substantial fraction being SV Cycloclasticus. 
We therefore asked whether this level of ecological disturbance might have structured the hydrocarbon-degrading community in the GOM through 2015, when samples for this work were collected. More data are needed to assess this hypothesis rigorously. Other researchers did find that methanotrophic biomass remained elevated in the years following the DWH event, perpetuating elevated methanotrophic activity above the background levels existing before the disaster <ref type="bibr">[81]</ref>. Therefore, it remains possible that the Cycloclasticus observed in our pentane incubations was poised to bloom five years after the spill due to some form of memory effect from the large influx of biomass caused by the disaster.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Hydrocarbon metabolism across Cycloclasticus</head><p>To understand how hydrocarbon metabolic capability within Cycloclasticus relates to ecological and evolutionary patterns, we reconstructed Cycloclasticus MAGs from various environments using publicly available datasets (Supplementary Dataset S4). This effort resulted in eight high-quality MAGs with &gt;80% completion and &lt;2% redundancy. These eight MAGs are in addition to the five pentane MAGs and the one DWH-MAG already discussed and include one from the uncontaminated North Sea ("NS_1"), six from a coastal salt marsh in Skidaway Island, Georgia ("CSM_1", "CSM_2", "CSM_3", "CSM_4", "CSM_5", and "CSM_6"), and one from coastal seawater near Pivers Island, North Carolina ("CSW_1"). The 14 MAGs reconstructed for this study, along with other publicly available genomes, were used to form a phylogenomic tree of all Cycloclasticus (Fig. <ref type="figure">5a</ref>). Each genome was then scanned for hydrocarbon-related pathways of interest and other metabolic functions related to energy generation (Fig. <ref type="figure">5b</ref>).</p><p>From the phylogenetic analysis of ribosomal proteins alongside metabolic data, we observe distinct strategies by each major clade within the Cycloclasticus genus (Fig. <ref type="figure">5b</ref>).</p><p>All cultivated Cycloclasticus are very closely related to each other (Fig. <ref type="figure">5a</ref>, bottom clade, denoted "isolates"). We found no evidence of genes for consuming short-chain alkanes within this clade. This is a major bias in our understanding of Cycloclasticus, because all other Cycloclasticus MAGs analyzed contained phmo genes. We also observe two water-column clades from uncontaminated seawater that harbor diverse pathways for short-chain and long-chain alkanes, as well as near-complete pathways for naphthalene and xylene, and complete pathways for toluene and benzene consumption. 
Altogether, we find a minimum of seven clades within Cycloclasticus, seemingly unified as marine organisms that grow on aqueous-soluble hydrocarbons. One key factor distinguishing the clades is an evolved preference for certain classes of aqueous-soluble hydrocarbons over others.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Conclusion</head><p>Our study provides genomic and proteomic evidence for n-pentane metabolism by free-living members of the Cycloclasticus genus in contrasting oceanic regimes, one with prolific natural seep influence and another farther removed from prolific seepage. By comparing Cycloclasticus genomes and MAGs, we show that the hydrocarbon metabolism within this genus is not limited to PAH degradation, with genomic variability enabling different ecotypes to access different ecological niches and structural classes of hydrocarbons. The apparent commonality among Cycloclasticus is not the ability to consume aromatic hydrocarbons, as the genus name suggests, but rather a metabolic specialization among the subset of hydrocarbons that exhibit aqueous solubility in marine settings.</p><p>Figure 3. Carbon, nitrogen, and sulfur metabolism present in Cycloclasticus variants blooming on pentane. a Open ocean variant, OOV-MAG, and b Seep variant, SV-MAG. Yellow boxes indicate a reaction (and its reference number) that could be linked with a predicted metabolic function; see Supplementary Dataset S3. Blue boxes indicate that peptides for that enzyme were observed in proteomic data. Proteomics was performed only on OOV-MAG. If the reaction box describes multiple enzymes, only one needs to be observed in proteomic data for it to be colored blue. Enzyme abbreviations: particulate hydrocarbon monooxygenase phmo (A, B, C, D); PQQ-dependent alcohol dehydrogenase (pqq-adh); aldehyde oxidoreductase (aor); 2-methylcitrate synthase (2-mcs); 2-methylcitrate dehydratase (2-MCD); 2-methylcitrate isomerase (2-MAI); 2-methylisocitrate dehydratase (2-MID); methylisocitrate lyase (MICL); nitrite reductase (Nir); respiratory nitrate reductase (Nar); thiosulfate oxidation complex (Sox); alkane-1-monooxygenase (alkB); toluene monooxygenase (tmoA); phenol/toluene 2-monooxygenase (pH). 
The tricarboxylic acid (TCA) and beta-oxidation pathway are highlighted in blue and peach colors.</p><note type="other">Figure Legends</note></div>
		</body>
		</text>
</TEI>
