Establishing links between microbial diversity and environmental processes requires resolving the high degree of functional variation among closely related lineages or ecotypes. Here, we implement and validate an improved metagenomic approach that estimates the spatial biogeography and environmental regulation of ecotype-specific replication patterns (RObs) across ocean regions. A total of 719 metagenomes were analyzed from meridional Bio-GO-SHIP sections in the Atlantic and Indian Oceans. Accounting for sequencing bias and anchoring replication estimates in genome structure were critical for identifying physiologically relevant biological signals. For example, ecotypes within the dominant marine cyanobacterium Prochlorococcus exhibited distinct diel cycles in RObs that peaked between 19:00 and 22:00. Additionally, both Prochlorococcus ecotypes and ecotypes within the highly abundant heterotroph Pelagibacter (SAR11) demonstrated systematic biogeographies in RObs that differed from spatial patterns in relative abundance. Finally, RObs was significantly regulated by nutrient stress and temperature, and explained by differences in the genomic potential for nutrient transport, energy production, cell wall structure, and replication. Our results suggest that our new approach to estimating replication is reflective of gross population growth. Moreover, this work reveals that the interaction between adaptation and environmental change drives systematic, ecotype-specific variability in replication patterns across ocean basins, adding an activity-based dimension to our understanding of microbial niche space.
Detailed descriptions of microbial communities have lagged far behind physical and chemical measurements in the marine environment. Here, we present 971 globally distributed surface ocean metagenomes collected at high spatiotemporal resolution. Our low-cost metagenomic sequencing protocol produced 3.65 terabases of data, with a median of 3.41 billion base pairs per sample. The median distance between sampling stations was 26 km. The metagenomic libraries described here were collected as part of a biological initiative for the Global Ocean Ship-based Hydrographic Investigations Program, or “Bio-GO-SHIP.” One of the primary aims of GO-SHIP is to produce high spatial and vertical resolution measurements of key state variables to directly quantify climate change impacts on ocean environments. By similarly collecting marine metagenomes at high spatiotemporal resolution, we expect that this dataset will help answer questions about the link between microbial communities and biogeochemical fluxes in a changing ocean.
- PAR ID:
- 10360652
- Publisher / Repository:
- Nature Publishing Group
- Date Published:
- Journal Name:
- Scientific Data
- Volume:
- 8
- Issue:
- 1
- ISSN:
- 2052-4463
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract Sequence classification facilitates a fundamental understanding of the structure of microbial communities. Binary metagenomic sequence classifiers are insufficient because environmental metagenomes are typically derived from multiple sequence sources. Here we introduce a deep-learning-based sequence classifier, DeepMicroClass, that classifies metagenomic contigs into five sequence classes, i.e., viruses infecting prokaryotic or eukaryotic hosts, eukaryotic or prokaryotic chromosomes, and prokaryotic plasmids. DeepMicroClass achieved high performance for all sequence classes at various tested sequence lengths ranging from 500 bp to 100 kbp. By benchmarking on a synthetic dataset with variable sequence class composition, we showed that DeepMicroClass obtained better performance for eukaryotic, plasmid and viral contig classification than other state-of-the-art predictors. DeepMicroClass achieved performance comparable to geNomad and VirSorter2 on viral sequence classification when benchmarked on the CAMI II marine dataset. Using a coastal daily time-series metagenomic dataset as a case study, we showed that microbial eukaryotes and prokaryotic viruses are integral to microbial communities. By analyzing monthly metagenomes collected at HOT and BATS, we found relatively higher viral read proportions in the subsurface layer in late summer, consistent with the seasonal viral infection patterns prevalent in these areas. We expect DeepMicroClass will promote metagenomic studies of under-appreciated sequence types.
-
Abstract Historically, our understanding of bacterial ecology in the Indian Ocean has been limited to regional studies that place emphasis on community structure and function within oxygen‐minimum zones. Thus, bacterial community dynamics across the wider Indian Ocean are largely undescribed. As part of Bio‐GO‐SHIP, we sequenced the 16S rRNA gene from 465 samples collected on sections I07N and I09N. We found that (1) there were 23 distinct bioregions within the Indian Ocean, (2) the southeastern gyre had the largest gradient in bacterial alpha‐diversity, (3) the Indian Ocean surface microbiome was primarily composed of a core set of taxa, and (4) bioregions were characterized by transitions in physical and geochemical conditions. Overall, we showed that bacterial community structure spatially delineated the surface Indian Ocean and that these microbially defined regions were reflective of subtle ocean physical and geochemical gradients. Therefore, incorporating metrics of in situ microbial communities into marine ecological regions traditionally defined by remote sensing will improve our ability to delineate warm, oligotrophic regions.
-
Abstract Concentrations and elemental stoichiometry of suspended particulate organic carbon, nitrogen, phosphorus, and oxygen demand for respiration (C:N:P:−O2) play a vital role in characterizing and quantifying marine elemental cycles. Here, we present Version 2 of the Global Ocean Particulate Organic Phosphorus, Carbon, Oxygen for Respiration, and Nitrogen (GO-POPCORN) dataset. Version 1 is a previously published dataset of particulate organic matter from 70 different studies between 1971 and 2010, while Version 2 comprises data collected from recent cruises between 2011 and 2020. The combined GO-POPCORN dataset contains 2673 paired surface POC/N/P measurements from 70°S to 73°N across all major ocean basins at high spatial resolution. Version 2 also includes 965 measurements of oxygen demand for organic carbon respiration. This new dataset can help validate and calibrate the next generation of global ocean biogeochemical models with flexible elemental stoichiometry. We expect that incorporating variable C:N:P:−O2 into models will help improve our estimates of key ocean biogeochemical fluxes such as carbon export, nitrogen fixation, and organic matter remineralization.
-
Gralnick, Jeffrey A. (Ed.)
ABSTRACT Reconstructing microbial genomes from metagenomic short-read data can be challenging due to the unknown and uneven complexity of microbial communities. This complexity encompasses highly diverse populations, which often include strain variants. Reconstructing high-quality genomes is a crucial part of the metagenomic workflow, as subsequent ecological and metabolic inferences depend on their accuracy, quality, and completeness. In contrast to microbial communities in other ecosystems, there has been no systematic assessment of genome-centric metagenomic workflows for drinking water microbiomes. In this study, we assessed the performance of a combination of assembly and binning strategies for time series drinking water metagenomes that were collected over 6 months. The goal of this study was to identify the combination of assembly and binning approaches that results in high-quality and -quantity metagenome-assembled genomes (MAGs), representing most of the sequenced metagenome. Our findings suggest that the metaSPAdes coassembly strategies had the best performance, as they resulted in larger and less fragmented assemblies, with at least 85% of the sequence data mapping to contigs greater than 1 kbp. Furthermore, a combination of metaSPAdes coassembly strategies and MetaBAT2 produced the highest number of medium-quality MAGs while capturing at least 70% of the metagenomes based on read recruitment. Utilizing different assembly/binning approaches also assists in the reconstruction of unique MAGs from closely related species that would otherwise have collapsed into a single MAG under a single workflow. Overall, our study suggests that leveraging multiple binning approaches with different metaSPAdes coassembly strategies may be required to maximize the recovery of good-quality MAGs.
IMPORTANCE Drinking water contains phylogenetically diverse groups of bacteria, archaea, and eukarya that affect the esthetic quality of water, water infrastructure, and public health. Taxonomic, metabolic, and ecological inferences of the drinking water microbiome depend on the accuracy, quality, and completeness of genomes that are reconstructed through the application of genome-resolved metagenomics. Using time series metagenomic data, we present reproducible genome-centric metagenomic workflows that result in high-quality and -quantity genomes, which more accurately represent the sequenced drinking water microbiome. These genome-centric metagenomic workflows will allow for improved taxonomic and functional potential analysis that offers enhanced insights into the stability and dynamics of drinking water microbial communities.