Abstract. Marine dinitrogen (N2) fixation is a globally significant biogeochemical process carried out by a specialized group of prokaryotes (diazotrophs), yet our understanding of their ecology is constantly evolving. Although marine N2 fixation is often ascribed to cyanobacterial diazotrophs, indirect evidence suggests that non-cyanobacterial diazotrophs (NCDs) might also be important. One widely used approach for understanding diazotroph diversity and biogeography is polymerase chain reaction (PCR) amplification of a portion of the nifH gene, which encodes a structural component of the N2-fixing enzyme complex, nitrogenase. An array of bioinformatic tools exists to process nifH amplicon data; however, the lack of standardized practices has hindered cross-study comparisons. This has led to a missed opportunity to more thoroughly assess diazotroph diversity and biogeography, as well as their potential contributions to the marine N cycle. To address these knowledge gaps, a bioinformatic workflow was designed that standardizes the processing of nifH amplicon datasets originating from high-throughput sequencing (HTS). Multiple datasets are efficiently and consistently processed with a specialized DADA2 pipeline to identify amplicon sequence variants (ASVs). A series of customizable post-pipeline stages then detect and discard spurious nifH sequences and annotate the subsequent quality-filtered nifH ASVs using multiple reference databases and classification approaches. This newly developed workflow was used to reprocess nearly all publicly available nifH amplicon HTS datasets from marine studies and to generate a comprehensive nifH ASV database containing 9383 ASVs aggregated from 21 studies that represent the diazotrophic populations in the global ocean. For each sample, the database includes physical and chemical metadata obtained from the Simons Collaborative Marine Atlas Project (CMAP). 
Here we demonstrate the utility of this database for revealing global biogeographical patterns of prominent diazotroph groups and highlight the influence of sea surface temperature. The workflow and nifH ASV database provide a robust framework for studying marine N2 fixation and diazotrophic diversity captured by nifH amplicon HTS. Future datasets that target understudied ocean regions can be added easily, and users can tune parameters and studies included for their specific focus. The workflow and database are available, respectively, on GitHub (https://github.com/jdmagasin/nifH-ASV-workflow, last access: 21 January 2025; Morando et al., 2024c) and Figshare (https://doi.org/10.6084/m9.figshare.23795943.v2; Morando et al., 2024b).
NifH gene amplicon sequencing and metagenomic approaches are complementary in assessing diazotroph diversity
Abstract Exploring the diversity of diazotrophs is key to understanding their role in supplying fixed nitrogen that supports marine productivity. A nested PCR assay using the universal primer set nifH1-nifH4, which targets the nitrogenase (nifH) gene, is a widely used approach for studying marine diazotrophs by amplicon sequencing. Metagenomics, the direct sequencing of DNA without PCR, has provided complementary views of the diversity of marine diazotrophs. A significant fraction of the metagenome-derived nifH sequences (e.g. Planctomycete- and Proteobacteria-affiliated) were reported to have nucleotide mismatches with the nifH1-nifH4 primers, leading to the suggestion that nifH amplicon sequencing does not detect specific diazotrophic taxa and underrepresents diazotroph diversity. Here, we report that these mismatches are mostly located at a single base at the 5′-end of the nifH4 primer, which does not impact detection of the nifH genes. This is demonstrated by the presence of nifH genes that contain the nucleotide mismatches in a recent compilation of global ocean nifH amplicon datasets, with high relative abundances detected in a variety of samples. While the metagenome- and metatranscriptome-derived nifH genes accounted for only 4.4% of the total amplicon sequence variants in the global ocean nifH amplicon database, the corresponding amplicon sequence variants can have high relative abundances (accounting for 47% of the reads in the database). These analyses underscore that nifH amplicon sequencing using the nifH1-nifH4 primers is an important tool for studying the diversity of marine diazotrophs, particularly as a complement to metagenomics, which can provide taxonomic and metabolic information for some dominant groups.
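The kind of check described above, locating where a target sequence disagrees with a degenerate primer, can be sketched as follows. This is a minimal illustration and not the authors' analysis: the primer and binding sites are hypothetical toy sequences (not the real nifH4 primer or real nifH genes), the function names are invented here, and each site is assumed to be written in the same 5′→3′ orientation as the primer.

```python
# IUPAC nucleotide codes mapped to the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(primer, site):
    """Return 0-based primer positions (5'->3') where the site base is not allowed."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return [
        i
        for i, (p, b) in enumerate(zip(primer.upper(), site.upper()))
        if b not in IUPAC[p]
    ]


def only_five_prime_mismatch(primer, site):
    """True when the sole mismatch sits at the primer's 5'-terminal base."""
    return mismatch_positions(primer, site) == [0]


# Hypothetical toy primer and binding sites, for illustration only.
TOY_PRIMER = "TTYGGNAAR"
TOY_SITES = {
    "perfect": "TTCGGAAAG",
    "five_prime_mismatch": "ATCGGAAAG",
    "three_prime_mismatch": "TTCGGAAAC",
}

for name, site in TOY_SITES.items():
    print(name, mismatch_positions(TOY_PRIMER, site),
          only_five_prime_mismatch(TOY_PRIMER, site))
```

Reporting positions rather than a bare mismatch count keeps the distinction the abstract relies on: a single mismatch at the 5′-end is treated differently from one near the 3′-end, where primer extension begins.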
- Award ID(s): 2023498
- PAR ID: 10580986
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: ISME Communications
- Volume: 5
- Issue: 1
- ISSN: 2730-6151
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Gilbert, Jack A. (Ed.) ABSTRACT Small subunit rRNA (SSU rRNA) amplicon sequencing can quantitatively and comprehensively profile natural microbiomes, representing a critically important tool for studying diverse global ecosystems. However, results will only be accurate if PCR primers perfectly match the rRNA of all organisms present. To evaluate how well marine microorganisms across all 3 domains are detected by this method, we compared commonly used primers with >300 million rRNA gene sequences retrieved from globally distributed marine metagenomes. The best-performing primers compared to 16S rRNA of bacteria and archaea were 515Y/926R and 515Y/806RB, which perfectly matched over 96% of all sequences. Considering cyanobacterial and chloroplast 16S rRNA, 515Y/926R had the highest coverage (99%), making this set ideal for quantifying marine primary producers. For eukaryotic 18S rRNA sequences, 515Y/926R also performed best (88%), followed by V4R/V4RB (18S rRNA specific; 82%), demonstrating that the 515Y/926R combination performs best overall for all 3 domains. Using Atlantic and Pacific Ocean samples, we demonstrate high correspondence between 515Y/926R amplicon abundances (generated for this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating amplicons can produce equally accurate community composition data compared with shotgun metagenomics. Our analysis also revealed that expected performance of all primer sets could be improved with minor modifications, pointing toward a nearly completely universal primer set that could accurately quantify biogeochemically important taxa in ecosystems ranging from the deep sea to the surface. In addition, our reproducible bioinformatic workflow can guide microbiome researchers studying different ecosystems or human health to similarly improve existing primers and generate more accurate quantitative amplicon data.
IMPORTANCE PCR amplification and sequencing of marker genes is a low-cost technique for monitoring prokaryotic and eukaryotic microbial communities across space and time but will work optimally only if environmental organisms match PCR primer sequences exactly. In this study, we evaluated how well primers match globally distributed short-read oceanic metagenomes. Our results demonstrate that primer sets vary widely in performance, and that at least for marine systems, rRNA amplicon data from some primers lack significant biases compared to metagenomes. We also show that it is theoretically possible to create a nearly universal primer set for diverse saline environments by defining a specific mixture of a few dozen oligonucleotides, and present a software pipeline that can guide rational design of primers for any environment with available meta’omic data.
-
Abstract Biological dinitrogen (N2) fixation supplies nitrogen to the oceans, supporting primary productivity, and is carried out by some bacteria and archaea referred to as diazotrophs. Cyanobacteria are conventionally considered to be the major contributors to marine N2 fixation, but non-cyanobacterial diazotrophs (NCDs) have been shown to be distributed throughout ocean ecosystems. However, the biogeochemical significance of marine NCDs has not been demonstrated. This review synthesizes multiple datasets, drawing from cultivation-independent molecular techniques and data from extensive oceanic expeditions, to provide a comprehensive view into the diversity, biogeography, ecophysiology, and activity of marine NCDs. An NCD nifH gene catalog was compiled containing sequences from both PCR-based and PCR-free methods, identifying taxa for future studies. NCD abundances from a novel database of NCD nifH-based abundances were colocalized with environmental data, unveiling distinct distributions and environmental drivers of individual taxa. Mechanisms that NCDs may use to fuel and regulate N2 fixation in response to oxygen and fixed nitrogen availability are discussed, based on a metabolic analysis of recently available Tara Oceans expedition data. The integration of multiple datasets provides a new perspective that enhances understanding of the biology, ecology, and biogeography of marine NCDs and provides tools and directions for future research.
-
Summary Universal primers for SSU rRNA genes allow profiling of natural communities by simultaneously amplifying templates from Bacteria, Archaea, and Eukaryota in a single PCR reaction. Despite the potential to show relative abundance for all rRNA genes, universal primers are rarely used, due to various concerns including amplicon length variation and its effect on bioinformatic pipelines. We thus developed 16S and 18S rRNA mock communities and a bioinformatic pipeline to validate this approach. Using these mocks, we show that universal primers (515Y/926R) outperformed eukaryote‐specific V4 primers in observed versus expected abundance correlations (slope = 0.88 vs. 0.67–0.79), and mock community members with single mismatches to the primer were strongly underestimated (threefold to eightfold). Using field samples, both primers yielded similar 18S beta‐diversity patterns (Mantel test, p < 0.001) but differences in relative proportions of many rarer taxa. To test for length biases, we mixed mock communities (16S + 18S) before PCR and found a twofold underestimation of 18S sequences due to sequencing bias. Correcting for the twofold underestimation, we estimate that, in Southern California field samples (1.2–80 μm), there were averages of 35% 18S, 28% chloroplast 16S, and 37% prokaryote 16S rRNA genes. These data demonstrate the potential for universal primers to generate comprehensive microbiome profiles.
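The arithmetic behind a correction of this kind can be sketched as below. The summary does not spell out the exact procedure, so this assumes the simplest reading: multiply the underestimated category's read count by the bias factor, then renormalize to proportions. The function name and the read counts are hypothetical and are not data from the study.

```python
def correct_underestimation(counts, category, factor):
    """Scale one category's count by `factor`, then return renormalized proportions."""
    if category not in counts:
        raise KeyError(f"unknown category: {category}")
    adjusted = dict(counts)
    adjusted[category] = counts[category] * factor
    total = sum(adjusted.values())
    if total <= 0:
        raise ValueError("total count must be positive")
    return {name: value / total for name, value in adjusted.items()}


# Hypothetical read counts, for illustration only.
observed = {"18S": 200, "chloroplast_16S": 300, "prokaryote_16S": 500}

# Assumed twofold underestimation of 18S reads.
corrected = correct_underestimation(observed, "18S", 2.0)

for name, share in corrected.items():
    print(f"{name}: {share:.1%}")
```

Because the proportions are renormalized, raising the 18S share necessarily lowers the shares of the other categories, even though their raw counts are unchanged.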
-
Abstract The photosynthetic cyanobacterium Trichodesmium is widely distributed in the low-latitude surface ocean, where it contributes significantly to N2 fixation and primary productivity. Previous studies found nifH genes and intact Trichodesmium colonies in the sunlight-deprived meso- and bathypelagic layers of the ocean (200–4000 m depth). Yet, the ability of Trichodesmium to fix N2 in the dark ocean has not been explored. We performed 15N2 incubations in sediment traps at 170, 270 and 1000 m at two locations in the South Pacific. Sinking Trichodesmium colonies fixed N2 at rates similar to those previously observed in the surface ocean (36–214 fmol N cell−1 d−1). This activity accounted for 40 ± 28% of the bulk N2 fixation rates measured in the traps, indicating that other diazotrophs were also active in the mesopelagic zone. Accordingly, cDNA nifH amplicon sequencing revealed that while Trichodesmium accounted for most of the expressed nifH genes in the traps, other diazotrophs such as Chlorobium and Deltaproteobacteria were also active. Laboratory experiments simulating mesopelagic conditions confirmed that increasing hydrostatic pressure and decreasing temperature reduced but did not completely inhibit N2 fixation in Trichodesmium. Finally, using a cell metabolism model we predict that Trichodesmium uses photosynthesis-derived stored carbon to sustain N2 fixation while sinking into the mesopelagic. We conclude that sinking Trichodesmium provides ammonium, dissolved organic matter and biomass to mesopelagic prokaryotes.
