Title: Evaluating and Improving Small Subunit rRNA PCR Primer Coverage for Bacteria, Archaea, and Eukaryotes Using Metagenomes from Global Ocean Surveys
ABSTRACT Small subunit rRNA (SSU rRNA) amplicon sequencing can quantitatively and comprehensively profile natural microbiomes, representing a critically important tool for studying diverse global ecosystems. However, results will only be accurate if PCR primers perfectly match the rRNA of all organisms present. To evaluate how well marine microorganisms across all 3 domains are detected by this method, we compared commonly used primers with >300 million rRNA gene sequences retrieved from globally distributed marine metagenomes. The best-performing primers compared to 16S rRNA of bacteria and archaea were 515Y/926R and 515Y/806RB, which perfectly matched over 96% of all sequences. Considering cyanobacterial and chloroplast 16S rRNA, 515Y/926R had the highest coverage (99%), making this set ideal for quantifying marine primary producers. For eukaryotic 18S rRNA sequences, 515Y/926R also performed best (88%), followed by V4R/V4RB (18S rRNA specific; 82%), demonstrating that the 515Y/926R combination performs best overall for all 3 domains. Using Atlantic and Pacific Ocean samples, we demonstrate high correspondence between 515Y/926R amplicon abundances (generated for this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating amplicons can produce equally accurate community composition data compared with shotgun metagenomics. Our analysis also revealed that expected performance of all primer sets could be improved with minor modifications, pointing toward a nearly completely universal primer set that could accurately quantify biogeochemically important taxa in ecosystems ranging from the deep sea to the surface. In addition, our reproducible bioinformatic workflow can guide microbiome researchers studying different ecosystems or human health to similarly improve existing primers and generate more accurate quantitative amplicon data.
IMPORTANCE PCR amplification and sequencing of marker genes is a low-cost technique for monitoring prokaryotic and eukaryotic microbial communities across space and time but will work optimally only if environmental organisms match PCR primer sequences exactly. In this study, we evaluated how well primers match globally distributed short-read oceanic metagenomes. Our results demonstrate that primer sets vary widely in performance, and that at least for marine systems, rRNA amplicon data from some primers lack significant biases compared to metagenomes. We also show that it is theoretically possible to create a nearly universal primer set for diverse saline environments by defining a specific mixture of a few dozen oligonucleotides, and present a software pipeline that can guide rational design of primers for any environment with available meta’omic data.
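The notion of a primer "perfectly matching" a set of rRNA sequences can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which is not described in this record; it only assumes that a degenerate primer is written with standard IUPAC nucleotide codes and that a read counts as covered when some window of it matches every primer position. The primer `toy_primer`, the reads in `toy_reads`, and the function names are hypothetical and chosen purely for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))

def perfect_match_fraction(primer, sequences):
    """Fraction of sequences containing a zero-mismatch primer site."""
    if not sequences:
        return 0.0
    pattern = primer_regex(primer)
    hits = sum(1 for seq in sequences if pattern.search(seq.upper()))
    return hits / len(sequences)

# Hypothetical toy data (not real primers or real reads).
toy_primer = "GTGYCAGCMG"
toy_reads = [
    "AAGTGCCAGCAGTT",  # Y=C, M=A: matches
    "AAGTGTCAGCCGTT",  # Y=T, M=C: matches
    "AAGTGACAGCAGTT",  # A at the Y position: one mismatch
    "GTGCCAGCGG",      # G at the M position: one mismatch
]
coverage = perfect_match_fraction(toy_primer, toy_reads)
print(f"perfect-match coverage: {coverage:.0%}")
```

A reverse primer would need to be reverse-complemented before searching reads in the forward orientation, and short metagenomic reads that only partly overlap the primer site would need separate handling; both are omitted here for brevity.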
Award ID(s):
1737409
NSF-PAR ID:
10277548
Author(s) / Creator(s):
; ; ;
Editor(s):
Gilbert, Jack A.
Date Published:
Journal Name:
mSystems
Volume:
6
Issue:
3
ISSN:
2379-5077
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Universal primers for SSU rRNA genes allow profiling of natural communities by simultaneously amplifying templates from Bacteria, Archaea, and Eukaryota in a single PCR reaction. Despite the potential to show relative abundance for all rRNA genes, universal primers are rarely used, due to various concerns including amplicon length variation and its effect on bioinformatic pipelines. We thus developed 16S and 18S rRNA mock communities and a bioinformatic pipeline to validate this approach. Using these mocks, we show that universal primers (515Y/926R) outperformed eukaryote‐specific V4 primers in observed versus expected abundance correlations (slope = 0.88 vs. 0.67–0.79), and mock community members with single mismatches to the primer were strongly underestimated (threefold to eightfold). Using field samples, both primers yielded similar 18S beta‐diversity patterns (Mantel test, p < 0.001) but differences in relative proportions of many rarer taxa. To test for length biases, we mixed mock communities (16S + 18S) before PCR and found a twofold underestimation of 18S sequences due to sequencing bias. Correcting for the twofold underestimation, we estimate that, in Southern California field samples (1.2–80 μm), there were averages of 35% 18S, 28% chloroplast 16S, and 37% prokaryote 16S rRNA genes. These data demonstrate the potential for universal primers to generate comprehensive microbiome profiles.

     
  2. Abstract Background The 16S mitochondrial rRNA gene is the most widely sequenced molecular marker in amphibian systematic studies, making it comparable to the universal CO1 barcode that is more commonly used in other animal groups. However, studies employ different primer combinations that target different lengths/regions of the 16S gene ranging from complete gene sequences (~1500 bp) to short fragments (~500 bp), the latter of which is the most ubiquitously used. Sequences of different lengths are often concatenated, compared, and/or jointly analyzed to infer phylogenetic relationships, estimate genetic divergence (p-distances), and justify the recognition of new species (species delimitation), making the 16S gene region, by far, the most influential molecular marker in amphibian systematics. Despite their ubiquitous and multifarious use, no studies have ever been conducted to evaluate the congruence and performance among the different fragment lengths. Results Using empirical data derived from both Sanger-based and genomic approaches, we show that full-length 16S sequences recover the most accurate phylogenetic relationships, highest branch support, lowest variation in genetic distances (pairwise p-distances), and best-scoring species delimitation partitions. In contrast, widely used short fragments produce inaccurate phylogenetic reconstructions, lower and more variable branch support, erratic genetic distances, and low-scoring species delimitation partitions, the numbers of which are vastly overestimated. The relatively poor performance of short 16S fragments is likely due to insufficient phylogenetic information content.
Conclusions Taken together, our results demonstrate that short 16S fragments are unable to match the efficacy achieved by full-length sequences in terms of topological accuracy, heuristic branch support, genetic divergences, and species delimitation partitions, and thus, phylogenetic and taxonomic inferences that are predicated on short 16S fragments should be interpreted with caution. However, short 16S fragments can still be useful for species identification, rapid assessments, or definitively coupling complex life stages in natural history studies and faunal inventories. While the full 16S sequence performs best, it requires the use of several primer pairs, which increases cost, time, and effort. As a compromise, our results demonstrate that practitioners should utilize medium-length primers in preference to the short-fragment primers because they have the potential to markedly improve phylogenetic inference and species delimitation without additional cost.
  3. Tropical environments with unique abiotic and biotic factors—such as salt ponds, mangroves, and coral reefs—are often in close proximity. The heterogeneity of these environments is reflected in community shifts over short distances, resulting in high biodiversity. While phytoplankton assemblages physically associated with corals, particularly their symbionts, are well studied, less is known about phytoplankton diversity across tropical aquatic environments. We assess shifts in phytoplankton community composition along inshore to offshore gradients by sequencing and analyzing 16S rRNA gene amplicons using primers targeting the V1-V2 region that capture plastids from eukaryotic phytoplankton and cyanobacteria, as well as heterotrophic bacteria. Microbial alpha diversity computed from 16S V1-V2 amplicon sequence variant (ASV) data from 282 samples collected in and around Curaçao, in the Southern Caribbean Sea, varied more within the dynamic salt ponds, salterns, and mangroves, compared to the seemingly stable above-reef, off-reef, and open sea environments. Among eukaryotic phytoplankton, stramenopiles often exhibited the highest relative abundances in mangrove, above-reef, off-reef, and open sea environments, where cyanobacteria also showed high relative abundances. Within stramenopiles, diatom amplicons dominated in salt ponds and mangroves, while dictyochophytes and pelagophytes prevailed above reefs and offshore. Green algae and cryptophytes were also present, and the former exhibited transitions following the gradient from inland to offshore. Chlorophytes and prasinophyte Class IV dominated in salt ponds, while prasinophyte Class II, including Micromonas commoda and Ostreococcus Clade OII, had the highest relative abundances of green algae in mangroves, above-reef, off-reef, and the open sea. 
To improve Class II prasinophyte classification, we sequenced 18S rRNA gene amplicons from the V4 region in 41 samples which were used to interrelate plastid-based results with information on uncultured prasinophyte species from prior 18S rRNA gene-based studies. This highlighted the presence of newly described Ostreococcus bengalensis and two Micromonas candidate species. Network analyses identified co-occurrence patterns between individual phytoplankton groups, including cyanobacteria, and heterotrophic bacteria. Our study reveals multiple uncultured and novel lineages within green algae and dictyochophytes in tropical marine habitats. Collectively, the algal diversity patterns and potential co-occurrence relationships observed in connection to physicochemical and spatial influences help provide a baseline against which future change can be assessed. 
  4. Abstract

    Gelatinous zooplankton play a crucial role in marine planktonic food webs. However, primarily due to methodological challenges, the in situ diet of zooplankton remains poorly investigated and little is known about their trophic interactions including feeding behaviour, prey selection and in situ feeding rates. This is particularly true for gelatinous zooplankton including the marine pelagic tunicate, Dolioletta gegenbauri. In this study, we applied an 18S rRNA amplicon metabarcoding approach to identify the diet of captive‐fed and wild‐caught D. gegenbauri on the midcontinental shelf of the South Atlantic Bight, USA. Sequencing‐based approaches were complemented with targeted quantitative real‐time polymerase chain reaction (PCR) analyses. Captive‐fed D. gegenbauri gut content was dominated by pico‐, nano‐ and micro‐plankton including pico‐dinoflagellates (picozoa) and diatoms. These results suggested that diatoms were concentrated by D. gegenbauri relative to their concentration in the water column. Analysis of wild‐caught doliolids by quantitative real‐time PCR utilizing a group‐specific diatom primer set confirmed that diatoms were concentrated by D. gegenbauri, particularly by the gonozooid life stage associated with actively developing blooms. Sequences derived from larger metazoans were frequently observed in wild‐caught animals but not in captive‐fed animals suggesting experimental bias associated with captive feeding. These studies revealed that the diet of D. gegenbauri is considerably more diverse than previously described, that parasites are common in wild populations, and that prey quality, quantity and parasites are likely all important factors in regulating doliolid population dynamics in continental shelf environments.

     
  5. Abstract

    Candidatus Poribacteria is a little-known bacterial phylum, previously characterized by partial genomes from a single sponge host, but never isolated in culture. We have reconstructed multiple genome sequences from four different sponge genera and compared them to recently reported, uncharacterized Poribacteria genomes from the open ocean, discovering shared and unique functional characteristics. Two distinct, habitat-linked taxonomic lineages were identified, designated Entoporibacteria (sponge-associated) and Pelagiporibacteria (free-living). These lineages differed in flagellar motility and chemotaxis genes unique to Pelagiporibacteria, and highly expanded families of restriction endonucleases, DNA methylases, transposases, CRISPR repeats, and toxin–antitoxin gene pairs in Entoporibacteria. Both lineages shared pathways for facultative anaerobic metabolism, denitrification, fermentation, organosulfur compound utilization, type IV pili, cellulosomes, and bacterial proteasomes. Unexpectedly, many features characteristic of eukaryotic host association were also shared, including genes encoding the synthesis of eukaryotic-like cell adhesion molecules, extracellular matrix digestive enzymes, phosphoinositol-linked membrane glycolipids, and exopolysaccharide capsules. Complete Poribacteria 16S rRNA gene sequences were found to contain multiple mismatches to “universal” 16S rRNA gene primer sets, substantiating concerns about potential amplification failures in previous studies. A newly designed primer set corrects these mismatches, enabling more accurate assessment of Poribacteria abundance in diverse marine habitats where it may have previously been overlooked.

     