
Title: Evaluating and Improving Small Subunit rRNA PCR Primer Coverage for Bacteria, Archaea, and Eukaryotes Using Metagenomes from Global Ocean Surveys
ABSTRACT Small subunit rRNA (SSU rRNA) amplicon sequencing can quantitatively and comprehensively profile natural microbiomes, representing a critically important tool for studying diverse global ecosystems. However, results will only be accurate if PCR primers perfectly match the rRNA of all organisms present. To evaluate how well marine microorganisms across all 3 domains are detected by this method, we compared commonly used primers with >300 million rRNA gene sequences retrieved from globally distributed marine metagenomes. The best-performing primers compared to 16S rRNA of bacteria and archaea were 515Y/926R and 515Y/806RB, which perfectly matched over 96% of all sequences. Considering cyanobacterial and chloroplast 16S rRNA, 515Y/926R had the highest coverage (99%), making this set ideal for quantifying marine primary producers. For eukaryotic 18S rRNA sequences, 515Y/926R also performed best (88%), followed by V4R/V4RB (18S rRNA specific; 82%), demonstrating that the 515Y/926R combination performs best overall for all 3 domains. Using Atlantic and Pacific Ocean samples, we demonstrate high correspondence between 515Y/926R amplicon abundances (generated for this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating amplicons can produce equally accurate community composition data compared with shotgun metagenomics. Our analysis also revealed that expected performance of all primer sets could be improved with minor modifications, pointing toward a nearly completely universal primer set that could accurately quantify biogeochemically important taxa in ecosystems ranging from the deep sea to the surface. In addition, our reproducible bioinformatic workflow can guide microbiome researchers studying different ecosystems or human health to similarly improve existing primers and generate more accurate quantitative amplicon data.
IMPORTANCE PCR amplification and sequencing of marker genes is a low-cost technique for monitoring prokaryotic and eukaryotic microbial communities across space and time but will work optimally only if environmental organisms match PCR primer sequences exactly. In this study, we evaluated how well primers match globally distributed short-read oceanic metagenomes. Our results demonstrate that primer sets vary widely in performance, and that at least for marine systems, rRNA amplicon data from some primers lack significant biases compared to metagenomes. We also show that it is theoretically possible to create a nearly universal primer set for diverse saline environments by defining a specific mixture of a few dozen oligonucleotides, and present a software pipeline that can guide rational design of primers for any environment with available meta’omic data.
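The abstract's central measure is the fraction of rRNA gene sequences that a primer matches perfectly. The sketch below is a minimal illustration of that idea, not the authors' pipeline: it expands IUPAC degenerate bases and counts perfect matches against primer-length target regions. The function names and the four toy target regions are hypothetical, and the 515Y sequence shown is the commonly published one, which should be verified against the primary literature before use.

```python
# Minimal sketch of "perfect-match primer coverage" (illustrative only).
# Assumption: each target is already trimmed to the primer-binding region
# and written in the same orientation as the primer.

# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def perfect_match(primer, target):
    """Return True if every target base is allowed by the primer position."""
    if len(primer) != len(target):
        return False
    return all(base in IUPAC[code] for code, base in zip(primer, target))


def coverage(primer, regions):
    """Fraction of target regions that the primer matches perfectly."""
    if not regions:
        return 0.0
    hits = sum(perfect_match(primer, region) for region in regions)
    return hits / len(regions)


# Commonly published 515Y sequence (assumption: verify before use).
PRIMER_515Y = "GTGYCAGCMGCCGCGGTAA"

# Hypothetical toy regions, not real metagenomic data.
regions = [
    "GTGCCAGCAGCCGCGGTAA",  # Y=C, M=A: perfect match
    "GTGTCAGCCGCCGCGGTAA",  # Y=T, M=C: perfect match
    "GTGCCAGCAGCCGCGGTAG",  # mismatch at the 3' end
    "GTGACAGCAGCCGCGGTAA",  # A at the Y position: mismatch
]

print(coverage(PRIMER_515Y, regions))  # 0.5
```

In a real analysis the target regions would first have to be located and extracted from metagenomic reads, a step this sketch leaves out entirely.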
Editors:
Gilbert, Jack A.
Award ID(s):
1737409
NSF-PAR ID:
10277548
Journal Name:
mSystems
Volume:
6
Issue:
3
ISSN:
2379-5077
Sponsoring Org:
National Science Foundation
More Like this
  1. Community dynamics are central in microbial ecology, yet we lack studies comparing diversity patterns among marine protists and prokaryotes over depth and multiple years. Here, we characterized microbes at the San-Pedro Ocean Time series (2005–2018), using SSU rRNA gene sequencing from two size fractions (0.2–1 and 1–80 μm), with a universal primer set that amplifies from both prokaryotes and eukaryotes, allowing direct comparisons of diversity patterns in a single set of analyses. The 16S + 18S rRNA gene composition in the small size fraction was mostly prokaryotic (>92%) as expected, but the large size fraction unexpectedly contained 46–93% prokaryotic 16S rRNA genes. Prokaryotes and protists showed opposite vertical diversity patterns; prokaryotic diversity peaked at mid-depth, protistan diversity at the surface. Temporal beta-diversity patterns indicated prokaryote communities were much more stable than protists. Although the prokaryotic communities changed monthly, the average community stayed remarkably steady over 14 years, showing high resilience. Additionally, particle-associated prokaryotes were more diverse than smaller free-living ones, especially at deeper depths, contributed unexpectedly by abundant and diverse SAR11 clade II. Eukaryotic diversity was strongly correlated with the diversity of particle-associated prokaryotes but not free-living ones, reflecting that physical associations result in the strongest interactions, including symbioses, parasitism, and decomposer relationships.
  2. High-throughput amplicon sequencing that primarily targets the 16S ribosomal DNA (rDNA) (for bacteria and archaea) and the Internal Transcribed Spacer rDNA (for fungi) has facilitated microbial community discovery across diverse environments. A three-step PCR that utilizes flexible primer choices to construct the library for Illumina amplicon sequencing has been applied to several studies in forest and agricultural systems. The three-step PCR protocol, while producing high-quality reads, often yields a large number (up to 46%) of reads that are unable to be assigned to a specific sample according to its barcode. Here, we improve this technique through an optimized two-step PCR protocol. We tested and compared the improved two-step PCR meta-barcoding protocol against the three-step PCR protocol using four different primer pairs (fungal ITS: ITS1F-ITS2 and ITS1F-ITS4, and bacterial 16S: 515F-806R and 341F-806R). We demonstrate that the sequence quantity and recovery rate were significantly improved with the two-step PCR approach (fourfold more read counts per sample; determined reads ≈90% per run) while retaining high read quality (Q30 > 80%). Given that synthetic barcodes are incorporated independently from any specific primers, this two-step PCR protocol can be broadly adapted to different genomic regions and organisms of scientific interest.
  3. Planktonic microbial communities mediate many vital biogeochemical processes in wetland ecosystems, yet compared to other aquatic ecosystems, like oceans, lakes, rivers or estuaries, they remain relatively underexplored. Our study site, the Florida Everglades (USA), a vast iconic wetland consisting of a slow-moving system of shallow rivers connecting freshwater marshes with coastal mangrove forests and seagrass meadows, is a highly threatened model ecosystem for studying salinity and nutrient gradients, as well as the effects of sea level rise and saltwater intrusion. This study provides the first high-resolution phylogenetic profiles of planktonic bacterial and eukaryotic microbial communities (using 16S and 18S rRNA gene amplicons) together with nutrient concentrations and environmental parameters at 14 sites along two transects covering two distinctly different drainages: the peat-based Shark River Slough (SRS) and marl-based Taylor Slough/Panhandle (TS/Ph). Both bacterial and eukaryotic community structures varied significantly along the salinity gradient. Although freshwater communities were relatively similar in both transects, bacterioplankton community composition at the ecotone (where freshwater and marine water mix) differed significantly. The most abundant taxa in the freshwater marshes include heterotrophic Polynucleobacter sp. and potentially phagotrophic cryptomonads of the genus Chilomonas, both of which could be key players in the transfer of detritus-based biomass to higher trophic levels.
  4. The microbiomes of tropical corals are actively studied using 16S rRNA gene amplicons to understand microbial roles in coral health, metabolism, and disease resistance. However, due to the prokaryotic origins of mitochondria, primers targeting bacterial and archaeal 16S rRNA genes may also amplify homologous 12S mitochondrial rRNA genes from the host coral, associated microbial eukaryotes, and encrusting organisms. Standard microbial bioinformatics pipelines attempt to identify and remove these sequences by comparing them to reference taxonomies. However, commonly used tools have severely under-annotated mitochondrial sequences in 1440 coral microbiomes from the Global Coral Microbiome Project, preventing annotation of over 95% of reads in some samples. This issue persists when using Greengenes or SILVA prokaryotic reference taxonomies, and in other hosts, including 16S studies of vertebrates, and of marine sponges. Worse, mitochondrial under-annotation varies between coral families and across coral compartments, biasing comparisons of α- and β-diversity. By supplementing existing reference taxonomies with over 3000 animal mitochondrial rRNA gene sequences, we resolved roughly 97% of unique unclassified sequences as mitochondrial. These additional sequences did not cause a false elevation in mitochondrial annotations in mock communities with known compositions. We recommend using these extended taxonomies for coral microbiome analysis and whenever eukaryotic contamination may be a concern.
  5. An inherent issue in high-throughput rRNA gene tag sequencing microbiome surveys is that they provide compositional data in relative abundances. This often leads to spurious correlations, making the interpretation of relationships to biogeochemical rates challenging. To overcome this issue, we quantitatively estimated the abundance of microorganisms by spiking in known amounts of internal DNA standards. Using a 3-year sample set of diverse microbial communities from the Western Antarctica Peninsula, we demonstrated that the internal standard method yielded community profiles and taxon cooccurrence patterns substantially different from those derived using relative abundances. We found that the method provided results consistent with the traditional CHEMTAX analysis of pigments and total bacterial counts by flow cytometry. Using the internal standard method, we also showed that chloroplast 16S rRNA gene data in microbial surveys can be used to estimate abundances of certain eukaryotic phototrophs such as cryptophytes and diatoms. In Phaeocystis, scatter in the 16S/18S rRNA gene ratio may be explained by physiological adaptation to environmental conditions. We conclude that the internal standard method, when applied to rRNA gene microbial community profiling, is quantitative and that its application will substantially improve our understanding of microbial ecosystems.
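
The internal standard approach in item 5 can be illustrated with a small calculation. This is a sketch of the general spike-in idea only, not the cited study's exact procedure: the function name, taxon labels, and all numbers below are hypothetical. It assumes the standard and the sample genes are recovered and amplified with equal efficiency, so that reads scale with gene copies.

```python
# Minimal sketch of spike-in (internal standard) quantification.
# Assumption: reads are proportional to gene copies for both the
# standard and the sample taxa. All values are made up for illustration.

def absolute_abundance(taxon_reads, standard_reads,
                       standard_copies_added, volume_filtered_l):
    """Convert per-taxon read counts to gene copies per liter.

    taxon_reads: dict mapping taxon name to its read count
    standard_reads: reads assigned to the spiked-in standard
    standard_copies_added: gene copies of the standard added to the sample
    volume_filtered_l: volume of water filtered, in liters
    """
    if standard_reads <= 0:
        raise ValueError("no reads recovered from the internal standard")
    if volume_filtered_l <= 0:
        raise ValueError("filtered volume must be positive")
    # Each read represents this many gene copies in the original sample.
    copies_per_read = standard_copies_added / standard_reads
    return {
        taxon: reads * copies_per_read / volume_filtered_l
        for taxon, reads in taxon_reads.items()
    }


# Hypothetical sample: 500 standard reads from 1e6 copies added, 2 L filtered.
counts = {"SAR11": 5000, "diatom_chloroplast": 250}
result = absolute_abundance(counts, standard_reads=500,
                            standard_copies_added=1e6,
                            volume_filtered_l=2.0)
print(result)  # {'SAR11': 5000000.0, 'diatom_chloroplast': 250000.0}
```

Unlike relative abundances, these values do not change for one taxon merely because another taxon became more or less abundant, which is the source of the spurious correlations the abstract mentions.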