Title: Ranking the biases: The choice of OTUs vs. ASVs in 16S rRNA amplicon data analysis has stronger effects on diversity measures than rarefaction and OTU identity threshold
Advances in the analysis of amplicon sequence datasets have introduced a methodological shift in how research teams investigate microbial biodiversity, away from sequence identity-based clustering (producing Operational Taxonomic Units, OTUs) to denoising methods (producing amplicon sequence variants, ASVs). While denoising methods have several inherent properties that make them desirable compared to clustering-based methods, questions remain as to the influence that these pipelines have on the ecological patterns being assessed, especially when compared to other methodological choices made when processing data (e.g. rarefaction) and computing diversity indices. We compared the respective influences of two widely used methods, namely DADA2 (a denoising method) vs. Mothur (a clustering method), on 16S rRNA gene amplicon datasets (hypervariable region V4), and compared these effects with those of rarefaction of the community table and of the OTU identity threshold (97% vs. 99%) on the ecological signals detected. We used a dataset comprising freshwater invertebrate (three Unionidae species) gut and environmental (sediment, seston) communities sampled in six rivers in the southeastern USA. We ranked the respective effects of each methodological choice on alpha and beta diversity, and on taxonomic composition. The choice of pipeline significantly influenced alpha and beta diversities and changed the ecological signal detected, especially for presence/absence indices such as the richness index and unweighted UniFrac. Interestingly, the discrepancy between OTU- and ASV-based diversity metrics could be attenuated by the use of rarefaction. The identification of major classes and genera also revealed significant discrepancies across pipelines. Compared to the pipeline's effect, OTU threshold and rarefaction had a minimal impact on all measurements.
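As an illustration of the quantities named in the abstract, the following is a minimal sketch, not the authors' pipeline (the study itself used DADA2 and Mothur). It assumes a sample is a plain list of read counts per OTU or ASV, and it shows rarefaction to a common depth, the richness index, the Shannon index, and a presence/absence dissimilarity. Unweighted UniFrac requires a phylogenetic tree, so the Jaccard distance is used here as a simpler presence/absence stand-in. All function names and the toy counts are hypothetical.

```python
import random
from math import log


def rarefy(counts, depth, seed=0):
    """Subsample a count vector to `depth` reads, without replacement."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand to one taxon label per read, then draw `depth` reads at random.
    reads = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    picked = random.Random(seed).sample(reads, depth)
    rarefied = [0] * len(counts)
    for taxon in picked:
        rarefied[taxon] += 1
    return rarefied


def richness(counts):
    """Number of taxa observed at least once (a presence/absence index)."""
    return sum(1 for n in counts if n > 0)


def shannon(counts):
    """Shannon index H' using the natural logarithm."""
    total = sum(counts)
    return -sum((n / total) * log(n / total) for n in counts if n > 0)


def jaccard_distance(a, b):
    """Presence/absence dissimilarity between two samples."""
    present_a = {i for i, n in enumerate(a) if n > 0}
    present_b = {i for i, n in enumerate(b) if n > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


# Hypothetical toy samples: same taxa order, very different sequencing depth.
deep = [500, 300, 150, 30, 10, 5, 3, 2]   # 1000 reads
shallow = [60, 25, 10, 5, 0, 0, 0, 0]     # 100 reads

# Rarefy both samples to the depth of the shallowest one before comparing.
depth = min(sum(deep), sum(shallow))
deep_rarefied = rarefy(deep, depth)
shallow_rarefied = rarefy(shallow, depth)

print("richness before:", richness(deep), richness(shallow))
print("richness after: ", richness(deep_rarefied), richness(shallow_rarefied))
print("Shannon after:  ", shannon(deep_rarefied), shannon(shallow_rarefied))
print("Jaccard after:  ", jaccard_distance(deep_rarefied, shallow_rarefied))
```

Because rare taxa are the first to disappear when reads are subsampled, presence/absence measures such as richness tend to respond more strongly to rarefaction than abundance-weighted ones, which is consistent with the abstract's observation that presence/absence indices were the most sensitive to methodological choices.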
Award ID(s):
1831531
NSF-PAR ID:
10346895
Editor(s):
Moreno-Hagelsieb, Gabriel
Journal Name:
PLOS ONE
Volume:
17
Issue:
2
ISSN:
1932-6203
Page Range / eLocation ID:
e0264443
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Fields, David (Ed.)
    Community-based diversity analyses, such as metabarcoding, are increasingly popular in the field of metazoan zooplankton community ecology. However, some methodological uncertainties remain, such as the potential inflation of diversity estimates resulting from contamination by pseudogene sequences. Furthermore, primer affinity to specific taxonomic groups might skew community composition and structure during PCR. In this study, we estimated OTU (operational taxonomic unit) richness, Shannon's H', and the phylum-level community composition of samples from a coastal zooplankton community using four approaches: complementary DNA (cDNA) and genomic DNA (gDNA) mitochondrial COI (cytochrome oxidase subunit I) gene amplicons, metatranscriptome sequencing, and morphological identification. The mismatch distribution indicated that 90% is a good threshold percentage for differentiating intra- from inter-species variation. Species/OTU richness estimates from the different methods were moderately correlated. Results strongly indicated that diversity inflation occurred in the samples amplified from gDNA because of mitochondrial pseudogene contamination (overall, gDNA amplicons produced twice the richness of cDNA amplicons). The unique community compositions observed in the PCR-based methods indicated that taxonomic amplification bias had occurred during the PCR. Therefore, it is recommended that PCR-free approaches be used whenever resolving community structure represents an essential aspect of the analysis.
  2. Introduction: Soil microbial communities, including biological soil crust microbiomes, play key roles in water, carbon and nitrogen cycling, biological weathering, and other nutrient-releasing processes of desert ecosystems. However, our knowledge of microbial distribution patterns and ecological drivers is still poor, especially for the Chihuahuan Desert. Methods: This project investigated the effects of trampling disturbance on surface soil microbiomes, explored community composition and structure, and related patterns to abiotic and biotic landscape characteristics within the Chihuahuan Desert biome. Composite soil samples were collected in disturbed and undisturbed areas of 15 long-term ecological research plots in the Jornada Basin, New Mexico. Microbial diversity of cross-domain microbial groups (total Bacteria, Cyanobacteria, Archaea, and Fungi) was obtained via DNA amplicon metabarcode sequencing. Sequence data were related to landscape characteristics including vegetation type, landforms, ecological site and state, as well as soil properties including gravel content, soil texture, pH, and electrical conductivity. Results: Filamentous Cyanobacteria dominated the photoautotrophic community, while Proteobacteria and Actinobacteria dominated among the heterotrophic bacteria. Thaumarchaeota were the most abundant Archaea, and drought-adapted taxa in Dothideomycetes and Agaricomycetes were the most abundant fungi in the soil surface microbiomes. Apart from richness within Archaea (p = 0.0124), disturbed samples did not differ from undisturbed samples with respect to alpha diversity and community composition (p ≥ 0.05), possibly due to a lack of frequent or impactful disturbance. Vegetation type and landform showed differences in richness of Bacteria, Archaea, and Cyanobacteria but not in Fungi. Richness lacked strong relationships with soil variables. Landscape features including parent material, vegetation type, landform type, and ecological sites and states exhibited stronger influence on relative abundances and microbial community composition than on alpha diversity, especially for Cyanobacteria and Fungi. Soil texture, moisture, pH, electrical conductivity, lichen cover, and perennial plant biomass correlated strongly with microbial community gradients detected in NMDS ordinations. Discussion: Our study provides the first comprehensive insights into the relationships between landscape characteristics, associated soil properties, and cross-domain soil microbiomes in the Chihuahuan Desert. Our findings will inform land management and restoration efforts and aid in the understanding of processes such as desertification and state transitioning, which represent urgent ecological and economic challenges in drylands around the world.
  3. 16S rRNA gene profiling (amplicon sequencing) is a popular technique for understanding host-associated and environmental microbial communities. Most protocols for sequencing amplicon libraries follow a standardized pipeline that can differ slightly depending on laboratory facility and user. Given that the same variable region of the 16S gene is targeted, it is generally accepted that sequencing output from differing protocols is comparable, and this assumption underlies our ability to identify universal patterns in microbial dynamics through meta-analyses. However, discrepant results from a combined 16S rRNA gene dataset prepared by two labs whose protocols differed only in DNA polymerase and sequencing platform led us to scrutinize the outputs and challenge the idea of confidently combining them for standard microbiome analysis. Using technical replicates of reef-building coral samples from two species, Montipora aequituberculata and Porites lobata, we evaluated the consistency of alpha and beta diversity metrics between data resulting from these highly similar protocols. While we found minimal variation in alpha diversity between platforms, significant differences were revealed with most beta diversity metrics, dependent on host species. These inconsistencies persisted following removal of low-abundance taxa and when comparing across higher taxonomic levels, suggesting that bacterial community differences associated with sequencing protocol are likely to be context dependent and difficult to correct without extensive validation work. The results of this study encourage caution in the statistical comparison and interpretation of studies that combine rRNA gene sequence data from distinct protocols and point to a need for further work identifying mechanistic causes of these observed differences.
  4. Firmicutes is an almost ubiquitous phylum. Several genera of this group, for instance Geobacillus, are recognized for decomposing plant organic matter and for producing thermostable ligninolytic enzymes. Amplicon sequencing was used in this study to determine the prevalence and genetic diversity of the Firmicutes in two distinctly related environmental samples: South Dakota Landfill Compost (SDLC, 60 °C) and Sanford Underground Research Facility sediments (SURF, 45 °C). Although distinct microbial community compositions were observed, Firmicutes dominated both the SDLC and SURF samples, followed by Proteobacteria. The abundant classes of bacteria in the SDLC site, within the phylum Firmicutes, were Bacilli (83.2%) and Clostridia (2.9%). In comparison, the sample from the SURF mine was dominated by Clostridia (45.8%), followed by Bacilli (20.1%). Within the class Bacilli, the SDLC sample had more diversity (a total of 11 genera with more than 1% operational taxonomic unit, OTU). On the other hand, SURF samples had just three genera, about 1% of the total population: Bacilli, Paenibacillus, and Solibacillus. Geobacillus specifically was present at a level of 0.07% in SURF and 2.5% in SDLC. Subsequently, culture isolation of endospore-forming Firmicutes members from these samples yielded a total of 117 isolates. According to colony morphologies, and identification based upon 16S rRNA and gyrB gene sequence analysis, we obtained 58 taxonomically distinct strains. Depending on the similarity indexes, a gyrB sequence comparison appeared more useful than 16S rRNA sequence analysis for inferring intra- and some intergeneric relationships between the isolates.
  5. Biodiversity is changing at an accelerating rate at both local and regional scales. Beta diversity, which quantifies species turnover between these two scales, is emerging as a key driver of ecosystem function that can inform spatial conservation. Yet measuring biodiversity remains a major challenge, especially in aquatic ecosystems. Decoding environmental DNA (eDNA) left behind by organisms offers the possibility of detecting species sans direct observation, a Rosetta Stone for biodiversity. While eDNA has proven useful to illuminate diversity in aquatic ecosystems, its utility for measuring beta diversity over spatial scales small enough to be relevant to conservation purposes is poorly known. Here we tested how eDNA performs relative to underwater visual census (UVC) to evaluate beta diversity of marine communities. We paired UVC with 12S eDNA metabarcoding and used a spatially structured hierarchical sampling design to assess key spatial metrics of fish communities on temperate rocky reefs in southern California. eDNA provided a more detailed picture of the main sources of spatial variation in both taxonomic richness and community turnover, which primarily arose due to strong species filtering within and among rocky reefs. As expected, eDNA detected more taxa at the regional scale (69 vs. 38), which accumulated quickly with space and plateaued at only ~11 samples. Conversely, the discovery rate of new taxa was slower with no sign of saturation for UVC. Based on historical records in the region (2000–2018) we found that 6.9 times more UVC samples would be required to detect 50 taxa compared to eDNA. Our results show that eDNA metabarcoding can outperform diver counts to capture the spatial patterns in biodiversity at fine scales with less field effort and more power than traditional methods, supporting the notion that eDNA is a critical scientific tool for detecting biodiversity changes in aquatic ecosystems.