

Title: Widely used, short 16S rRNA mitochondrial gene fragments yield poor and erratic results in phylogenetic estimation and species delimitation of amphibians
Abstract
Background The 16S mitochondrial rRNA gene is the most widely sequenced molecular marker in amphibian systematic studies, making it comparable to the universal CO1 barcode that is more commonly used in other animal groups. However, studies employ different primer combinations that target different lengths/regions of the 16S gene ranging from complete gene sequences (~1500 bp) to short fragments (~500 bp), the latter of which is the most ubiquitously used. Sequences of different lengths are often concatenated, compared, and/or jointly analyzed to infer phylogenetic relationships, estimate genetic divergence (p-distances), and justify the recognition of new species (species delimitation), making the 16S gene region, by far, the most influential molecular marker in amphibian systematics. Despite their ubiquitous and multifarious use, no studies have ever been conducted to evaluate the congruence and performance among the different fragment lengths.
Results Using empirical data derived from both Sanger-based and genomic approaches, we show that full-length 16S sequences recover the most accurate phylogenetic relationships, highest branch support, lowest variation in genetic distances (pairwise p-distances), and best-scoring species delimitation partitions. In contrast, widely used short fragments produce inaccurate phylogenetic reconstructions, lower and more variable branch support, erratic genetic distances, and low-scoring species delimitation partitions, the numbers of which are vastly overestimated. The relatively poor performance of short 16S fragments is likely due to insufficient phylogenetic information content.
Conclusions Taken together, our results demonstrate that short 16S fragments are unable to match the efficacy achieved by full-length sequences in terms of topological accuracy, heuristic branch support, genetic divergences, and species delimitation partitions, and thus, phylogenetic and taxonomic inferences that are predicated on short 16S fragments should be interpreted with caution. However, short 16S fragments can still be useful for species identification, rapid assessments, or definitively coupling complex life stages in natural history studies and faunal inventories. While the full 16S sequence performs best, it requires the use of several primer pairs, which increases cost, time, and effort. As a compromise, our results demonstrate that practitioners should utilize medium-length primers in favor of the short-fragment primers because they have the potential to markedly improve phylogenetic inference and species delimitation without additional cost.
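For illustration only: the uncorrected p-distance mentioned in the abstract is the proportion of aligned sites at which two sequences differ. The Python sketch below is not the authors' pipeline. The function name `p_distance`, the choice to skip gaps and ambiguous bases (pairwise deletion), and the toy sequences are all assumptions made here to show how the same pair of sequences can yield different distances depending on which fragment is compared.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: fraction of compared aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: skip gaps and ambiguous bases (pairwise deletion).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned sequences, invented for illustration (not real 16S data).
full_a = "ACGTACGTACGTACGTACGT"
full_b = "ACGTACGAACGTACGTTCGT"

print("full alignment:", p_distance(full_a, full_b))
print("fragment 0-8:  ", p_distance(full_a[:8], full_b[:8]))
print("fragment 8-16: ", p_distance(full_a[8:16], full_b[8:16]))
```

On these toy sequences the full alignment gives 0.1, while the two shorter windows give 0.125 and 0.0. This is only a schematic of why distances from short fragments can be more variable; it says nothing about the magnitude of the effect reported in the study.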
Award ID(s):
1654388
PAR ID:
10427870
Journal Name:
BMC Ecology and Evolution
Volume:
22
Issue:
1
ISSN:
2730-7182
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Gilbert, Jack A. (Ed.)
    ABSTRACT Small subunit rRNA (SSU rRNA) amplicon sequencing can quantitatively and comprehensively profile natural microbiomes, representing a critically important tool for studying diverse global ecosystems. However, results will only be accurate if PCR primers perfectly match the rRNA of all organisms present. To evaluate how well marine microorganisms across all 3 domains are detected by this method, we compared commonly used primers with >300 million rRNA gene sequences retrieved from globally distributed marine metagenomes. The best-performing primers compared to 16S rRNA of bacteria and archaea were 515Y/926R and 515Y/806RB, which perfectly matched over 96% of all sequences. Considering cyanobacterial and chloroplast 16S rRNA, 515Y/926R had the highest coverage (99%), making this set ideal for quantifying marine primary producers. For eukaryotic 18S rRNA sequences, 515Y/926R also performed best (88%), followed by V4R/V4RB (18S rRNA specific; 82%), demonstrating that the 515Y/926R combination performs best overall for all 3 domains. Using Atlantic and Pacific Ocean samples, we demonstrate high correspondence between 515Y/926R amplicon abundances (generated for this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating amplicons can produce equally accurate community composition data compared with shotgun metagenomics. Our analysis also revealed that expected performance of all primer sets could be improved with minor modifications, pointing toward a nearly completely universal primer set that could accurately quantify biogeochemically important taxa in ecosystems ranging from the deep sea to the surface. In addition, our reproducible bioinformatic workflow can guide microbiome researchers studying different ecosystems or human health to similarly improve existing primers and generate more accurate quantitative amplicon data.
IMPORTANCE PCR amplification and sequencing of marker genes is a low-cost technique for monitoring prokaryotic and eukaryotic microbial communities across space and time but will work optimally only if environmental organisms match PCR primer sequences exactly. In this study, we evaluated how well primers match globally distributed short-read oceanic metagenomes. Our results demonstrate that primer sets vary widely in performance, and that at least for marine systems, rRNA amplicon data from some primers lack significant biases compared to metagenomes. We also show that it is theoretically possible to create a nearly universal primer set for diverse saline environments by defining a specific mixture of a few dozen oligonucleotides, and present a software pipeline that can guide rational design of primers for any environment with available meta’omic data. 
  2. Abstract

    Telomere length dynamics are an established biomarker of health and ageing in animals. The study of telomeres in numerous species has been facilitated by methods to measure telomere length by real‐time quantitative PCR (qPCR). In this method, telomere length is determined by quantifying the amount of telomeric DNA repeats in a sample and normalizing this to the total amount of genomic DNA. This normalization requires the development of genomic reference primers suitable for qPCR, which remains challenging in nonmodel organisms with genomes that have not been sequenced. Here we report reference primers that can be used in qPCR to measure telomere lengths in any vertebrate species. We designed primer pairs to amplify genetic elements that are highly conserved between evolutionarily distant taxa and tested them in species that span the vertebrate tree of life. We report five primer pairs that meet the specificity and reproducibility standards of qPCR. In addition, we demonstrate an approach to choose the best primers for a given species by testing the primers on multiple individuals within a species and then applying an established computational tool. These reference primers can facilitate qPCR‐based telomere length measurements in any vertebrate species of ecological or economic interest.

     
  3. Burbrink, Frank (Ed.)
    Abstract In cryptic amphibian complexes, there is a growing trend to equate high levels of genetic structure with hidden cryptic species diversity. Typically, phylogenetic structure and distance-based approaches are used to demonstrate the distinctness of clades and justify the recognition of new cryptic species. However, this approach does not account for gene flow, spatial, and environmental processes that can obfuscate phylogenetic inference and bias species delimitation. As a case study, we sequenced genome-wide exons and introns to evince the processes that underlie the diversification of Philippine Puddle Frogs, a group that is widespread, phenotypically conserved, and exhibits high levels of geographically based genetic structure. We showed that widely adopted tree- and distance-based approaches inferred up to 20 species, compared to genomic analyses that inferred an optimal number of five distinct genetic groups. Using a suite of clustering, admixture, and phylogenetic network analyses, we demonstrate extensive admixture among the five groups and elucidate two specific ways in which gene flow can cause overestimations of species diversity: 1) admixed populations can be inferred as distinct lineages characterized by long branches in phylograms; and 2) admixed lineages can appear to be genetically divergent, even from their parental populations when simple measures of genetic distance are used. We demonstrate that the relationship between mitochondrial and genome-wide nuclear p-distances is decoupled in admixed clades, leading to erroneous estimates of genetic distances and, consequently, species diversity. Additionally, genetic distance was also biased by spatial and environmental processes. Overall, we showed that high levels of genetic diversity in Philippine Puddle Frogs predominantly comprise metapopulation lineages that arose through complex patterns of admixture, isolation-by-distance, and isolation-by-environment as opposed to species divergence.
Our findings suggest that speciation may not be the major process underlying the high levels of hidden diversity observed in many taxonomic groups and that widely adopted tree- and distance-based methods overestimate species diversity in the presence of gene flow. [Cryptic species; gene flow; introgression; isolation-by-distance; isolation-by-environment; phylogenetic network; species delimitation.] 
  4. Abstract

    Using sequences from 2,615 ultraconserved element (UCE) loci and multiple methodologies we inferred phylogenies for the largest genetic data set of New World bats in the genus Myotis to date. The resulting phylogenetic trees were populated with short branch lengths and widespread conflict, hallmarks consistent with rapid adaptive radiations. The degree of conflict observed in Myotis has likely contributed to difficulties disentangling deeper evolutionary relationships. Unlike earlier phylogenies based on 1 to 2 gene sequences, this UCE data set places M. brandtii outside the New World clades. Introgression testing of a small subset of our samples revealed evidence of historical but not contemporary gene flow, suggesting that hybridization occurs less frequently in the Neotropics than the Nearctic. We identified several instances of cryptic lineages within described species as well as several instances of potential taxonomic oversplitting. Evidence from Central and South American localities suggests that diversity in those regions is not fully characterized. In light of the accumulated evidence of the evolutionary complexity in Myotis and our survey of the taxonomic implications from our phylogenies, it is apparent that the definition of species and regime of species delimitation need to be reevaluated for Myotis. This will require substantial collaboration and sample sharing between geneticists and taxonomists to build a system that is both robust and applicable in a genus as diverse as Myotis.

     
  5. Munderloh, Ulrike Gertrud (Ed.)
    Microorganisms, including rotifers, are thought to be capable of long distance dispersal. Therefore, they should show little population genetic structure due to high gene flow. Nevertheless, substantial genetic structure has been reported among populations of many taxa. In rotifers, genetic studies have focused on planktonic taxa leaving sessile groups largely unexplored. Here, we used COI gene and ITS region sequences to study genetic structure and delimit cryptic species in two sessile species (Limnias melicerta [32 populations]; L. ceratophylli [21 populations]). Among populations, ITS region sequences were less variable as compared to those of the COI gene (ITS; L. melicerta: 0–3.1% and L. ceratophylli: 0–4.4%; COI; L. melicerta: 0–22.7% and L. ceratophylli: 0–21.7%). Moreover, L. melicerta and L. ceratophylli were not resolved in phylogenetic analyses based on ITS sequences. Thus, we used COI sequences for species delimitation. Bayesian Species Delimitation detected nine putative cryptic species within L. melicerta and four putative cryptic species for L. ceratophylli. The genetic distance in the COI gene was 0–15.4% within cryptic species of L. melicerta and 0.5–0.6% within cryptic species of L. ceratophylli. Among cryptic species, COI genetic distance ranged 8.1–21.9% for L. melicerta and 15.1–21.2% for L. ceratophylli. The correlation between geographic and genetic distance was weak or lacking; thus geographic isolation cannot be considered a strong driver of genetic variation. In addition, geometric morphometric analyses of trophi did not show significant variation among cryptic species. In this study we used a conservative approach for species delimitation, yet we were able to show that species diversity in these sessile rotifers is underestimated. 