Title: Incorporating the speciation process into species delimitation
The “multispecies” coalescent (MSC) model that underlies many genomic species-delimitation approaches is problematic because it does not distinguish between genetic structure associated with species versus that of populations within species. Consequently, as both the genomic and spatial resolution of data increases, a proliferation of artifactual species results as within-species population lineages, detected due to restrictions in gene flow, are identified as distinct species. The toll of this extends beyond systematic studies, getting magnified across the many disciplines that rely upon an accurate framework of identified species. Here we present the first of a new class of approaches that addresses this issue by incorporating an extended speciation process for species delimitation. We model the formation of population lineages and their subsequent development into independent species as separate processes and provide for a way to incorporate current understanding of the species boundaries in the system through specification of species identities of a subset of population lineages. As a result, species boundaries and within-species lineage boundaries can be discriminated across the entire system, and species identities can be assigned to the remaining lineages of unknown affinities with quantified probabilities. In addition to the identification of species units in nature, the primary goal of species delimitation, the incorporation of a speciation model also allows us insights into the links between population and species-level processes. By explicitly accounting for restrictions in gene flow not only between, but also within, species, we also address the limits of genetic data for delimiting species. Specifically, while genetic data alone is not sufficient for accurate delimitation, when considered in conjunction with other information we are able to not only learn about species boundaries, but also about the tempo of the speciation process itself.
Authors:
Editors:
Barraclough, Timothy G.
Award ID(s):
1655607
Publication Date:
NSF-PAR ID:
10229731
Journal Name:
PLoS computational biology
Volume:
17
Issue:
5
Page Range or eLocation-ID:
e1008924
ISSN:
1553-734X
Sponsoring Org:
National Science Foundation
More Like this
  1. Smith, Stephen (Ed.)
    Abstract Understanding how gene flow affects population divergence and speciation remains challenging. Differentiating one evolutionary process from another can be difficult because multiple processes can produce similar patterns, and more than one process can occur simultaneously. Although simple population models produce predictable results, how these processes balance in taxa with patchy distributions and complicated natural histories is less certain. These types of populations might be highly connected through migration (gene flow), but can experience stronger effects of genetic drift and inbreeding, or localized selection. Although different signals can be difficult to separate, the application of high-throughput sequence data can provide the resolution necessary to distinguish many of these processes. We present whole-genome sequence data for an avian species group with an alpine and arctic tundra distribution to examine the role that different population genetic processes have played in their evolutionary history. Rosy-finches inhabit high elevation mountaintop sky islands and high-latitude island and continental tundra. They exhibit extensive plumage variation coupled with low levels of genetic variation. Additionally, the number of species within the complex is debated, making them excellent for studying the forces involved in the process of diversification, as well as an important species group in which to investigate species boundaries. Total genomic variation suggests a broadly continuous pattern of allele frequency changes across the mainland taxa of this group in North America. However, phylogenomic analyses recover multiple distinct, well supported, groups that coincide with previously described morphological variation and current species-level taxonomy. Tests of introgression using D-statistics and approximate Bayesian computation reveal significant levels of introgression between multiple North American taxa. These results provide insight into the balance between divergent and homogenizing population genetic processes and highlight remaining challenges in interpreting conflict between different types of analytical approaches with whole-genome sequence data. [ABBA-BABA; approximate Bayesian computation; gene flow; phylogenomics; speciation; whole-genome sequencing.]
  2. In cryptic amphibian complexes, there is a growing trend to equate high levels of genetic structure with hidden cryptic species diversity. Typically, phylogenetic structure and distance-based approaches are used to demonstrate the distinctness of clades and justify the recognition of new cryptic species. However, this approach does not account for gene flow, spatial, and environmental processes that can obfuscate phylogenetic inference and bias species delimitation. As a case study, we sequenced genome-wide exons and introns to evince the processes that underlie the diversification of Philippine Puddle Frogs—a group that is widespread, phenotypically conserved, and exhibits high levels of geographically based genetic structure. We showed that widely adopted tree- and distance-based approaches inferred up to 20 species, compared to genomic analyses that inferred an optimal number of five distinct genetic groups. Using a suite of clustering, admixture, and phylogenetic network analyses, we demonstrate extensive admixture among the five groups and elucidate two specific ways in which gene flow can cause overestimations of species diversity: 1) admixed populations can be inferred as distinct lineages characterized by long branches in phylograms; and 2) admixed lineages can appear to be genetically divergent, even from their parental populations when simple measures of genetic distance are used. We demonstrate that the relationship between mitochondrial and genome-wide nuclear p-distances is decoupled in admixed clades, leading to erroneous estimates of genetic distances and, consequently, species diversity. Additionally, genetic distance was also biased by spatial and environmental processes. Overall, we showed that high levels of genetic diversity in Philippine Puddle Frogs predominantly comprise metapopulation lineages that arose through complex patterns of admixture, isolation-by-distance, and isolation-by-environment as opposed to species divergence. Our findings suggest that speciation may not be the major process underlying the high levels of hidden diversity observed in many taxonomic groups and that widely adopted tree- and distance-based methods overestimate species diversity in the presence of gene flow.
  4. Identifying the evolutionary and ecological mechanisms that drive lineage diversification in the species-rich tropics is of broad interest to evolutionary biologists. Here, we use phylogeographic and demographic analyses of genomic-scale RADseq data to assess the impact of a large geographic feature, the Amazon River, on lineage formation in a venomous pitviper, Bothrops atrox. We compared genetic differentiation in samples from four sites near Santarem, Brazil that spanned the Amazon and represented major habitat types. A species delimitation analysis identified each population as a distinct evolutionary lineage while a species tree analysis with populations as taxa revealed a phylogenetic tree consistent with dispersal across the Amazon from north to south. Phylogenetic analyses of mtDNA variation confirmed this pattern and suggest that all lineages originated during the mid- to late-Pleistocene. Historical demographic analyses support a population model of lineage formation through isolation between lineages with low ongoing migration between large populations and reject a model of differentiation through isolation by distance alone. Our results provide a rare example of a phylogeographic pattern demonstrating dispersal over evolutionary time scales across a large tropical river and suggest a role for the Amazon River as a driver of in-situ divergence by both impeding (but not preventing) gene flow and through parapatric differentiation along an ecological gradient.
  5. Abstract Background

    Although originally thought to evolve clonally, studies have revealed that most bacteria exchange DNA. However, it remains unclear to what extent gene flow shapes the evolution of bacterial genomes and maintains the cohesion of species.

    Results

    Here, we analyze the patterns of gene flow within and between >2600 bacterial species. Our results show that fewer than 10% of bacterial species are truly clonal, indicating that purely asexual species are rare in nature. We further demonstrate that the taxonomic criterion of ~95% genome sequence identity routinely used to define bacterial species does not accurately represent a level of divergence that imposes an effective barrier to gene flow across bacterial species. Interruption of gene flow can occur at various sequence identities across lineages, generally from 90 to 98% genome identity. This likely explains why a ~95% genome sequence identity threshold has empirically been judged as a good approximation to define bacterial species. Our results support a universal mechanism where the availability of identical genomic DNA segments required to initiate homologous recombination is the primary determinant of gene flow and species boundaries in bacteria. We show that these barriers of gene flow remain porous since many distinct species maintain some level of gene flow, similar to introgression in sexual organisms.

    Conclusions

    Overall, bacterial evolution and speciation are likely shaped by similar forces driving the evolution of sexual organisms. Our findings support a model where the interruption of gene flow—although not necessarily the initial cause of speciation—leads to the establishment of permanent and irreversible species borders.
