

Title: Consilience Across Multiple, Independent Genomic Data Sets Reveals Species in a Complex with Limited Phenotypic Variation
Abstract

Species delimitation in the genomic era has focused predominantly on the application of multiple analytical methodologies to a single massive parallel sequencing (MPS) data set, rather than leveraging the unique but complementary insights provided by different classes of MPS data. In this study, we demonstrate how the use of two independent MPS data sets, a sequence capture data set and a single-nucleotide polymorphism (SNP) data set generated via genotyping-by-sequencing, enables the resolution of species in three complexes belonging to the grass genus Ehrharta, whose strong population structure and subtle morphological variation limit the effectiveness of traditional species delimitation approaches. Sequence capture data are used to construct a comprehensive phylogenetic tree of Ehrharta and to resolve population relationships within the focal clades, while SNP data are used to detect patterns of gene pool sharing across populations, using a novel approach that visualizes multiple values of K. Given that the two genomic data sets are independent, the strong congruence in the clusters they resolve provides powerful ratification of species boundaries in all three complexes studied. Our approach is also able to resolve a number of single-population species and a probable hybrid species, both of which would be difficult to detect and characterize using a single MPS data set. Overall, the data reveal the existence of 11 and five species in the E. setacea and E. rehmannii complexes, with the E. ramosa complex requiring further sampling before species limits are finalized. Despite phenotypic differentiation being generally subtle, true crypsis is limited to just a few species pairs and triplets. We conclude that, in the absence of strong morphological differentiation, the use of multiple, independent genomic data sets is necessary in order to provide the cross-data set corroboration that is foundational to an integrative taxonomic approach. 
[Species delimitation; genotyping-by-sequencing; population structure; integrative taxonomy; cryptic species; Ehrharta (Poaceae).]

 
Award ID(s):
1937604
NSF-PAR ID:
10479465
Editor(s):
Yang, Ya
Publisher / Repository:
Oxford Academic
Journal Name:
Systematic Biology
Volume:
72
Issue:
4
ISSN:
1063-5157
Page Range / eLocation ID:
753 to 766
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Marine metapopulations often exhibit subtle population structure that can be difficult to detect. Given recent advances in high‐throughput sequencing, an emerging question is whether various genotyping approaches, in concert with improved sampling designs, will substantially improve our understanding of genetic structure in the sea. To address this question, we explored hierarchical patterns of structure in the coral reef fish Elacatinus lori using a high‐resolution approach with respect to both genetic and geographic sampling. Previously, we identified three putative E. lori populations within Belize using traditional genetic markers and sparse geographic sampling: barrier reef and Turneffe Atoll; Glover's Atoll; and Lighthouse Atoll. Here, we systematically sampled individuals at ~10 km intervals throughout these reefs (1,129 individuals from 35 sites) and sequenced all individuals at three sets of markers: 2,418 SNPs; 89 microsatellites; and 57 nonrepetitive nuclear loci. At broad spatial scales, the markers were consistent with each other and with previous findings. At finer spatial scales, there was new evidence of genetic substructure, but our three marker sets differed slightly in their ability to detect these patterns. Specifically, we found subtle structure between the barrier reef and Turneffe Atoll, with SNPs resolving this pattern most effectively. We also documented isolation by distance within the barrier reef. Sensitivity analyses revealed that the number of loci (and alleles) had a strong effect on the detection of structure for all three marker sets, particularly at small spatial scales. Taken together, these results illustrate empirically that high‐throughput genotyping data can elucidate subtle genetic structure at previously‐undetected scales in a dispersive marine fish.

     
  2. Lineage-based species definitions applying coalescent approaches to species delimitation have become increasingly popular. Yet, the application of these methods and the recognition of lineage-only definitions have recently been questioned. Species delimitation criteria that explicitly consider both lineages and evidence for ecological role shifts provide an opportunity to incorporate ecologically meaningful data from multiple sources in studies of species boundaries. Here, such criteria were applied to a problematic group of mycoheterotrophic orchids, the Corallorhiza striata complex, analysing genomic, morphological, phenological, reproductive-mode, niche, and fungal host data. A recently developed method for generating genomic polymorphism data, ISSRseq, demonstrates evidence for four distinct lineages, including a previously unidentified lineage in the Coast Ranges and Cascades of California and Oregon, USA. There is divergence in morphology, phenology, reproductive mode, and fungal associates among the four lineages. Integrative analyses, conducted in population assignment and redundancy analysis frameworks, provide evidence of distinct genomic lineages and a similar pattern of divergence in the extended data, albeit with weaker signal. However, none of the extended data sets fully satisfies the condition of a significant role shift, which requires evidence of fixed differences. The four lineages identified in the current study are recognized at the level of variety, short of comprising different species. This study represents the most comprehensive application of lineage + role to date and illustrates the advantages of such an approach.
  3. Abstract

    Significant advances have been made in species delimitation and numerous methods can test precisely defined models of speciation, though the synthesis of phylogeography and taxonomy is still sometimes incomplete. Emerging consensus treats distinct genealogical clusters in genome-scale data as strong initial evidence of speciation in most cases, a hypothesis that must therefore be falsified under an explicit evolutionary model. We can now test speciation hypotheses linking trait differentiation to specific mechanisms of divergence with increasingly large data sets. Integrative taxonomy can, therefore, reflect an understanding of how each axis of variation relates to underlying speciation processes, with nomenclature for distinct evolutionary lineages. We illustrate this approach here with Seal Salamanders (Desmognathus monticola) and introduce a new unsupervised machine-learning approach for species delimitation. Plethodontid salamanders are renowned for their morphological conservatism despite extensive phylogeographic divergence. We discover 2 geographic genetic clusters, for which demographic and spatial models of ecology and gene flow provide robust support for ecogeographic speciation despite limited phenotypic divergence. These data are integrated under evolutionary mechanisms (e.g., spatially localized gene flow with reduced migration) and reflected in emergent properties expected under models of reinforcement (e.g., ethological isolation and selection against hybrids). Their genetic divergence is prima facie evidence for species-level distinctiveness, supported by speciation models and divergence along axes such as behavior, geography, and climate that suggest an ecological basis with subsequent reinforcement through prezygotic isolation. As data sets grow more comprehensive, species-delimitation models can be tested, rejected, or corroborated as explicit speciation hypotheses, providing for reciprocal illumination of evolutionary processes and integrative taxonomies. [Desmognathus; integrative taxonomy; machine learning; species delimitation.]

     
  4. Abstract

    The relative roles of rivers versus refugia in shaping the high levels of species diversity in tropical rainforests have been widely debated for decades. Only recently has it become possible to take an integrative approach to test predictions derived from these hypotheses using genomic sequencing and paleo‐species distribution modeling. Herein, we tested the predictions of the classic river, refuge, and river‐refuge hypotheses on diversification in the arboreal sub‐Saharan African snake genus Toxicodryas. We used dated phylogeographic inferences, population clustering analyses, demographic model selection, and paleo‐distribution modeling to conduct a phylogenomic and historical demographic analysis of this genus. Our results revealed significant population genetic structure within both Toxicodryas species, corresponding geographically to river barriers and divergence times from the mid‐Miocene to Pliocene. Our demographic analyses supported the interpretation that rivers are indications of strong barriers to gene flow among populations since their divergence. Additionally, we found no support for a major contraction of suitable habitat during the last glacial maximum, allowing us to reject both the refuge and river‐refuge hypotheses in favor of the river‐barrier hypothesis. Based on conservative interpretations of our species delimitation analyses with the Sanger and ddRAD data sets, two new cryptic species are identified from east‐central Africa. This study highlights the complexity of diversification dynamics in the African tropics and the advantages of integrative approaches to studying speciation in tropical regions.

     
  5. Phylogenomic investigations of biodiversity facilitate the detection of fine-scale population genetic structure and the demographic histories of species and populations. However, determining whether or not the genetic divergence measured among populations reflects species-level differentiation remains a central challenge in species delimitation. One potential solution is to compare genetic divergence between putative new species with other closely related species, sometimes referred to as a reference-based taxonomy. To be described as a new species, a population should be at least as divergent as other species. Here, we develop a reference-based taxonomy for Horned Lizards (Phrynosoma; 17 species) using phylogenomic data (ddRADseq data) to provide a framework for delimiting species in the Greater Short-horned Lizard species complex (P. hernandesi). Previous species delimitation studies of this species complex have produced conflicting results, with morphological data suggesting that P. hernandesi consists of five species, whereas mitochondrial DNA support anywhere from 1 to 10+ species. To help address this conflict, we first estimated a time-calibrated species tree for P. hernandesi and close relatives using SNP data. These results support the paraphyly of P. hernandesi; we recommend the recognition of two species to promote a taxonomy that is consistent with species monophyly. There is strong evidence for three populations within P. hernandesi, and demographic modeling and admixture analyses suggest that these populations are not reproductively isolated, which is consistent with previous morphological analyses that suggest hybridization could be common. Finally, we characterize the population-species boundary by quantifying levels of genetic divergence for all 18 Phrynosoma species. Genetic divergence measures for western and southern populations of P. hernandesi failed to exceed those of other Phrynosoma species, but the relatively small population size estimated for the northern population causes it to appear as a relatively divergent species. These comparisons underscore the difficulties associated with putting a reference-based approach to species delimitation into practice. Nevertheless, the reference-based approach offers a promising framework for the consistent assessment of biodiversity within clades of organisms with similar life histories and ecological traits.