Phylogenomic investigations of biodiversity facilitate the detection of fine-scale population genetic structure and the demographic histories of species and populations. However, determining whether or not the genetic divergence measured among populations reflects species-level differentiation remains a central challenge in species delimitation. One potential solution is to compare the genetic divergence of putative new species with that of other closely related species, sometimes referred to as a reference-based taxonomy. To be described as a new species, a population should be at least as divergent as other species. Here, we develop a reference-based taxonomy for Horned Lizards (Phrynosoma; 17 species) using phylogenomic data (ddRADseq data) to provide a framework for delimiting species in the Greater Short-horned Lizard species complex (P. hernandesi). Previous species delimitation studies of this species complex have produced conflicting results, with morphological data suggesting that P. hernandesi consists of five species, whereas mitochondrial DNA supports anywhere from 1 to 10+ species. To help address this conflict, we first estimated a time-calibrated species tree for P. hernandesi and close relatives using SNP data. These results support the paraphyly of P. hernandesi; we recommend the recognition of two species to promote a taxonomy that is consistent with species monophyly. There is strong evidence for three populations within P. hernandesi, and demographic modeling and admixture analyses suggest that these populations are not reproductively isolated, which is consistent with previous morphological analyses that suggest hybridization could be common. Finally, we characterize the population-species boundary by quantifying levels of genetic divergence for all 18 Phrynosoma species. Genetic divergence measures for western and southern populations of P. hernandesi failed to exceed those of other Phrynosoma species, but the relatively small population size estimated for the northern population causes it to appear as a relatively divergent species. These comparisons underscore the difficulties associated with putting a reference-based approach to species delimitation into practice. Nevertheless, the reference-based approach offers a promising framework for the consistent assessment of biodiversity within clades of organisms with similar life histories and ecological traits.
Consilience Across Multiple, Independent Genomic Data Sets Reveals Species in a Complex with Limited Phenotypic Variation
Abstract Species delimitation in the genomic era has focused predominantly on the application of multiple analytical methodologies to a single massive parallel sequencing (MPS) data set, rather than leveraging the unique but complementary insights provided by different classes of MPS data. In this study, we demonstrate how the use of two independent MPS data sets, a sequence capture data set and a single-nucleotide polymorphism (SNP) data set generated via genotyping-by-sequencing, enables the resolution of species in three complexes belonging to the grass genus Ehrharta, whose strong population structure and subtle morphological variation limit the effectiveness of traditional species delimitation approaches. Sequence capture data are used to construct a comprehensive phylogenetic tree of Ehrharta and to resolve population relationships within the focal clades, while SNP data are used to detect patterns of gene pool sharing across populations, using a novel approach that visualizes multiple values of K. Given that the two genomic data sets are independent, the strong congruence in the clusters they resolve provides powerful ratification of species boundaries in all three complexes studied. Our approach is also able to resolve a number of single-population species and a probable hybrid species, both of which would be difficult to detect and characterize using a single MPS data set. Overall, the data reveal the existence of 11 and five species in the E. setacea and E. rehmannii complexes, with the E. ramosa complex requiring further sampling before species limits are finalized. Despite phenotypic differentiation being generally subtle, true crypsis is limited to just a few species pairs and triplets. We conclude that, in the absence of strong morphological differentiation, the use of multiple, independent genomic data sets is necessary in order to provide the cross-data set corroboration that is foundational to an integrative taxonomic approach. 
[Species delimitation; genotyping-by-sequencing; population structure; integrative taxonomy; cryptic species; Ehrharta (Poaceae).]
- Award ID(s): 1937604
- PAR ID: 10479465
- Editor(s): Yang, Ya
- Publisher / Repository: Oxford Academic
- Date Published:
- Journal Name: Systematic Biology
- Volume: 72
- Issue: 4
- ISSN: 1063-5157
- Page Range / eLocation ID: 753 to 766
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Lineage-based species definitions applying coalescent approaches to species delimitation have become increasingly popular. Yet, the application of these methods and the recognition of lineage-only definitions have recently been questioned. Species delimitation criteria that explicitly consider both lineages and evidence for ecological role shifts provide an opportunity to incorporate ecologically meaningful data from multiple sources in studies of species boundaries. Here, such criteria were applied to a problematic group of mycoheterotrophic orchids, the Corallorhiza striata complex, analysing genomic, morphological, phenological, reproductive-mode, niche, and fungal host data. A recently developed method for generating genomic polymorphism data, ISSRseq, demonstrates evidence for four distinct lineages, including a previously unidentified lineage in the Coast Ranges and Cascades of California and Oregon, USA. There is divergence in morphology, phenology, reproductive mode, and fungal associates among the four lineages. Integrative analyses, conducted in population assignment and redundancy analysis frameworks, provide evidence of distinct genomic lineages and a similar pattern of divergence in the extended data, albeit with weaker signal. However, none of the extended data sets fully satisfy the condition of a significant role shift, which requires evidence of fixed differences. The four lineages identified in the current study are recognized at the level of variety, short of comprising different species. This study represents the most comprehensive application of lineage + role to date and illustrates the advantages of such an approach.
-
In the face of anthropogenic change and the potential loss of species, documenting biodiversity – including accurately delimiting species complexes – is of paramount importance. Genome-wide data are powerful for investigating lineage divergence, though deciding if this divergence represents species-level differentiation remains challenging. Here, we use genome-wide data to investigate species limits in four currently recognized species of Earless Lizards (Phrynosomatidae: Holbrookia), with a focus on H. lacerata and H. subcaudalis, the latter having potentially imperiled populations. This group’s taxonomy has been repeatedly revised; most recently, H. lacerata and H. subcaudalis were elevated to species status using conserved morphological data and a few molecular markers. In this study, we used double-digest restriction-site associated DNA sequencing to delineate species limits for our focal taxa. We recovered five populations that corresponded to five well-supported lineages with very little gene flow among them. Our results support the recognition of H. lacerata and H. subcaudalis as two separate species, based on strong phylogenetic support for these lineages and genetic divergence measures that exceed those of currently recognized species within Holbrookia. Genomic methods for species delimitation offer a promising approach to assess biodiversity in taxonomically confounded taxa or organisms of conservation priority.
-
ABSTRACT Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on another issue, i.e., how to handle highly similar MAGs assembled from independent data sets. Obtaining multiple genomic representatives for a species is highly valuable, as it allows for population genomic analyses; however, when retaining genomes of closely related populations, it complicates MAG quality assessment and abundance inferences. We show that (i) published data sets contain a large fraction of MAGs sharing >99% average nucleotide identity, (ii) different software packages and parameters used to resolve this redundancy remove very different numbers of MAGs, and (iii) the removal of closely related genomes leads to losses of population-specific auxiliary genes. Finally, we highlight some approaches that can infer strain-specific dynamics across a sample series without dereplication.
-
Abstract Understanding how genetic diversity is distributed across spatiotemporal scales in species of conservation or management concern is critical for identifying large‐scale mechanisms affecting local conservation status and implementing large‐scale biodiversity monitoring programmes. However, cross‐scale surveys of genetic diversity are often impractical within single studies, and combining datasets to increase spatiotemporal coverage is frequently impeded by using different sets of molecular markers. Recently developed molecular tools make surveys based on standardized single‐nucleotide polymorphism (SNP) panels more feasible than ever, but require existing genomic information. Here, we conduct the first survey of genome‐wide SNPs across the native range of brook trout (Salvelinus fontinalis), a cold‐adapted species that has been the focus of considerable conservation and management effort across eastern North America. Our dataset can be leveraged to easily design SNP panels that allow datasets to be combined for large‐scale analyses. We performed restriction site‐associated DNA sequencing for wild brook trout from 82 locations spanning much of the native range and domestic brook trout from 24 hatchery strains used in stocking efforts. We identified over 24,000 SNPs distributed throughout the brook trout genome. We explored the ability of these SNPs to resolve relationships across spatial scales, including population structure and hatchery admixture. Our dataset captures a wide spectrum of genetic diversity in native brook trout, offering a valuable resource for developing SNP panels. We highlight potential applications of this resource with the goal of increasing the integration of genomic information into decision‐making for brook trout and other species of conservation or management concern.

