Hybridization events complicate the accurate reconstruction of phylogenies, as they lead to patterns of genetic heritability that are unexpected under traditional, bifurcating models of species trees. This phenomenon has led to the development of methods to infer these varied hybridization events, including both methods that reconstruct networks directly and summary methods that predict individual hybridization events from a subset of taxa. However, a lack of empirical comparisons between methods – especially those pertaining to large networks with varied hybridization scenarios – hinders their practical use. Here, we provide a comprehensive review of popular summary methods: TICR, MSCquartets, HyDe, Patterson’s D-Statistic (ABBA-BABA), D3, and Dp. TICR and MSCquartets are based on quartet concordance factors gathered from gene tree topologies, whereas HyDe, Patterson’s D-Statistic, D3, and Dp use site pattern frequencies to identify hybridization events between sets of three taxa. We then use simulated data to address questions of method accuracy and ideal use scenarios by testing methods against complex networks which depict gene flow events that differ in depth (timing), quantity (single vs. multiple, overlapping hybridizations), and rate of gene flow (γ). We find that deeper or multiple hybridization events may introduce noise and weaken the signal of hybridization, leading to higher relative false negative rates across all methods. Despite some forms of hybridization eluding quartet-based detection methods, MSCquartets displays high precision in most scenarios. While HyDe results in high false negative rates when tested on hybridizations involving extinct or unsampled ghost lineages, HyDe is the only method able to identify the direction of hybridization, distinguishing the source parental lineages from recipient hybrid lineages. Lastly, we test the methods on a dataset of ultraconserved elements from the bee subfamily Nomiinae, finding possible hybridization events between clades which correspond to regions of poor support in the species tree estimated in a previous study.
Summary Tests of Introgression Are Highly Sensitive to Rate Variation Across Lineages
Abstract The evolutionary implications and frequency of hybridization and introgression are increasingly being recognized across the tree of life. To detect hybridization from multi-locus and genome-wide sequence data, a popular class of methods is based on summary statistics from subsets of 3 or 4 taxa. However, these methods often carry the assumption of a constant substitution rate across lineages and genes, which is commonly violated in many groups. In this work, we quantify the effects of rate variation on the D test (also known as ABBA–BABA test), the D3 test, and HyDe. All 3 tests are used widely across a range of taxonomic groups, in part because they are very fast to compute. We consider rate variation across species lineages, across genes, their lineage-by-gene interaction, and rate variation across gene-tree edges. We simulated species networks according to a birth–death-hybridization process, so as to capture a range of realistic species phylogenies. For all 3 methods tested, we found a marked increase in the false discovery of reticulation (type-1 error rate) when there is rate variation across species lineages. The D3 test was the most sensitive, with around 80% type-1 error, such that D3 appears to be more sensitive to a departure from the clock than to the presence of reticulation. For all 3 tests, the power to detect hybridization events decreased as the number of hybridization events increased, indicating that multiple hybridization events can obscure one another if they occur within a small subset of taxa. Our study highlights the need to consider rate variation when using site-based summary statistics, and points to the advantages of methods that do not require assumptions on evolutionary rates across lineages or across genes.
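As a rough illustration of the statistics named in the abstract, the sketch below computes Patterson's D from ABBA and BABA site-pattern counts and the D3 statistic from pairwise distances, using their commonly cited definitions. It is not the authors' pipeline: the function names, the toy sequences, and the distance values are hypothetical, and no significance testing is performed. The last lines hint at why a clock violation alone can shift D3 away from zero.

```python
# Minimal sketch, assuming the rooted triple ((P1, P2), P3) with outgroup O.
# All names and toy values here are illustrative assumptions.

VALID = set("ACGT")


def patterson_d(p1: str, p2: str, p3: str, outgroup: str) -> float:
    """Return D = (nABBA - nBABA) / (nABBA + nBABA) from aligned sequences."""
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Skip gaps and ambiguity codes.
        if not {a, b, c, o} <= VALID:
            continue
        # ABBA: P2 and P3 share the derived allele, P1 matches the outgroup.
        if a == o and b == c and b != o:
            abba += 1
        # BABA: P1 and P3 share the derived allele, P2 matches the outgroup.
        elif b == o and a == c and a != o:
            baba += 1
    total = abba + baba
    return 0.0 if total == 0 else (abba - baba) / total


def d3(d_ac: float, d_bc: float) -> float:
    """Return D3 = (d_BC - d_AC) / (d_BC + d_AC) from pairwise distances."""
    return (d_bc - d_ac) / (d_bc + d_ac)


# Toy alignment: two ABBA sites, one BABA site, one invariant site.
d_value = patterson_d("AACA", "CCAA", "CCCA", "AAAA")

# Equal distances give D3 = 0; a faster-evolving lineage A inflates d_AC
# and moves D3 away from zero even though no gene flow was modeled.
d3_clock = d3(0.10, 0.10)
d3_fast_a = d3(0.12, 0.10)
```

In this toy case the nonzero `d3_fast_a` arises purely from unequal distances, which is the kind of departure from the clock that the abstract reports as a source of type-1 error.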
- PAR ID: 10480420
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Systematic Biology
- Volume: 72
- Issue: 6
- ISSN: 1063-5157
- Format(s): Medium: X; Size: p. 1357-1369
- Sponsoring Org: National Science Foundation
More Like this
Abstract A fundamental assumption of evolutionary biology is that phylogeny follows a bifurcating process. However, hybrid speciation and introgression are becoming more widely documented in many groups. Hybrid inference studies have been historically limited to small sets of taxa, while the prevalence and trends of reticulation at deep time scales remain unexplored. We study the evolutionary history of an adaptive radiation of 109 gemsnakes in Madagascar (Pseudoxyrhophiinae) to identify potential instances of introgression. Using several network inference methods, we find 12 reticulation events within the 22-million-year evolutionary history of gemsnakes, producing 28% of the diversity for the group, including one reticulation that resulted in the diversification of an 18 species radiation. These reticulations are found at nodes with high gene tree discordance and occurred among parental lineages distributed along a north-south axis that share similar ecologies. Younger hybrids occupy intermediate contact zones between the parent lineages, showing that post-speciation dispersal in this group has not eroded the spatial signatures of introgression. Reticulations accumulated consistently over time, despite drops in overall speciation rates during the Pleistocene. This suggests that while bifurcating speciation rates may decline as the result of species accumulation and environmental change, speciation by hybridization may be more robust to these processes.
The development of statistical methods to infer species phylogenies with reticulations (species networks) has led to many discoveries of gene flow between distinct species. These methods typically assume only incomplete lineage sorting and introgression. Given that phylogenetic networks can be arbitrarily complex, these methods might compensate for model misspecification by increasing the number of dimensions beyond the true value. Herein, we explore the effect of potential model misspecification, including the neglect of gene tree estimation error (GTEE) and the assumption of a single substitution rate for all genomic loci, on the accuracy of phylogenetic network inference using both simulated and biological data. In particular, we assess the accuracy of estimated phylogenetic networks as well as test statistics for determining whether a network is the correct evolutionary history, as opposed to the simpler model that is a tree. We found that while GTEE negatively impacts the performance of test statistics to determine the “treeness” of the evolutionary history of a data set, running those tests on triplets of taxa and correcting for multiple testing significantly ameliorates the problem. We also found that accounting for substitution rate heterogeneity improves the reliability of full Bayesian inference methods of phylogenetic networks, whereas summary statistic methods are robust to GTEE and rate heterogeneity, though they currently require manual inspection to determine the network complexity.
Ancient hybridization leads to the repeated evolution of red flowers across a monkeyflower radiation
Abstract The reuse of old genetic variation can promote rapid diversification in evolutionary radiations, but in most cases, the historical events underlying this divergence are not known. For example, ancient hybridization can generate new combinations of alleles that sort into descendant lineages, potentially providing the raw material to initiate divergence. In the Mimulus aurantiacus species complex, there is evidence for widespread gene flow among members of this radiation. In addition, allelic variation in the MaMyb2 gene is responsible for differences in flower color between the closely related ecotypes of subspecies puniceus, contributing to reproductive isolation by pollinators. Previous work suggested that MaMyb2 was introgressed into the red-flowered ecotype of puniceus. However, additional taxa within the radiation have independently evolved red flowers from their yellow-flowered ancestors, raising the possibility that this introgression had a more ancient origin. In this study, we used repeated tests of admixture from whole-genome sequence data across this diverse radiation to demonstrate that there has been both ancient and recurrent hybridization in this group. However, most of the signal of this ancient introgression has been removed due to selection, suggesting that widespread barriers to gene flow are in place between taxa. Yet, a roughly 30 kb region that contains the MaMyb2 gene is currently shared only among the red-flowered taxa. Patterns of admixture, sequence divergence, and extended haplotype homozygosity across this region confirm a history of ancient hybridization, where functional variants have been preserved due to positive selection in red-flowered taxa but lost in their yellow-flowered counterparts. The results of this study reveal that selection against gene flow can reduce genomic signatures of ancient hybridization, but that historical introgression can provide essential genetic variation that facilitates the repeated evolution of phenotypic traits between lineages.
Abstract Butterfly eyes are complex organs that are composed of a diversity of proteins, and they play a central role in visual signaling and, ultimately, speciation and adaptation. Here, we utilized the whole eye transcriptome to obtain a more holistic view of the evolution of the butterfly eye while accounting for speciation events that co-occur with ancient hybridization. We sequenced and assembled transcriptomes from adult female eyes of eight species representing all major clades of the Heliconius genus and an additional outgroup species, Dryas iulia. We identified 4,042 orthologous genes shared across all transcriptome data sets and constructed a transcriptome-wide phylogeny, which revealed topological discordance with the mitochondrial phylogenetic tree in the Heliconius pupal mating clade. We then estimated introgression among lineages using additional genome data and found evidence for ancient hybridization leading to the common ancestor of Heliconius hortense and Heliconius clysonymus. We estimated the Ka/Ks ratio for each orthologous cluster and performed further tests to identify genes showing evidence of adaptive protein evolution. Furthermore, we characterized patterns of expression for a subset of these positively selected orthologs using qRT-PCR. Taken together, we identified candidate eye genes that show signatures of adaptive molecular evolution and provide evidence of their expression divergence between species, tissues, and sexes. Our results 1) demonstrate greater evolutionary changes in younger Heliconius lineages, that is, more positively selected genes in the cydno–melpomene–hecale group as opposed to the sara–hortense–erato group, and 2) suggest an ancient hybridization leading to speciation among Heliconius pupal-mating species.
