Recent research shows that introgression between closely related species is an important source of adaptive alleles for a wide range of taxa. Typically, detection of adaptive introgression from genomic data relies on comparative analyses that require sequence data from both the recipient and the donor species. However, in many cases, the donor is unknown or its data are not currently available. Here, we introduce a genome-scan method, VolcanoFinder, to detect recent events of adaptive introgression using polymorphism data from the recipient species only. VolcanoFinder detects adaptive introgression sweeps from the pattern of excess intermediate-frequency polymorphism they produce in the flanking region of the genome, a pattern that appears as a volcano shape in pairwise genetic diversity. Using coalescent theory, we derive analytical predictions for these patterns. Based on these results, we develop a composite-likelihood test to detect signatures of adaptive introgression relative to the genomic background. Simulation results show that VolcanoFinder has high statistical power to detect these signatures, even for older sweeps and for soft sweeps initiated by multiple migrant haplotypes. Finally, we apply VolcanoFinder to detect archaic introgression in European and sub-Saharan African human populations and uncover interesting candidates in both populations, such as TSHR in Europeans and TCHH-RPTN in Africans. We discuss their biological implications and provide guidelines for identifying and circumventing artifactual signals during empirical applications of VolcanoFinder.
Phylogenomic approaches to detecting and characterizing introgression
Abstract Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
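As a rough illustration of the simplest test mentioned above, the D-statistic can be sketched as a count of two discordant site patterns on a four-taxon tree (((P1, P2), P3), Outgroup). The function name `d_statistic` and the input layout (one allele per taxon per biallelic site) are assumptions made for this sketch and do not come from the article; a real analysis would also need a significance test, such as a block jackknife, which is omitted here.

```python
import math


def d_statistic(sites):
    """Return Patterson's D from per-site alleles (hypothetical input layout).

    `sites` is an iterable of (p1, p2, p3, outgroup) alleles, one tuple per
    biallelic site. The outgroup allele is treated as ancestral ("A") and
    any other allele as derived ("B").
    """
    abba = 0  # P2 and P3 share the derived allele
    baba = 0  # P1 and P3 share the derived allele
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p2 != out:
            abba += 1
        elif p2 == out and p1 == p3 and p1 != out:
            baba += 1
    total = abba + baba
    if total == 0:
        return math.nan  # no informative sites
    return (abba - baba) / total


# Toy data: 3 ABBA sites, 1 BABA site, 1 uninformative site.
example_sites = (
    [("A", "G", "G", "A")] * 3
    + [("G", "A", "G", "A")]
    + [("A", "A", "A", "A")]
)
print(d_statistic(example_sites))
```

Under incomplete lineage sorting alone, the two patterns are expected to be equally frequent, so D is expected to be near zero; an excess of either pattern is the signal that the test interprets as introgression.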
- Award ID(s): 1936187
- PAR ID: 10362681
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Genetics
- Volume: 220
- Issue: 2
- ISSN: 1943-2631
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Smith, Stephen (Ed.) Abstract Understanding how gene flow affects population divergence and speciation remains challenging. Differentiating one evolutionary process from another can be difficult because multiple processes can produce similar patterns, and more than one process can occur simultaneously. Although simple population models produce predictable results, how these processes balance in taxa with patchy distributions and complicated natural histories is less certain. These types of populations might be highly connected through migration (gene flow), but can experience stronger effects of genetic drift and inbreeding, or localized selection. Although different signals can be difficult to separate, the application of high-throughput sequence data can provide the resolution necessary to distinguish many of these processes. We present whole-genome sequence data for an avian species group with an alpine and arctic tundra distribution to examine the role that different population genetic processes have played in their evolutionary history. Rosy-finches inhabit high-elevation mountaintop sky islands and high-latitude island and continental tundra. They exhibit extensive plumage variation coupled with low levels of genetic variation. Additionally, the number of species within the complex is debated, making them excellent for studying the forces involved in the process of diversification, as well as an important species group in which to investigate species boundaries. Total genomic variation suggests a broadly continuous pattern of allele frequency changes across the mainland taxa of this group in North America. However, phylogenomic analyses recover multiple distinct, well-supported groups that coincide with previously described morphological variation and current species-level taxonomy. Tests of introgression using D-statistics and approximate Bayesian computation reveal significant levels of introgression between multiple North American taxa.
These results provide insight into the balance between divergent and homogenizing population genetic processes and highlight remaining challenges in interpreting conflict between different types of analytical approaches with whole-genome sequence data. [ABBA-BABA; approximate Bayesian computation; gene flow; phylogenomics; speciation; whole-genome sequencing.]
-
Zhu, Xiaofeng (Ed.) Introgression is a common evolutionary phenomenon that results in shared genetic material across non-sister taxa. Existing statistical methods such as Patterson’s D statistic can detect introgression by measuring an excess of shared derived alleles between populations. The D statistic is effective at detecting genome-wide patterns of introgression but can give spurious inferences of introgression when applied to local regions. We propose a new statistic, D+, that leverages both shared ancestral and derived alleles to infer local introgressed regions. Incorporating both shared derived and ancestral alleles increases the number of informative sites per region, improving our ability to identify local introgression. We use a coalescent framework to derive the expected value of this statistic as a function of different demographic parameters under an instantaneous admixture model and use coalescent simulations to compute the power and precision of D+. While the power of D and D+ is comparable, D+ has better precision than D. We apply D+ to empirical data from the 1000 Genomes Project and Heliconius butterflies to infer local targets of introgression in humans and in butterflies.
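The idea of adding shared ancestral alleles to the count can be sketched as follows. The abstract does not state the formula, so the form used here, which adds the BAAA and ABAA patterns to the usual ABBA and BABA counts, is an assumption based on the description above rather than a reproduction of the authors' definition. The function name `d_plus` and the input layout are likewise hypothetical.

```python
import math
from collections import Counter


def d_plus(sites):
    """Sketch of a D+-style statistic (assumed form, not the authors' code).

    `sites` is an iterable of (p1, p2, p3, outgroup) alleles, one tuple per
    biallelic site. Alleles matching the outgroup are coded "A" (ancestral),
    all others "B" (derived).
    """
    counts = Counter()
    for p1, p2, p3, out in sites:
        pattern = "".join("A" if allele == out else "B" for allele in (p1, p2, p3))
        counts[pattern] += 1
    abba = counts["ABB"]  # P2 and P3 share the derived allele
    baba = counts["BAB"]  # P1 and P3 share the derived allele
    baaa = counts["BAA"]  # P2 and P3 share the ancestral allele
    abaa = counts["ABA"]  # P1 and P3 share the ancestral allele
    denominator = abba + baba + baaa + abaa
    if denominator == 0:
        return math.nan  # no informative sites in this region
    return ((abba - baba) + (baaa - abaa)) / denominator


# Toy region: 3 ABBA, 1 BABA, 2 BAAA, 0 ABAA sites.
region = (
    [("A", "G", "G", "A")] * 3
    + [("G", "A", "G", "A")]
    + [("G", "A", "A", "A")] * 2
)
print(d_plus(region))
```

In this toy region the two BAAA sites contribute to the statistic in addition to the four ABBA and BABA sites, which illustrates how counting shared ancestral alleles increases the number of informative sites in a small window.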
-
Abstract Rapidly evolving taxa are excellent models for understanding the mechanisms that give rise to biodiversity. However, developing an accurate historical framework for comparative analysis of such lineages remains a challenge due to ubiquitous incomplete lineage sorting (ILS) and introgression. Here, we use a whole-genome alignment, multiple locus-sampling strategies, and summary-tree and single nucleotide polymorphism-based species-tree methods to infer a species tree for eastern North American Neodiprion species, a clade of pine-feeding sawflies (Order: Hymenoptera; Family: Diprionidae). We recovered a well-supported species tree that, except for three uncertain relationships, was robust to different strategies for analyzing whole-genome data. Nevertheless, underlying gene-tree discordance was high. To understand this genealogical variation, we used multiple linear regression to model site concordance factors estimated in 50-kb windows as a function of several genomic predictor variables. We found that site concordance factors tended to be higher in regions of the genome with more parsimony-informative sites, fewer singletons, less missing data, lower GC content, more genes, lower recombination rates, and lower D-statistics (less introgression). Together, these results suggest that ILS, introgression, and genotyping error all shape the genomic landscape of gene-tree discordance in Neodiprion. More generally, our findings demonstrate how combining phylogenomic analysis with knowledge of local genomic features can reveal mechanisms that produce topological heterogeneity across genomes.
-
ABSTRACT Hybridisation is a common feature of evolutionary radiations, but its genomic consequences vary depending on when it occurs. Since reproductive isolation takes time to accumulate, hybridisation can occur at multiple points during divergence. Previous studies suggested that the taxonomic diversity in evolutionary radiations can help infer the timing of past gene flow events. Here, we assess the power of these approaches for revealing when gene flow occurred between two monkeyflower taxa (Mimulus aurantiacus) endemic to the Channel Islands of California. Coalescent simulations reveal that conventional four‐taxon tests may not be capable of fully distinguishing between recent and ancient introgression, but genome‐wide patterns of phylogenetic discordance vary predictably with different histories of hybridisation. Using whole‐genome sequencing and phylogenetic tests for introgression across the M. aurantiacus radiation, we identify signals of both ancient and recent hybridisation that occurred between the island taxa and their ancestors. In addition, we find widespread selection against introgressed ancestry, consistent with polygenic barriers to gene flow. However, we also identify localised signals across the genome that may indicate adaptive introgression. This study highlights the power and challenges of trying to disentangle complex histories of hybridisation. More broadly, our results illustrate the multiple roles that gene flow can play in evolutionary radiations: hybridisation can expose genetic incompatibilities that contribute to reproductive isolation while also likely facilitating adaptation by transferring beneficial alleles between taxa. These findings underscore the dynamic interplay between the timing of hybridisation and natural selection in shaping evolutionary trajectories within radiations.
