Title: Selection leads to false inferences of introgression using popular methods
Abstract Detecting introgression between closely related populations or species is a fundamental objective in evolutionary biology. Existing methods for detecting migration and inferring migration rates from population genetic data often assume a neutral model of evolution. Growing evidence of the pervasive impact of selection on large portions of the genome across diverse taxa suggests that this assumption is unrealistic in most empirical systems. Further, ignoring selection has previously been shown to negatively impact demographic inferences (e.g. of population size histories). However, the impacts of biologically realistic selection on inferences of migration remain poorly explored. Here, we simulate data under models of background selection, selective sweeps, balancing selection, and adaptive introgression. We show that ignoring selection sometimes leads to false inferences of migration in popularly used methods that rely on the site frequency spectrum. Specifically, balancing selection and some models of background selection result in the rejection of isolation-only models in favor of isolation-with-migration models and lead to elevated estimates of migration rates. BPP, a method that analyzes sequence data directly, showed false positives for all conditions at recent divergence times, but balancing selection also led to false positives at medium-divergence times. Our results suggest that such methods may be unreliable in some empirical systems, such that new methods that are robust to selection need to be developed.
Award ID(s): 2146866
PAR ID: 10581498
Author(s) / Creator(s):
Editor(s): Ralph, P
Publisher / Repository: Oxford University Press
Date Published:
Journal Name: GENETICS
Volume: 227
Issue: 4
ISSN: 1943-2631
Page Range / eLocation ID: iyae089
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract When organisms experience secondary contact after allopatric divergence, genomic regions can introgress differentially depending on their relationships with adaptation, reproductive isolation, recombination, and drift. Analyses of genome‐wide patterns of divergence and introgression could provide insight into the outcomes of hybridization and the potential relationship between allopatric divergence and reproductive isolation. Here, we generate population genetic data (26,262 SNPs; 353 individuals) using a reduced‐representation sequencing approach to quantify patterns of ancestry, differentiation, and introgression between a pair of ecologically distinct mammals—the desert woodrat (N. lepida) and Bryant's woodrat (N. bryanti)—that hybridize at a sharp ecotone in southern California. Individual ancestry estimates confirmed that hybrids were rare in this bimodal hybrid zone, and entirely consisted of a few F1 individuals and a broad range of multigenerational backcrosses. Genomic cline analyses indicated more than half of loci had elevated introgression from one genomic background into the other. However, introgression was not associated with relative or absolute measures of divergence, and loci with extreme values for both were not typically found near detoxification enzymes previously implicated in dietary specialization for woodrats. The decoupling of differentiation and introgression suggests that processes other than adaptation, such as drift, may underlie the extreme clines at this contact zone.
  2. Abstract Can knowledge about genome architecture inform biogeographic and phylogenetic inference? Selection, drift, recombination, and gene flow interact to produce a genomic landscape of divergence wherein patterns of differentiation and genealogy vary nonrandomly across the genomes of diverging populations. For instance, genealogical patterns that arise due to gene flow should be more likely to occur on smaller chromosomes, which experience high recombination, whereas those tracking histories of geographic isolation (reduced gene flow caused by a barrier) and divergence should be more likely to occur on larger and sex chromosomes. In Amazonia, populations of many bird species diverge and introgress across rivers, resulting in reticulated genomic signals. Herein, we used reduced representation genomic data to disentangle the evolutionary history of 4 populations of an Amazonian antbird, Thamnophilus aethiops, whose biogeographic history was associated with the dynamic evolution of the Madeira River Basin. Specifically, we evaluate whether a large river capture event ca. 200 Ka gave rise to reticulated genealogies in the genome by making spatially explicit predictions about isolation and gene flow based on knowledge about genomic processes. We first estimated chromosome-level phylogenies and recovered 2 primary topologies across the genome. The first topology (T1) was most consistent with predictions about population divergence and was recovered for the Z-chromosome. The second (T2) was consistent with predictions about gene flow upon secondary contact. To evaluate support for these topologies, we trained a convolutional neural network to classify our data into alternative diversification models and estimate demographic parameters. The best-fit model was concordant with T1 and included gene flow between non-sister taxa. Finally, we modeled levels of divergence and introgression as functions of chromosome length and found that smaller chromosomes experienced higher gene flow. Given that (1) gene trees supporting T2 were more likely to occur on smaller chromosomes and (2) we found lower levels of introgression on larger chromosomes (and especially the Z-chromosome), we argue that T1 represents the history of population divergence across rivers and T2 the history of secondary contact due to barrier loss. Our results suggest that a significant portion of genomic heterogeneity arises due to extrinsic biogeographic processes such as river capture interacting with intrinsic processes associated with genome architecture. Future phylogeographic studies would benefit from accounting for genomic processes, as different parts of the genome reveal contrasting, albeit complementary histories, all of which are relevant for disentangling the intricate geogenomic mechanisms of biotic diversification. [Amazonia; biogeography; demographic modeling; gene flow; gene tree; genome architecture; geogenomics; introgression; linked selection; neural network; phylogenomic; phylogeography; reproductive isolation; speciation; species tree.]
  3. ABSTRACT Coalescent modelling of hybrid zones can provide novel insights into the historical demography of populations, including divergence times, population sizes, introgression proportions, migration rates and the timing of hybrid zone formation. We used coalescent analysis to determine whether the hybrid zone between phylogeographic lineages of the Plateau Fence Lizard (Sceloporus tristichus) in Arizona formed recently due to human‐induced landscape changes, or if it originated during Pleistocene climatic shifts. Given the presence of mitochondrial DNA from another species in the hybrid zone (Southwestern Fence Lizard, S. cowlesi), we tested for the presence of S. cowlesi nuclear DNA in the hybrid zone as well as reassessed the species boundary between S. tristichus and S. cowlesi. No evidence of S. cowlesi nuclear DNA is found in the hybrid zone, and the paraphyly of both species raises concerns about their taxonomic validity. Introgression analysis placed the divergence time between the parental hybrid zone populations at approximately 140 kya and their secondary contact and hybridization at approximately 11 kya at the end of the Pleistocene. Introgression proportions estimated for hybrid populations are correlated with their geographic distance from parental populations. The multispecies coalescent with migration provided significant support for unidirectional migration moving from south to north, which is consistent with spatial cline analyses that suggest a slow but steady northward shift of the centre of the hybrid zone over the last two decades. When analysing hybrid populations sampled along a linear transect, coalescent methods can provide novel insights into hybrid zone dynamics.
  4. Abstract Background: Genome-wide association studies (GWAS) seek to identify single nucleotide polymorphisms (SNPs) that cause observed phenotypes. However, with highly correlated SNPs, correlated observations, and the number of SNPs being two orders of magnitude larger than the number of observations, GWAS procedures often suffer from high false positive rates. Results: We propose BGWAS, a novel Bayesian variable selection method based on nonlocal priors for linear mixed models specifically tailored for genome-wide association studies. Our proposed method BGWAS uses a novel nonlocal prior for linear mixed models (LMMs). BGWAS has two steps: screening and model selection. The screening step scans through all the SNPs fitting one LMM for each SNP and then uses Bayesian false discovery control to select a set of candidate SNPs. After that, a model selection step searches through the space of LMMs that may have any number of SNPs from the candidate set. A simulation study shows that, when compared to popular GWAS procedures, BGWAS greatly reduces false positives while maintaining the same ability to detect true positive SNPs. We show the utility and flexibility of BGWAS with two case studies: a case study on salt stress in plants, and a case study on alcohol use disorder. Conclusions: BGWAS maintains and in some cases increases the recall of true SNPs while drastically lowering the number of false positives compared to popular SMA procedures.
  5. Abstract The integration of ecological niche modelling into phylogeographic analyses has allowed for the identification and testing of potential refugia under a hypothesis‐based framework, where the expected patterns of higher genetic diversity in refugial populations and evidence of range expansion of nonrefugial populations are corroborated with empirical data. In this study, we focus on a montane‐restricted cryophilic harvestman, Sclerobunus robustus, distributed throughout the heterogeneous Southern Rocky Mountains and Intermontane Plateau of southwestern North America. We identified hypothetical refugia using ecological niche models (ENMs) across three time periods, corroborated these refugia with population genetic methods using double‐digest RAD‐seq data and conducted population‐level phylogenetic and divergence dating analyses. ENMs identify two large temporally persistent regions in the mid‐latitude highlands. Genetic patterns support these two hypothesized refugia with higher genetic diversity within refugial populations and evidence for range expansion in populations found outside hypothesized refugia. Phylogenetic analyses identify five to six genetically divergent, geographically cohesive clades of S. robustus. Divergence dating analyses suggest that these separate refugia date to the Pliocene and that divergence between clades pre‐dates the late Pleistocene glacial cycles, while diversification within clades was likely driven by these cycles. Population genetic analyses reveal effects of both isolation by distance (IBD) and isolation by environment (IBE), with IBD more important in the continuous mountainous portion of the distribution, while IBE was stronger in the populations inhabiting the isolated sky islands of the south. Using model‐based coalescent approaches, we find support for postdivergence migration between clades from separate refugia.