Title: Spatial structure alters the site frequency spectrum produced by hitchhiking
Abstract

The reduction of genetic diversity due to genetic hitchhiking is widely used to find past selective sweeps from sequencing data, but very little is known about how spatial structure affects hitchhiking. We use mathematical modeling and simulations to find the unfolded site frequency spectrum left by hitchhiking in the genomic region of a sweep in a population occupying a 1D range. For such populations, sweeps spread as Fisher waves, rather than logistically. We find that this leaves a characteristic 3-part site frequency spectrum at loci very close to the swept locus. Very low frequencies are dominated by recent mutations that occurred after the sweep and are unaffected by hitchhiking. At moderately low frequencies, there is a transition zone primarily composed of alleles that briefly “surfed” on the wave of the sweep before falling out of the wavefront, leaving a spectrum close to that expected in well-mixed populations. However, for moderate-to-high frequencies, there is a distinctive scaling regime of the site frequency spectrum produced by alleles that drifted to fixation in the wavefront and then were carried throughout the population. For loci slightly farther away from the swept locus on the genome, recombination is much more effective at restoring diversity in 1D populations than it is in well-mixed ones. We find that these signatures of space can be strong even in apparently well-mixed populations with negligible spatial genetic differentiation, suggesting that spatial structure may frequently distort the signatures of hitchhiking in natural populations.
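As a point of reference for the quantity the abstract describes, the following is a minimal sketch of how an unfolded site frequency spectrum can be tabulated from sampled haplotypes. It is not the authors' simulation or analysis code; the function name `unfolded_sfs` and the toy input are hypothetical, and the sketch assumes each site is coded 0 (ancestral) or 1 (derived), with the ancestral state known.

```python
def unfolded_sfs(genotypes):
    """Tabulate the unfolded SFS from a list of sites.

    `genotypes` is a non-empty list of sites; each site is a list of
    0/1 values, one per sampled haplotype (1 = derived allele).
    Entry k of the result counts the sites where the derived allele
    appears in exactly k of the n haplotypes (k = 0 .. n).
    """
    n = len(genotypes[0])          # number of sampled haplotypes
    sfs = [0] * (n + 1)
    for site in genotypes:
        sfs[sum(site)] += 1        # derived-allele count at this site
    return sfs


# Hypothetical toy sample: 5 sites, 4 haplotypes.
toy = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
sfs = unfolded_sfs(toy)
print(sfs)  # [0, 2, 1, 1, 1]
```

In this layout, the low-frequency classes sit at small k and the moderate-to-high frequency classes at large k, which are the regimes the abstract distinguishes.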

 
Award ID(s):
2146260 1914916
NSF-PAR ID:
10378543
Author(s) / Creator(s):
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Genetics
Volume:
222
Issue:
3
ISSN:
1943-2631
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Kim, Yuseob (Ed.)
    Abstract Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make its inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data.
  2. Positive selection causes beneficial alleles to rise to high frequency, resulting in a selective sweep of the diversity surrounding the selected sites. Accordingly, the signature of a selective sweep in an ancestral population may still remain in its descendants. Identifying signatures of selection in the ancestor that are shared among its descendants is important to contextualize the timing of a sweep, but few methods exist for this purpose. We introduce the statistic SS-H12, which can identify genomic regions under shared positive selection across populations and is based on the theory of the expected haplotype homozygosity statistic H12, which detects recent hard and soft sweeps from the presence of high-frequency haplotypes. SS-H12 is distinct from comparable statistics because it requires a minimum of only two populations, and properly identifies and differentiates between independent convergent sweeps and true ancestral sweeps, with high power and robustness to a variety of demographic models. Furthermore, we can apply SS-H12 in conjunction with the ratio of statistics we term Embedded Image and Embedded Image to further classify identified shared sweeps as hard or soft. Finally, we identified both previously reported and novel shared sweep candidates from human whole-genome sequences. Previously reported candidates include the well-characterized ancestral sweeps at LCT and SLC24A5 in Indo-Europeans, as well as GPHN worldwide. Novel candidates include an ancestral sweep at RGS18 in sub-Saharan Africans involved in regulating the platelet response and implicated in sudden cardiac death, and a convergent sweep at C2CD5 between European and East Asian populations that may explain their different insulin responses. 
  3. Abstract

    Rapid evolution of advantageous traits following abrupt environmental change can help populations recover from demographic decline. However, for many introduced diseases affecting longer‐lived, slower reproducing hosts, mortality is likely to outpace the acquisition of adaptive de novo mutations. Adaptive alleles must therefore be selected from standing genetic variation, a process that leaves few detectable genomic signatures. Here, we present whole genome evidence for selection in bat populations that are recovering from white‐nose syndrome (WNS). We collected samples both during and after a WNS‐induced mass mortality event in two little brown bat populations that are beginning to show signs of recovery and found signatures of soft sweeps from standing genetic variation at multiple loci throughout the genome. We identified one locus putatively under selection in a gene associated with the immune system. Multiple loci putatively under selection were located within genes previously linked to host response to WNS as well as to changes in metabolism during hibernation. Results from two additional populations suggested that loci under selection may differ somewhat among populations. Through these findings, we suggest that WNS‐induced selection may contribute to genetic resistance in this slowly reproducing species threatened with extinction.

     
  4. Abstract

    The paleback darter, Etheostoma pallididorsum, is considered imperilled and has recently been petitioned for listing under the Endangered Species Act. Previous allozyme‐based studies found evidence of a small effective population size, warranting conservation concern. The objective of this study was to assess the population dynamics and the phylogeographical history of the paleback darter, using a multilocus microsatellite approach and mitochondrial DNA.

    The predictions of this study were that: paleback darter populations will exhibit low genetic diversity and minimal gene flow; population structure will correspond to the river systems from which the samples are derived; reservoir dams impounding the reaches between the Caddo and Ouachita rivers would serve as effective barriers to gene flow; and the Caddo and Ouachita rivers are reciprocally monophyletic.

    Microsatellite DNA loci revealed significant structure among sampled localities (global Fst = 0.17, P < 0.001), with evidence of two distinct populations representing the Caddo and Ouachita rivers. However, Bayesian phylogeographical analyses resulted in three distinct clades: Caddo River, Ouachita River, and Mazarn Creek. Divergence from the most recent ancestor shared among the river drainages was estimated at 60 Kya. Population genetic diversity was relatively low (He = 0.65; mean alleles per locus, A = 6.26), but was comparable with the population genetic diversity found in the close relatives slackwater darter, Etheostoma boschungi (He = 0.65; A = 6.74), and Tuscumbia darter, Etheostoma tuscumbia (He = 0.57; A = 5.53).

    These results have conservation implications for paleback darter populations and can be informative for other headwater specialist species. Like other headwater species with population structuring and relatively low genetic diversity, the persistence of paleback darter populations is likely to be tied to the persistence and connectivity of local breeding and non‐breeding habitat. These results do not raise conservation concern for a population decline; however, the restricted distribution and endemic status of the species still render paleback darter populations vulnerable to extirpation or extinction.

     
  5. Evolution by natural selection may be effective enough to allow for recurrent, rapid adaptation to distinct niche environments within a well-mixed population. For this to occur, selection must act on standing genetic variation such that mortality, i.e., genetic load, is minimized while polymorphism is maintained. Selection on multiple, redundant loci of small effect provides a potentially inexpensive solution. Yet, demonstrating adaptation via redundant, polygenic selection in the wild remains extremely challenging because low per-locus effect sizes and high genetic redundancy severely reduce statistical power. One approach to facilitate identification of loci underlying polygenic selection is to harness natural replicate populations experiencing similar selection pressures that harbor high within-, yet negligible among-population genetic variation. Such populations can be found among the teleost Fundulus heteroclitus. F. heteroclitus inhabits salt marsh estuaries that are characterized by high environmental heterogeneity (e.g., tidal ponds, creeks, and coastal basins). Here, we sample four of these heterogeneous niches (one coastal basin and three replicate tidal ponds) at two time points from among a single, panmictic F. heteroclitus population. We identify 10,861 single nucleotide polymorphisms using a genotyping-by-sequencing approach and quantify temporal allele frequency change within, as well as spatial divergence among, subpopulations residing in these niches. We find a significantly elevated number of concordant allele frequency changes among all subpopulations, suggesting ecosystem-wide adaptation to a common selection pressure. Remarkably, we also find an unexpected number of temporal allele frequency changes that generate fine-scale divergence among subpopulations, suggestive of local adaptation to distinct niche environments. Both patterns are characterized by a lack of large-effect loci yet an elevated total number of significant loci. Adaptation via redundant, polygenic selection offers a likely explanation for these patterns as well as a potential mechanism for polymorphism maintenance in the F. heteroclitus system.