

Title: Mathematical constraints on F_ST: multiallelic markers in arbitrarily many populations
Interpretations of values of the F_ST measure of genetic differentiation rely on an understanding of its mathematical constraints. Previously, it has been shown that F_ST values computed from a biallelic locus in a set of multiple populations and F_ST values computed from a multiallelic locus in a pair of populations are mathematically constrained as a function of the frequency of the allele that is most frequent across populations. We generalize from these cases to report here the mathematical constraint on F_ST given the frequency M of the most frequent allele at a multiallelic locus in a set of multiple populations. Using coalescent simulations of an island model of migration with an infinitely-many-alleles mutation model, we argue that the joint distribution of F_ST and M helps in disentangling the separate influences of mutation and migration on F_ST. Finally, we show that our results explain a puzzling pattern of microsatellite differentiation: the lower F_ST in an interspecific comparison between humans and chimpanzees than in the comparison of chimpanzee populations. We discuss the implications of our results for the use of F_ST. This article is part of the theme issue ‘Celebrating 50 years since Lewontin's apportionment of human diversity’.
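The two quantities the abstract relates, F_ST and the frequency M of the most frequent allele, can be illustrated with a minimal sketch. It is not the authors' code. It assumes the common heterozygosity-based definition F_ST = (H_T − H_S)/H_T with equally weighted populations, and it takes M to be the largest across-population mean allele frequency; the function name `fst_and_m` and its input layout are hypothetical.

```python
import numpy as np


def fst_and_m(freqs):
    """Return (F_ST, M) for one multiallelic locus.

    freqs: K x I array-like; row k holds the allele frequencies in
    population k, so every row must sum to 1.
    Assumption: F_ST = (H_T - H_S) / H_T with equal population weights.
    """
    p = np.asarray(freqs, dtype=float)
    if p.ndim != 2 or not np.allclose(p.sum(axis=1), 1.0):
        raise ValueError("freqs must be a 2-D array whose rows sum to 1")

    mean_p = p.mean(axis=0)                      # across-population mean frequency of each allele
    h_s = 1.0 - np.mean(np.sum(p ** 2, axis=1))  # mean within-population heterozygosity
    h_t = 1.0 - np.sum(mean_p ** 2)              # heterozygosity of the pooled population
    if h_t == 0.0:
        raise ValueError("locus is monomorphic; F_ST is undefined")

    fst = (h_t - h_s) / h_t
    m = float(mean_p.max())                      # frequency of the most frequent allele
    return float(fst), m


# Three populations, each fixed for a different allele.
fst, m = fst_and_m([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
print(fst, m)
```

In this toy input every population is fixed for its own allele, so H_S is 0 and F_ST reaches 1 while M is 1/3. With K populations F_ST can equal 1 only when each population is fixed for a single allele, which is why the range of F_ST depends on M.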
Award ID(s):
2116322
NSF-PAR ID:
10329249
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Philosophical Transactions of the Royal Society B: Biological Sciences
Volume:
377
Issue:
1852
ISSN:
0962-8436
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Discovering local adaptation, its genetic underpinnings, and environmental drivers is important for conserving forest species. Ecological genomic approaches coupled with next‐generation sequencing are useful means to detect local adaptation and uncover its underlying genetic basis in nonmodel species. We report results from a study on flowering dogwood trees (Cornus florida L.) using genotyping by sequencing (GBS). This species is ecologically important to eastern US forests but is severely threatened by fungal diseases. We analyzed subpopulations in divergent ecological habitats within North Carolina to uncover loci under local selection and associated with environmental–functional traits or disease infection. At this scale, we tested the effect of incorporating additional sequencing before scaling for a broader examination of the entire range. To test for biases of GBS, we sequenced two similarly sampled libraries independently from six populations of three ecological habitats. We obtained environmental–functional traits for each subpopulation to identify associations with genotypes via latent factor mixed modeling (LFMM) and gradient forests analysis. To test whether heterogeneity of abiotic pressures resulted in genetic differentiation indicative of local adaptation, we evaluated Fst per locus while accounting for genetic differentiation between coastal subpopulations and Piedmont‐Mountain subpopulations. Of the 54 candidate loci with sufficient evidence of being under selection among both libraries, 28–39 were Arlequin–BayeScan Fst outliers. For LFMM, 45 candidates were associated with climate (of 54), 30 were associated with soil properties, and four were associated with plant health. Reanalysis of combined libraries showed that 42 candidate loci still showed evidence of being under selection.
    We conclude environment‐driven selection on specific loci has resulted in local adaptation in response to potassium deficiencies, temperature, precipitation, and (to a marginal extent) disease. High allele turnover along ecological gradients further supports the adaptive significance of loci speculated to be under selection.

  2. Abstract

    Sexual selection must affect the genome for it to have an evolutionary impact, yet signatures of selection remain elusive. Here we use an individual‐based model to investigate the utility of genome‐wide selection components analysis, which compares allele frequencies of individuals at different life history stages within a single population to detect selection without requiring a priori knowledge of traits under selection. We modeled a diploid, sexually reproducing population and introduced strong mate choice on a quantitative trait to simulate sexual selection. Genome‐wide allele frequencies in adults and offspring were compared using weighted FST values. The average number of outlier peaks (i.e., those with significantly large FST values) with a quantitative trait locus in close proximity (“real” peaks) represented correct diagnoses of loci under selection, whereas peaks above the FST significance threshold without a quantitative trait locus reflected spurious peaks. We found that, even with moderate sample sizes, signatures of strong sexual selection were detectable, but larger sample sizes improved detection rates. The model was better able to detect selection with more neutral markers, and when quantitative trait loci and neutral markers were distributed across multiple chromosomes. Although environmental variation decreased detection rates, the identification of real peaks nevertheless remained feasible. We also found that detection rates can be improved by sampling multiple populations experiencing similar selection regimes. In short, genome‐wide selection components analysis is a challenging but feasible approach for the identification of regions of the genome under selection.

  3. Abstract

    Determining how genetic diversity is structured between populations that span the divergence continuum from populations to biological species is key to understanding the generation and maintenance of biodiversity. We investigated genetic divergence and gene flow in eight lineages of birds with a trans‐Beringian distribution, where Asian and North American populations have likely been split and reunited through multiple Pleistocene glacial cycles. Our study transects the speciation process, including eight pairwise comparisons in three orders (ducks, shorebirds and passerines) at population, subspecies and species levels. Using ultraconserved elements (UCEs), we found that these lineages represent conditions from slightly differentiated populations to full biological species. Although allopatric speciation is considered the predominant mode of divergence in birds, all of our best divergence models included gene flow, supporting speciation with gene flow as the predominant mode in Beringia. In our eight lineages, three were best described by a split‐migration model (divergence with gene flow), three best fit a secondary contact scenario (isolation followed by gene flow), and two showed support for both models. The lineages were not evenly distributed across a divergence space defined by gene flow (M) and differentiation (FST), instead forming two discontinuous groups: one with relatively shallow divergence, no fixed single nucleotide polymorphisms (SNPs), and high rates of gene flow between populations; and the second with relatively deeply divergent lineages, multiple fixed SNPs, and low gene flow. Our results highlight the important role that gene flow plays in avian divergence in Beringia.

  4. Abstract

    Recurrent mutation produces multiple copies of the same allele which may be co-segregating in a population. Yet, most analyses of allele-frequency or site-frequency spectra assume that all observed copies of an allele trace back to a single mutation. We develop a sampling theory for the number of latent mutations in the ancestry of a rare variant, specifically a variant observed in relatively small count in a large sample. Our results follow from the statistical independence of low-count mutations, which we show to hold for the standard neutral coalescent or diffusion model of population genetics as well as for more general coalescent trees. For populations of constant size, these counts are distributed like the number of alleles in the Ewens sampling formula. We develop a Poisson sampling model for populations of varying size and illustrate it using new results for site-frequency spectra in an exponentially growing population. We apply our model to a large data set of human SNPs and use it to explain dramatic differences in site-frequency spectra across the range of mutation rates in the human genome.

  5. Abstract

    Aim

    Present Amazonian diversity patterns can result from many different mechanisms and, consequently, the factors contributing to divergence across regions and/or taxa may differ. Nevertheless, the river‐barrier hypothesis is still widely invoked as a causal process in divergence of Amazonian species. Here we use model‐based phylogeographic analyses to test the extent to which major Amazonian rivers act similarly as barriers across time and space in two broadly distributed Amazonian taxa.

    Location

    Amazon rain forest.

    Taxon

    The lizard Gonatodes humeralis (Sphaerodactylidae) and the tree frog Dendropsophus leucophyllatus (Hylidae).

    Methods

    We obtained RADseq data for samples distributed across main river barriers, representing main Areas of Endemism previously proposed for the region. We conduct model‐based phylogeographic and genetic differentiation analyses across each population pair.

    Results

    Measures of genetic differentiation (based on FST calculated from genomic data) show that all rivers are associated with significant genetic differentiation. Parameters estimated under investigated divergence models showed that divergence times for populations separated by each of the 11 bordering rivers were all fairly recent. The degree of differentiation consistently varied between taxa and among rivers, which is not an artifact of any corresponding difference in the genetic diversities of the respective taxa, or to amounts of migration based on analyses of the site‐frequency spectrum.

    Main conclusions

    Taken together, our results support a dispersal (rather than vicariance) history, without strong evidence of congruence between these species and rivers. However, once a species crossed a river, populations separated by each and every river have remained isolated—in this sense, rivers act similarly as barriers to any further gene flow. This result suggests differing degrees of persistence and gives rise to the seeming contradiction that the divergence process indeed varies across time, space and species, even though major Amazonian rivers have acted as secondary barriers to gene flow in the focal taxa.
