Allele-sharing statistics for a genetic locus measure the dissimilarity between two populations as a mean of the dissimilarity between random pairs of individuals, one from each population. Owing to within-population variation in genotype, allele-sharing dissimilarities can have the property that they have a nonzero value when computed between a population and itself. We consider the mathematical properties of allele-sharing dissimilarities in a pair of populations, treating the allele frequencies in the two populations parametrically. Examining two formulations of allele-sharing dissimilarity, we obtain the distributions of within-population and between-population dissimilarities for pairs of individuals. We then mathematically explore the scenarios in which, for certain allele-frequency distributions, the within-population dissimilarity – the mean dissimilarity between randomly chosen members of a population – can exceed the dissimilarity between two populations. Such scenarios assist in explaining observations in population-genetic data that members of a population can be empirically more genetically dissimilar from each other on average than they are from members of another population. For a population pair, however, the mathematical analysis finds that at least one of the two populations always possesses smaller within-population dissimilarity than the value of the between-population dissimilarity. We illustrate the mathematical results with an application to human population-genetic data.
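The last two claims can be illustrated with a small sketch. It assumes a simplified dissimilarity, the probability that two randomly drawn alleles differ, which is chosen here for illustration and is not necessarily either of the two formulations analyzed in the paper. The function name `mismatch` and the allele frequencies are hypothetical.

```python
def mismatch(p, q):
    """Probability that an allele drawn from frequencies p differs from one drawn from q.

    Simplified stand-in for an allele-sharing dissimilarity (an assumption,
    not the paper's definition).
    """
    return 1.0 - sum(pi * qi for pi, qi in zip(p, q))


# Hypothetical allele frequencies at a single locus (illustrative values only).
pop_a = [0.4, 0.3, 0.3]
pop_b = [0.9, 0.1, 0.0]

within_a = mismatch(pop_a, pop_a)  # dissimilarity of population A with itself
within_b = mismatch(pop_b, pop_b)  # dissimilarity of population B with itself
between = mismatch(pop_a, pop_b)   # dissimilarity between A and B

print(f"within A = {within_a:.2f}, within B = {within_b:.2f}, between = {between:.2f}")
```

Under this simplified measure, population A is more dissimilar from itself than from population B, while population B is not. The smaller of the two within-population values cannot exceed the between-population value here, because the sum of squared frequency differences between the two populations is nonnegative.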
In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as “rare,” with nonzero frequency less than or equal to a specified threshold, “common,” with a frequency above the threshold, or entirely unobserved in a population. When sample sizes differ across populations, however, especially if the threshold separating “rare” and “common” corresponds to a small number of observed copies of an allelic type, discreteness effects can lead a sample from one population to possess substantially more rare allelic types than a sample from another population, even if the two populations have extremely similar underlying allele-frequency distributions across loci. We introduce a rarefaction-based sample-size correction for use in comparing rare and common variation across multiple populations whose sample sizes potentially differ. We use our approach to examine rare and common variation in worldwide human populations, finding that the sample-size correction introduces subtle differences relative to analyses that use the full available sample sizes. We introduce several ways in which the rarefaction approach can be applied: we explore the dependence of allele classifications on subsample sizes, we permit more than two classes of allelic types of nonzero frequency, and we analyze rare and common variation in sliding windows along the genome. The results can assist in clarifying similarities and differences in allele-frequency patterns across populations.
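A rarefaction-style correction of this kind can be sketched as follows. The sketch assumes that a subsample of fixed size is drawn without replacement from the full sample, so that the number of copies of an allelic type in the subsample is hypergeometric, and that "rare" is defined by a maximum copy count in the subsample. The function name `class_probabilities` and its parameters are hypothetical and are not taken from the paper.

```python
from math import comb


def class_probabilities(copies, sample_size, subsample_size, rare_max):
    """Return (P(unobserved), P(rare), P(common)) for one allelic type.

    copies         -- copies of the allelic type in the full sample
    sample_size    -- number of allele copies in the full sample
    subsample_size -- number of allele copies drawn without replacement
    rare_max       -- largest subsample count still classified as "rare"
                      (a count-based threshold, assumed for illustration)
    """
    total = comb(sample_size, subsample_size)

    def prob(k):
        # Hypergeometric probability of exactly k copies in the subsample.
        return comb(copies, k) * comb(sample_size - copies, subsample_size - k) / total

    probs = [prob(k) for k in range(min(copies, subsample_size) + 1)]
    unobserved = probs[0]
    rare = sum(probs[1:rare_max + 1])
    common = sum(probs[rare_max + 1:])
    return unobserved, rare, common


# An allelic type seen twice among 10 sampled copies, rarefied to 4 copies.
unobserved, rare, common = class_probabilities(
    copies=2, sample_size=10, subsample_size=4, rare_max=1
)
```

Summing these probabilities over allelic types and loci would give expected counts of unobserved, rare, and common types at a subsample size shared by all populations, which is what makes the counts comparable across samples of different sizes.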
- Award ID(s): 2116322
- PAR ID: 10416153
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: GENETICS
- Volume: 224
- Issue: 2
- ISSN: 1943-2631
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract The way in which genetic variation is distributed within and among populations is a key determinant of the evolutionary features of a species. However, most comprehensive studies of these features have been restricted to studies of subdivision in settings known to have been driven by local adaptation, leaving our understanding of the natural dispersion of allelic variation less than ideal. Here, we present a geographic population-genomic analysis of 10 populations of the freshwater microcrustacean Daphnia pulex, an emerging model system in evolutionary genomics. These populations exhibit a pattern of moderate isolation-by-distance, with an average migration rate of 0.6 individuals per generation, and average effective population sizes of ∼650,000 individuals. Most populations contain numerous private alleles, and genomic scans highlight the presence of islands of excessively high population subdivision for more common alleles. A large fraction of such islands of population divergence likely reflect historical neutral changes, including rare stochastic migration and hybridization events. The data do point to local adaptive divergence, although the precise nature of the relevant variation is diffuse and cannot be associated with particular loci, despite the very large sample sizes involved in this study. In contrast, an analysis of between-species divergence highlights positive selection operating on a large set of genes with functions nearly nonoverlapping with those involved in local adaptation, in particular ribosome structure, mitochondrial bioenergetics, light reception and response, detoxification, and gene regulation. These results set the stage for using D. pulex as a model for understanding the relationship between molecular and cellular evolution in the context of natural environments.
-
Abstract Microsatellites are common in genomes of most eukaryotic species. Due to their high mutability, an adaptive role for microsatellites has been considered. However, little is known concerning the contribution of microsatellites towards phenotypic variation. We used populations of the common sunflower (Helianthus annuus) at two latitudes to quantify the effect of microsatellite allele length on phenotype at the level of gene expression. We conducted a common garden experiment with seed collected from sunflower populations in Kansas and Oklahoma followed by an RNA‐Seq experiment on 95 individuals. The effect of microsatellite allele length on gene expression was assessed across 3,325 microsatellites that could be consistently scored. Our study revealed 479 microsatellites at which allele length significantly correlates with gene expression (eSTRs). When irregular allele sizes not conforming to the motif length were removed, the number of eSTRs rose to 2,379. The percentage of variation in gene expression explained by eSTRs ranged from 1%–86% when controlling for population and allele‐by‐population interaction effects at the 479 eSTRs. Of these eSTRs, 70.4% are in untranslated regions (UTRs). A gene ontology (GO) analysis revealed that eSTRs are significantly enriched for GO terms associated with cis‐ and trans‐regulatory processes. Our findings suggest that a substantial number of transcribed microsatellites can influence gene expression.
-
Evolution by natural selection may be effective enough to allow for recurrent, rapid adaptation to distinct niche environments within a well-mixed population. For this to occur, selection must act on standing genetic variation such that mortality, i.e., genetic load, is minimized while polymorphism is maintained. Selection on multiple, redundant loci of small effect provides a potentially inexpensive solution. Yet, demonstrating adaptation via redundant, polygenic selection in the wild remains extremely challenging because low per-locus effect sizes and high genetic redundancy severely reduce statistical power. One approach to facilitate identification of loci underlying polygenic selection is to harness natural replicate populations experiencing similar selection pressures that harbor high within-, yet negligible among-population genetic variation. Such populations can be found among the teleost Fundulus heteroclitus. F. heteroclitus inhabits salt marsh estuaries that are characterized by high environmental heterogeneity, e.g., tidal ponds, creeks, and coastal basins. Here, we sample four of these heterogeneous niches (one coastal basin and three replicate tidal ponds) at two time points from among a single, panmictic F. heteroclitus population. We identify 10,861 single nucleotide polymorphisms using a genotyping-by-sequencing approach and quantify temporal allele frequency change within, as well as spatial divergence among, subpopulations residing in these niches. We find a significantly elevated number of concordant allele frequency changes among all subpopulations, suggesting ecosystem-wide adaptation to a common selection pressure. Remarkably, we also find an unexpected number of temporal allele frequency changes that generate fine-scale divergence among subpopulations, suggestive of local adaptation to distinct niche environments. Both patterns are characterized by a lack of large-effect loci yet an elevated total number of significant loci. Adaptation via redundant, polygenic selection offers a likely explanation for these patterns as well as a potential mechanism for polymorphism maintenance in the F. heteroclitus system.
-
Abstract Identifying the genetic architecture of complex traits is important to many geneticists, including those interested in human disease, plant and animal breeding, and evolutionary genetics. Advances in sequencing technology and statistical methods for genome-wide association studies have allowed for the identification of more variants with smaller effect sizes; however, many of these identified polymorphisms fail to be replicated in subsequent studies. In addition to sampling variation, this failure to replicate reflects the complexities introduced by factors including environmental variation, genetic background, and differences in allele frequencies among populations. Using Drosophila melanogaster wing shape, we ask if we can replicate allelic effects of polymorphisms first identified in a genome-wide association study in three genes: dachsous, extra-macrochaete, and neuralized, using artificial selection in the lab, and bulk segregant mapping in natural populations. We demonstrate that multivariate wing shape changes associated with these genes are aligned with major axes of phenotypic and genetic variation in natural populations. Following seven generations of artificial selection along the dachsous shape change vector, we observe genetic differentiation of variants in dachsous and genomic regions containing other genes in the hippo signaling pathway. This suggests a shared direction of effects within a developmental network. We also performed artificial selection with the extra-macrochaete shape change vector, which is not a part of the hippo signaling network, but showed a largely shared direction of effects. The response to selection along the emc vector was similar to that of dachsous, suggesting that the available genetic diversity of a population, summarized by the genetic (co)variance matrix (G), influenced alleles captured by selection. Despite the success with artificial selection, bulk segregant analysis using natural populations did not detect these same variants, likely due to the contribution of environmental variation and low minor allele frequencies, coupled with small effect sizes of the contributing variants.