This content will become publicly available on December 1, 2025

Title: Leveraging ancient DNA to uncover signals of natural selection in Europe lost due to admixture or drift
Abstract Large ancient DNA (aDNA) studies offer the chance to examine genomic changes over time, providing direct insights into human evolution. While recent studies have used time-stratified aDNA for selection scans, most focus on single-locus methods. We conducted a multi-locus genotype scan on 708 samples spanning 7000 years of European history. We show that the G12 statistic, originally designed for unphased diploid data, can effectively detect selection in aDNA processed to create ‘pseudo-haplotypes’. In simulations and at known positive control loci (e.g., lactase persistence), G12 outperforms the allele frequency-based selection statistic, SweepFinder2, previously used on aDNA. Applying our approach, we identified 14 candidate regions of selection across four time periods, with half the signals detectable only in the earliest period. Our findings suggest that selective events in European prehistory, including from the onset of animal domestication, have been obscured by neutral processes like genetic drift and demographic shifts such as admixture.
Award ID(s):
2240098
PAR ID:
10563726
Publisher / Repository:
Nature Communications
Journal Name:
Nature Communications
Volume:
15
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Kim, Yuseob (Ed.)
    Abstract Natural selection leaves a spatial pattern along the genome, with a haplotype distribution distortion near the selected locus that fades with distance. Evaluating the spatial signal of a population-genetic summary statistic across the genome allows for patterns of natural selection to be distinguished from neutrality. Considering the genomic spatial distribution of multiple summary statistics is expected to aid in uncovering subtle signatures of selection. In recent years, numerous methods have been devised that consider genomic spatial distributions across summary statistics, utilizing both classical machine learning and deep learning architectures. However, better predictions may be attainable by improving the way in which features are extracted from these summary statistics. We apply wavelet transform, multitaper spectral analysis, and S-transform to summary statistic arrays to achieve this goal. Each analysis method converts one-dimensional summary statistic arrays to two-dimensional images of spectral analysis, allowing simultaneous temporal and spectral assessment. We feed these images into convolutional neural networks and consider combining models using ensemble stacking. Our modeling framework achieves high accuracy and power across a diverse set of evolutionary settings, including population size changes and test sets of varying sweep strength, softness, and timing. A scan of central European whole-genome sequences recapitulated well-established sweep candidates and predicted novel cancer-associated genes as sweeps with high support. Given that this modeling framework is also robust to missing genomic segments, we believe that it will represent a welcome addition to the population-genomic toolkit for learning about adaptive processes from genomic data. 
  2. Abstract Much research on the evolution of altruism via kin selection, group selection, and reciprocity focuses on the role of a single locus or quantitative trait. Very few studies have explored how linked selection, or selection at loci neighboring an altruism locus, impacts the evolution of altruism. While linked selection can decrease the efficacy of selection at neighboring loci, it might have other effects including promoting selection for altruism by increasing relatedness in regions of low recombination. Here, we used population genetic simulations to study how negative selection at linked loci, or background selection, affects the evolution of altruism. When altruism occurs between full siblings, we found that background selection interfered with selection on the altruistic allele, increasing its fixation probability when the altruistic allele was disfavored and reducing its fixation when the allele was favored. In other words, background selection has the same effect on altruistic genes in family‐structured populations as it does on other, nonsocial, genes. This contrasts with prior research showing that linked selective sweeps can favor the evolution of cooperation, and we discuss possibilities for resolving these contrasting results. 
  3. Abstract The ability to accurately quantify the simultaneous effect of multiple genomic loci on multiple traits is now possible due to current and emerging high‐throughput genotyping and phenotyping technologies. To date, most efforts to quantify these genotype‐to‐phenotype relationships have focused on either multi‐trait models that test a single marker at a time or multi‐locus models that quantify associations with a single trait. Therefore, the purpose of this study was to compare the performance of a multi‐trait, multi‐locus stepwise (MSTEP) model selection procedure we developed to (a) a commonly used multi‐trait single‐locus model and (b) a univariate multi‐locus model. We used real marker data in maize (Zea mays L.) and soybean (Glycine max L.) to simulate multiple traits controlled by various combinations of pleiotropic and nonpleiotropic quantitative trait nucleotides (QTNs). In general, we found that both multi‐trait models outperformed the univariate multi‐locus model, especially when analyzing a trait of low heritability. For traits controlled by either a combination of pleiotropic and nonpleiotropic QTNs or a large number of QTNs (i.e., 50), our MSTEP model often outperformed at least one of the two alternative models. When applied to the analysis of two tocochromanol‐related traits in maize grain, MSTEP identified the same peak‐associated marker that has been reported in a previous study. We therefore conclude that MSTEP is a useful addition to the suite of statistical models that are commonly used to gain insight into the genetic architecture of agronomically important traits.
  4. Kim, Yuseob (Ed.)
    Abstract Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make their inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data. 
  5. Stajich, J (Ed.)
    Abstract Studying the signatures of evolution can help to understand genetic processes. Here, we demonstrate how the existence of balancing selection can be used to identify the breeding systems of fungi from genomic data. The breeding systems of fungi are controlled by self-incompatibility loci that determine mating types between potential mating partners, resulting in strong balancing selection at the loci. Within the fungal phylum Basidiomycota, two such self-incompatibility loci, namely HD MAT locus and P/R MAT locus, control mating types of gametes. Loss of function at one or both MAT loci results in different breeding systems and relaxes the MAT locus from balancing selection. By investigating the signatures of balancing selection at MAT loci, one can infer a species’ breeding system without culture-based studies. Nevertheless, the extreme sequence divergence among MAT alleles imposes challenges for retrieving full variants from both alleles when using the conventional read-mapping method. Therefore, we employed a combination of read-mapping and local de novo assembly to construct haplotypes of HD MAT alleles from genomes in suilloid fungi (genera Suillus and Rhizopogon). Genealogy and pairwise divergence of HD MAT alleles showed that the origins of mating types predate the split between these two closely related genera. High sequence divergence, trans-specific polymorphism, and the deeply diverging genealogy confirm the long-term functionality and multiallelic status of HD MAT locus in suilloid fungi. This work highlights a genomics approach to studying breeding systems regardless of the culturability of organisms based on the interplay between evolution and genetics. 