This content will become publicly available on November 22, 2025

Title: Reconstruction of the human amylase locus reveals ancient duplications seeding modern-day variation
Previous studies suggested that the copy number of the human salivary amylase gene, AMY1, correlates with starch-rich diets. However, evolutionary analyses are hampered by the absence of accurate, sequence-resolved haplotype variation maps. We identified 30 structurally distinct haplotypes at nucleotide resolution among 98 present-day humans, revealing that the coding sequences of AMY1 copies are evolving under negative selection. Genomic analyses of these haplotypes in archaic hominins and ancient human genomes suggest that a common three-copy haplotype, dating as far back as 800,000 years ago, has seeded rapidly evolving rearrangements through recurrent nonallelic homologous recombination. Additionally, haplotypes with more than three AMY1 copies have significantly increased in frequency among European farmers over the past 4000 years, potentially as an adaptive response to increased starch digestion.
Award ID(s):
2049947
PAR ID:
10571731
Author(s) / Creator(s):
Corporate Creator(s):
Publisher / Repository:
AAAS
Date Published:
Journal Name:
Science
Volume:
386
Issue:
6724
ISSN:
0036-8075
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The adoption of agriculture triggered a rapid shift towards starch-rich diets in human populations [1]. Amylase genes facilitate starch digestion, and increased amylase copy number has been observed in some modern human populations with high-starch intake [2], although evidence of recent selection is lacking [3,4]. Here, using 94 long-read haplotype-resolved assemblies and short-read data from approximately 5,600 contemporary and ancient humans, we resolve the diversity and evolutionary history of structural variation at the amylase locus. We find that amylase genes have higher copy numbers in agricultural populations than in fishing, hunting and pastoral populations. We identify 28 distinct amylase structural architectures and demonstrate that nearly identical structures have arisen recurrently on different haplotype backgrounds throughout recent human history. AMY1 and AMY2A genes each underwent multiple duplication/deletion events with mutation rates up to more than 10,000-fold the single-nucleotide polymorphism mutation rate, whereas AMY2B gene duplications share a single origin. Using a pangenome-based approach, we infer structural haplotypes across thousands of humans, identifying extensively duplicated haplotypes at higher frequency in modern agricultural populations. Leveraging 533 ancient human genomes, we find that duplication-containing haplotypes (with more gene copies than the ancestral haplotype) have rapidly increased in frequency over the past 12,000 years in West Eurasians, suggestive of positive selection. Together, our study highlights the potential effects of the agricultural revolution on human genomes and the importance of structural variation in human adaptation.
  2. Many mammals can digest starch by using an enzyme called amylase, but different species eat different amounts of starchy foods. Amylase is released by the pancreas, and in certain species such as humans, it is also created by the glands that produce saliva, allowing the enzyme to be present in the mouth. There, amylase can start to break down starch, releasing a sweet taste that helps the animal to detect starchy foods. Curiously, humans have multiple copies of the gene that codes for the enzyme, but the exact number varies between people. Previous research has found that populations with more copies also eat more starch; if this correlation also existed in other species, it could help to understand how diets influence and shape genetic information. In addition, it is unclear how amylase came to be present in saliva, as the ancestors of mammals only produced the protein in the pancreas. Pajic et al. analyzed the genomes of a range of mammals and found that the more starch a species had in its diet, the more amylase gene copies it harbored in its genome. In fact, unrelated mammals living in different habitats and eating different types of food have similar numbers of amylase gene copies if they have the same level of starch in their diet. In addition, Pajic et al. discovered that animals such as mice, rats, pigs and dogs, which have lived in close contact with people for thousands of years, quickly adapted to the large amount of starch present in human food. In each of these species, a mechanism called gene duplication independently created new copies of the amylase gene. This could represent the first step towards some of these copies becoming active in the glands that release saliva. In people, having fewer copies of the amylase gene could mean they have a higher risk for diabetes; this number is also tied to the composition of the collection of bacteria that live in the mouth and the gut. 
Understanding how the copy number of the amylase gene affects biology will help to grasp how it also affects health and wellbeing, in humans and in our four-legged companions. 
  3. Abstract Variation among functionally similar species in their response to environmental stress buffers ecosystems from changing states. Functionally similar species may often be cryptic species representing evolutionarily distinct genetic lineages that are morphologically indistinguishable. However, the extent to which cryptic species differ in their response to stress, and could therefore provide a source of response diversity, remains unclear because they are often not identified or are assumed to be ecologically equivalent. Here, we uncover differences in the bleaching response between sympatric cryptic species of the common Indo‐Pacific coral, Pocillopora. In April 2019, prolonged ocean heating occurred at Moorea, French Polynesia. 72% of pocilloporid colonies bleached after 22 d of severe heating (>8 °C‐days) at 10 m depth on the north shore fore reef. Colony mortality ranged from 11% to 42% around the island four months after heating subsided. The majority (86%) of pocilloporids that died from bleaching belonged to a single haplotype, despite twelve haplotypes, representing at least five species, being sampled. Mitochondrial (open reading frame) sequence variation was greater between the haplotypes that experienced mortality versus haplotypes that all survived than it was between nominal species that all survived. Colonies > 30 cm in diameter were identified as the haplotype experiencing the most mortality, and in 1125 colonies that were not genetically identified, bleaching and mortality increased with colony size. Mortality did not increase with colony size within the haplotype suffering the highest mortality, suggesting that size‐dependent bleaching and mortality at the genus level was caused instead by differences among cryptic species.
The relative abundance of haplotypes shifted between February and August, driven by declines in the same common haplotype for which mortality was estimated directly, at sites where heat accumulation was greatest, and where larger colony sizes occurred. The identification of morphologically indistinguishable species that differ in their response to thermal stress, but share a similar ecological function in terms of maintaining a coral‐dominated state, has important consequences for uncovering response diversity that drives resilience, especially in systems with low or declining functional diversity. 
  4. Abstract Like many fern lineages comprising reticulate species complexes, Polypodium s.s. (Polypodiaceae) has a history shaped by rapid diversification, hybridization, and polyploidy that poses substantial challenges for phylogenetic inference with plastid and single-locus nuclear markers. Using target capture probes for 408 nuclear loci developed by the GoFlag project and a custom bioinformatic pipeline, SORTER, we constructed multi-locus nuclear datasets for diploid temperate and Mesoamerican species of Polypodium and five allotetraploid species belonging to the well-studied Polypodium vulgare complex. SORTER employs a clustering approach to separate putatively paralogous copies of targeted loci into orthologous matrices and haplotype phasing to infer allopolyploid haplotypes across loci, resulting in datasets amenable to both concatenated maximum likelihood and multi-species coalescent phylogenetic analyses. By comparing phylogenies derived from maximum likelihood and multi-species coalescent analyses of unphased and phased datasets, as well as evaluating discordance among gene trees and species trees, we recover support for incomplete lineage sorting within Polypodium s.s., novel relationships among diploid taxa of the Polypodium vulgare complex and its Mesoamerican sister clade, and the placement of several Polypodium species within other genera. Additionally, we were able to infer well-supported phylogenies that identified the hypothesized progenitors of the allotetraploid species, indicating that SORTER is an effective and accurate tool for reconstructing homeolog haplotypes of allopolyploids in fern taxa and other non-model organisms from target capture data.
  5. SUMMARY Self‐incompatibility in Petunia is controlled by the polymorphic S‐locus, which contains S‐RNase encoding the pistil determinant and 16–20 S‐locus F‐box (SLF) genes collectively encoding the pollen determinant. Here we sequenced and assembled approximately 3.1 Mb of the S2‐haplotype of the S‐locus in Petunia inflata using bacterial artificial chromosome clones collectively containing all 17 SLF genes, SLFLike1, and S‐RNase. Two SLF pseudogenes and 28 potential protein‐coding genes were identified, 20 of which were also found at the S‐loci of both the S6a‐haplotype of P. inflata and the SN‐haplotype of self‐compatible Petunia axillaris, but not in the S‐locus remnants of self‐compatible potato (Solanum tuberosum) and tomato (Solanum lycopersicum). Comparative analyses of S‐locus sequences of these three S‐haplotypes revealed potential genetic exchange in the flanking regions of SLF genes, resulting in highly similar flanking regions between different types of SLF and between alleles of the same type of SLF of different S‐haplotypes. The high degree of sequence similarity in the flanking regions could often be explained by the presence of similar long terminal repeat retroelements, which were enriched at the S‐loci of all three S‐haplotypes and in the flanking regions of all S‐locus genes examined. We also found evidence of the association of transposable elements with SLF pseudogenes. Based on the hypothesis that SLF genes were derived by retrotransposition, we identified 10 F‐box genes as putative SLF parent genes. Our results shed light on the importance of non‐coding sequences in the evolution of the S‐locus, and on possible evolutionary mechanisms of generation, proliferation, and deletion of SLF genes.