Title: Genomic structural variants constrain and facilitate adaptation in natural populations of Theobroma cacao, the chocolate tree

Genomic structural variants (SVs) can play important roles in adaptation and speciation. Yet the overall fitness effects of SVs are poorly understood, partly because accurate population-level identification of SVs requires multiple high-quality genome assemblies. Here, we use 31 chromosome-scale, haplotype-resolved genome assemblies of Theobroma cacao (an outcrossing, long-lived tree species that is the source of chocolate) to investigate the fitness consequences of SVs in natural populations. Among the 31 accessions, we find over 160,000 SVs, which together cover eight times more of the genome than single-nucleotide polymorphisms and short indels (125 versus 15 Mb). Our results indicate that a vast majority of these SVs are deleterious: they segregate at low frequencies and are depleted from functional regions of the genome. We show that SVs influence gene expression, which likely impairs gene function and contributes to the detrimental effects of SVs. We also provide empirical support for a theoretical prediction that SVs, particularly inversions, increase genetic load through the accumulation of deleterious nucleotide variants as a result of suppressed recombination. Despite the overall detrimental effects, we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are strongly enriched among these candidates, highlighting the contribution of SVs to this important local adaptation trait. Beyond revealing empirical evidence for the evolutionary importance of SVs, these 31 de novo assemblies provide a valuable resource for genetic and breeding studies in T. cacao.

Journal Name:
Proceedings of the National Academy of Sciences
Page Range or eLocation-ID:
Article No. e2102914118
Sponsoring Org:
National Science Foundation
More Like this
  1. Purugganan, Michael (Ed.)
    Abstract Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type (which included inversions, duplications, deletions, translocations, and mobile element insertions) was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest.
  2. Abstract Sage-grouse are two closely related iconic species of the North American West, with historically broad distributions across sagebrush-steppe habitat. Both species are dietary specialists on sagebrush during winter, with presumed adaptations to tolerate the high concentrations of toxic secondary metabolites that function as plant chemical defenses. Marked range contraction and declining population sizes since European settlement have motivated efforts to identify distinct population genetic variation, particularly that which might be associated with local genetic adaptation and dietary specialization of sage-grouse. We assembled a reference genome and performed whole-genome sequencing across sage-grouse from six populations, encompassing both species and including several populations on the periphery of the species ranges. Population genomic analyses reaffirmed genome-wide differentiation between greater and Gunnison sage-grouse, revealed pronounced intraspecific population structure, and highlighted important differentiation of a small isolated population of greater sage-grouse in the northwest of the range. Patterns of genome-wide differentiation were largely consistent with a hypothesized role of genetic drift due to limited gene flow among populations. Inferred ancient population demography suggested persistent declines in effective population sizes that have likely contributed to differentiation within and among species. Several genomic regions with single-nucleotide polymorphisms exhibiting extreme population differentiation were associated with candidate genes linked to metabolism of xenobiotic compounds. In vitro activity of enzymes isolated from sage-grouse livers supported a role for these genes in detoxification of sagebrush, suggesting that the observed interpopulation variation may underlie important local dietary adaptations, warranting close consideration for conservation strategies that link sage-grouse to the chemistry of local sagebrush.
  3. Chapman, Mark (Ed.)
    Abstract Populations along steep environmental gradients are subject to differentiating selection that can result in local adaptation, despite countervailing gene flow, and genetic drift. In montane systems, where species are often restricted to narrow ranges of elevation, it is unclear whether the selection is strong enough to influence functional differentiation of subpopulations differing by a few hundred meters in elevation. We used targeted capture of 12 501 exons from across the genome, including 271 genes previously implicated in altitude adaptation, to test for adaptation to local elevations for 2 highland hummingbird species, Coeligena violifer (n = 62) and Colibri coruscans (n = 101). For each species, we described population genetic structure across the complex geography of the Peruvian Andes and, while accounting for this structure, we tested whether elevational allele frequency clines in single nucleotide polymorphisms (SNPs) showed evidence for local adaptation to elevation. Although the 2 species exhibited contrasting population genetic structures, we found signatures of clinal genetic variation with shifts in elevation in both. The genes with SNP-elevation associations included candidate genes previously discovered for high-elevation adaptation as well as others not previously identified, with cellular functions related to hypoxia response, energy metabolism, and immune function, among others. Despite the homogenizing effects of gene flow and genetic drift, natural selection on parts of the genome evidently optimizes elevation-specific cellular function even within elevation range-restricted montane populations. Consequently, our results suggest local adaptation occurring in narrow elevation bands in tropical mountains, such as the Andes, may effectively make them “taller” biogeographic barriers.
  4. Abstract The study of local adaptation in the presence of ongoing gene flow is the study of natural selection in action, revealing the functional genetic diversity most relevant to contemporary pressures. In addition to individual genes, genome-wide architecture can itself evolve to enable adaptation. Distributed across a steep thermal gradient along the east coast of North America, Atlantic silversides (Menidia menidia) exhibit an extraordinary degree of local adaptation in a suite of traits, and the capacity for rapid adaptation from standing genetic variation, but we know little about the patterns of genomic variation across the species range that enable this remarkable adaptability. Here, we use low-coverage, whole-transcriptome sequencing of Atlantic silversides sampled along an environmental cline to show marked signatures of divergent selection across a gradient of neutral differentiation. Atlantic silversides sampled across 1371 km of the southern section of its distribution have very low genome-wide differentiation (median FST = 0.006 across 1.9 million variants), consistent with historical connectivity and observations of recent migrants. Yet almost 14,000 single nucleotide polymorphisms (SNPs) are nearly fixed (FST > 0.95) for alternate alleles. Highly differentiated SNPs cluster into four tight linkage disequilibrium (LD) blocks that span hundreds of genes and several megabases. Variants in these LD blocks are disproportionately nonsynonymous and concentrated in genes enriched for multiple functions related to known adaptations in silversides, including variation in lipid storage, metabolic rate, and spawning behavior. Elevated levels of absolute divergence and demographic modeling suggest selection maintaining divergence across these blocks under gene flow. These findings represent an extreme case of heterogeneity in levels of differentiation across the genome, and highlight how gene flow shapes genomic architecture in continuous populations. Locally adapted alleles may be common features of populations distributed along environmental gradients, and will likely be key to conserving variation to enable future responses to environmental change.
  5. Identifying the genetic basis of adaptation is a central goal of evolutionary biology. However, identifying genes and mutations affecting fitness remains challenging because a large number of traits and variants can influence fitness. Selected phenotypes can also be difficult to know a priori, complicating top–down genetic approaches for trait mapping that involve crosses or genome-wide association studies. In such cases, experimental genetic approaches, where one maps fitness directly and attempts to infer the traits involved afterwards, can be valuable. Here, we re-analyse data from a transplant experiment involving Timema stick insects, where five physically clustered single-nucleotide polymorphisms associated with cryptic body coloration were shown to interact to affect survival. Our analysis covers a larger genomic region than past work and revealed a locus previously not identified as associated with survival. This locus resides near a gene, Punch (Pu), involved in pteridine pigments production, implying that it could be associated with an unmeasured coloration trait. However, by combining previous and newly obtained phenotypic data, we show that this trait is not eye or body coloration. We discuss the implications of our results for the discovery of traits, genes and mutations associated with fitness in other systems, as well as for supergene evolution. This article is part of the theme issue ‘Genetic basis of adaptation and speciation: from loci to causative mutations’.