
Title: Genomic structural variants constrain and facilitate adaptation in natural populations of Theobroma cacao, the chocolate tree

Genomic structural variants (SVs) can play important roles in adaptation and speciation. Yet the overall fitness effects of SVs are poorly understood, partly because accurate population-level identification of SVs requires multiple high-quality genome assemblies. Here, we use 31 chromosome-scale, haplotype-resolved genome assemblies of Theobroma cacao (an outcrossing, long-lived tree species that is the source of chocolate) to investigate the fitness consequences of SVs in natural populations. Among the 31 accessions, we find over 160,000 SVs, which together cover eight times more of the genome than single-nucleotide polymorphisms and short indels (125 versus 15 Mb). Our results indicate that a vast majority of these SVs are deleterious: they segregate at low frequencies and are depleted from functional regions of the genome. We show that SVs influence gene expression, which likely impairs gene function and contributes to the detrimental effects of SVs. We also provide empirical support for a theoretical prediction that SVs, particularly inversions, increase genetic load through the accumulation of deleterious nucleotide variants as a result of suppressed recombination. Despite the overall detrimental effects, we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are strongly enriched among these candidates, highlighting the contribution of SVs to this important local adaptation trait. Beyond revealing empirical evidence for the evolutionary importance of SVs, these 31 de novo assemblies provide a valuable resource for genetic and breeding studies in T. cacao.

NSF-PAR ID:
10292038
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
118
Issue:
35
Page Range or eLocation-ID:
Article No. e2102914118
ISSN:
0027-8424
Publisher:
Proceedings of the National Academy of Sciences
Sponsoring Org:
National Science Foundation
More Like this
  1. Purugganan, Michael (Ed.)
    Abstract Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type (which included inversions, duplications, deletions, translocations, and mobile element insertions) was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest.
  2. Abstract

    Structural variants (SVs) can promote speciation by directly causing reproductive isolation or by suppressing recombination across large genomic regions. Whereas examples of each mechanism have been documented, systematic tests of the role of SVs in speciation are lacking. Here, we take advantage of long‐read (Oxford Nanopore) whole‐genome sequencing and a hybrid zone between two Lycaeides butterfly taxa (L. melissa and Jackson Hole Lycaeides) to comprehensively evaluate genome‐wide patterns of introgression for SVs and relate these patterns to hypotheses about speciation. We found >100,000 SVs segregating within or between the two hybridizing species. SVs and SNPs exhibited similar levels of genetic differentiation between species, with the exception of inversions, which were more differentiated. We detected credible variation in patterns of introgression among SV loci in the hybrid zone, with 562 of 1419 ancestry‐informative SVs exhibiting genomic clines that deviated from null expectations based on genome‐average ancestry. Overall, hybrids exhibited a directional shift towards Jackson Hole Lycaeides ancestry at SV loci, consistent with the hypothesis that these loci experienced more selection on average than SNP loci. Surprisingly, we found that deletions, rather than inversions, showed the highest skew towards excess ancestry from Jackson Hole Lycaeides. Excess Jackson Hole Lycaeides ancestry in hybrids was also especially pronounced for Z‐linked SVs and inversions containing many genes. In conclusion, our results show that SVs are ubiquitous and suggest that SVs in general, but especially deletions, might disproportionately affect hybrid fitness and thus contribute to reproductive isolation.

  3. Abstract

    Adaptation across environmental gradients has been demonstrated in numerous systems with extensive dispersal, despite high gene flow and consequently low genetic structure. The speed and mechanisms by which such adaptation occurs remain poorly resolved, but are critical to understanding species spread and persistence in a changing world. Here, we investigate these mechanisms in the European green crab Carcinus maenas, a globally distributed invader. We focus on a northwestern Pacific population that spread across >12 degrees of latitude in 10 years from a single source, following its introduction <35 years ago. Using six locations spanning >1500 km, we examine genetic structure using 9376 single nucleotide polymorphisms (SNPs). We find high connectivity among five locations, with significant structure between these locations and an enclosed lagoon with limited connectivity to the coast. Among the five highly connected locations, the only structure observed was a cline driven by a handful of SNPs strongly associated with latitude and winter temperature. These SNPs are almost exclusively found in a large cluster of genes in strong linkage disequilibrium that was previously identified as a candidate for cold tolerance adaptation in this species. This region may represent a balanced polymorphism that evolved to promote rapid adaptation in variable environments despite high gene flow, and which now contributes to successful invasion and spread in a novel environment. This research suggests an answer to the paradox of genetically depauperate yet successful invaders: populations may be able to adapt via a few variants of large effect despite low overall diversity.

  4. Abstract Sage-grouse are two closely related iconic species of the North American West, with historically broad distributions across sagebrush-steppe habitat. Both species are dietary specialists on sagebrush during winter, with presumed adaptations to tolerate the high concentrations of toxic secondary metabolites that function as plant chemical defenses. Marked range contraction and declining population sizes since European settlement have motivated efforts to identify distinct population genetic variation, particularly that which might be associated with local genetic adaptation and dietary specialization of sage-grouse. We assembled a reference genome and performed whole-genome sequencing across sage-grouse from six populations, encompassing both species and including several populations on the periphery of the species ranges. Population genomic analyses reaffirmed genome-wide differentiation between greater and Gunnison sage-grouse, revealed pronounced intraspecific population structure, and highlighted important differentiation of a small isolated population of greater sage-grouse in the northwest of the range. Patterns of genome-wide differentiation were largely consistent with a hypothesized role of genetic drift due to limited gene flow among populations. Inferred ancient population demography suggested persistent declines in effective population sizes that have likely contributed to differentiation within and among species. Several genomic regions with single-nucleotide polymorphisms exhibiting extreme population differentiation were associated with candidate genes linked to metabolism of xenobiotic compounds. In vitro activity of enzymes isolated from sage-grouse livers supported a role for these genes in detoxification of sagebrush, suggesting that the observed interpopulation variation may underlie important local dietary adaptations, warranting close consideration for conservation strategies that link sage-grouse to the chemistry of local sagebrush.
  5. Chapman, Mark (Ed.)
    Abstract Populations along steep environmental gradients are subject to differentiating selection that can result in local adaptation, despite countervailing gene flow and genetic drift. In montane systems, where species are often restricted to narrow ranges of elevation, it is unclear whether the selection is strong enough to influence functional differentiation of subpopulations differing by a few hundred meters in elevation. We used targeted capture of 12 501 exons from across the genome, including 271 genes previously implicated in altitude adaptation, to test for adaptation to local elevations for 2 highland hummingbird species, Coeligena violifer (n = 62) and Colibri coruscans (n = 101). For each species, we described population genetic structure across the complex geography of the Peruvian Andes and, while accounting for this structure, we tested whether elevational allele frequency clines in single nucleotide polymorphisms (SNPs) showed evidence for local adaptation to elevation. Although the 2 species exhibited contrasting population genetic structures, we found signatures of clinal genetic variation with shifts in elevation in both. The genes with SNP-elevation associations included candidate genes previously discovered for high-elevation adaptation as well as others not previously identified, with cellular functions related to hypoxia response, energy metabolism, and immune function, among others. Despite the homogenizing effects of gene flow and genetic drift, natural selection on parts of the genome evidently optimizes elevation-specific cellular function even within elevation range-restricted montane populations. Consequently, our results suggest local adaptation occurring in narrow elevation bands in tropical mountains, such as the Andes, may effectively make them “taller” biogeographic barriers.