Title: Genome sequencing sheds light on the contribution of structural variants to Brassica oleracea diversification
Abstract
Background: Brassica oleracea includes several morphologically diverse, economically important vegetable crops, such as cauliflower and cabbage. However, the genetic variants, especially large structural variants (SVs), that underlie the extreme morphological diversity of B. oleracea remain largely unexplored.
Results: Here we present high-quality chromosome-scale genome assemblies for two B. oleracea morphotypes, cauliflower and cabbage. Direct comparison of these two assemblies identifies ~120 K high-confidence SVs. Population analysis of 271 B. oleracea accessions using these SVs clearly separates different morphotypes, suggesting the association of SVs with B. oleracea intraspecific divergence. Genes affected by SVs selected between cauliflower and cabbage are enriched with functions related to response to stress and stimulus and to meristem and flower development. Furthermore, genes affected by selected SVs and involved in the switch from vegetative to generative growth that defines curd initiation, and in inflorescence meristem proliferation for curd formation, maintenance and enlargement, are identified, providing insights into the regulatory network of curd development.
Conclusions: This study reveals the important roles of SVs in the diversification of different morphotypes of B. oleracea, and the newly assembled genomes and the SVs provide rich resources for future research and breeding.
Award ID(s):
1855585
NSF-PAR ID:
10321877
Journal Name:
BMC Biology
Volume:
19
Issue:
1
ISSN:
1741-7007
Sponsoring Org:
National Science Foundation
More Like this
  1. Genomic structural variants (SVs) can play important roles in adaptation and speciation. Yet the overall fitness effects of SVs are poorly understood, partly because accurate population-level identification of SVs requires multiple high-quality genome assemblies. Here, we use 31 chromosome-scale, haplotype-resolved genome assemblies of Theobroma cacao (an outcrossing, long-lived tree species that is the source of chocolate) to investigate the fitness consequences of SVs in natural populations. Among the 31 accessions, we find over 160,000 SVs, which together cover eight times more of the genome than single-nucleotide polymorphisms and short indels (125 versus 15 Mb). Our results indicate that a vast majority of these SVs are deleterious: they segregate at low frequencies and are depleted from functional regions of the genome. We show that SVs influence gene expression, which likely impairs gene function and contributes to the detrimental effects of SVs. We also provide empirical support for a theoretical prediction that SVs, particularly inversions, increase genetic load through the accumulation of deleterious nucleotide variants as a result of suppressed recombination. Despite the overall detrimental effects, we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are strongly enriched among these candidates, highlighting the contribution of SVs to this important local adaptation trait. Beyond revealing empirical evidence for the evolutionary importance of SVs, these 31 de novo assemblies provide a valuable resource for genetic and breeding studies in T. cacao.
  2. Despite insertions and deletions being the most common structural variants (SVs) found across genomes, not much is known about how much these SVs vary within populations and between closely related species, nor their significance in evolution. To address these questions, we characterized the evolution of indel SVs using genome assemblies of three closely related Heliconius butterfly species. Over the relatively short evolutionary timescales investigated, up to 18.0% of the genome was composed of indels between two haplotypes of an individual Heliconius charithonia butterfly and up to 62.7% included lineage-specific SVs between the genomes of the most distant species (11 Mya). Lineage-specific sequences were mostly characterized as transposable elements (TEs) inserted at random throughout the genome and their overall distribution was similarly affected by linked selection as single nucleotide substitutions. Using chromatin accessibility profiles (i.e., ATAC-seq) of head tissue in caterpillars to identify sequences with potential cis-regulatory function, we found that out of the 31,066 identified differences in chromatin accessibility between species, 30.4% were within lineage-specific SVs and 9.4% were characterized as TE insertions. These TE insertions were localized closer to gene transcription start sites than expected at random and were enriched for sites with significant resemblance to several transcription factor binding sites with known function in neuron development in Drosophila. We also identified 24 TE insertions with head-specific chromatin accessibility. Our results show high rates of structural genome evolution that were previously overlooked in comparative genomic studies and suggest a high potential for structural variation to serve as raw material for adaptive evolution.
  3. Despite insertions and deletions being the most common structural variants (SVs) found across genomes, not much is known about how much these SVs vary within populations and between closely related species, nor their significance in evolution. To address these questions, we characterized the evolution of indel SVs using genome assemblies of three closely related Heliconius butterfly species. Over the relatively short evolutionary timescales investigated, up to 18.0% of the genome was composed of indels between two haplotypes of an individual H. charithonia butterfly and up to 62.7% included lineage-specific SVs between the genomes of the most distant species (11 Mya). Lineage-specific sequences were mostly characterized as transposable elements (TEs) inserted at random throughout the genome and their overall distribution was similarly affected by linked selection as single nucleotide substitutions. Using chromatin accessibility profiles (i.e., ATAC-seq) of head tissue in caterpillars to identify sequences with potential cis-regulatory function, we found that out of the 31,066 identified differences in chromatin accessibility between species, 30.4% were within lineage-specific SVs and 9.4% were characterized as TE insertions. These TE insertions were localized closer to gene transcription start sites than expected at random and were enriched for several transcription factor binding site candidates with known function in neuron development in Drosophila. We also identified 24 TE insertions with head-specific chromatin accessibility. Our results show high rates of structural genome evolution that were previously overlooked in comparative genomic studies and suggest a high potential for structural variation to serve as raw material for adaptive evolution. 
  4. Structural variants (SVs) are a major source of genetic variation, and descriptions in natural populations and connections with phenotypic traits are beginning to accumulate in the literature. We integrated advances in genomic sequencing and animal tracking to begin filling this knowledge gap in the Eurasian blackcap. Specifically, we (a) characterized the genome-wide distribution, frequency, and overall fitness effects of SVs using haplotype-resolved assemblies for 79 birds, and (b) used these SVs to study the genetics of seasonal migration. We detected >15 K SVs. Many SVs overlapped repetitive regions and exhibited evidence of purifying selection, suggesting they have overall deleterious effects on fitness. We used estimates of genomic differentiation to identify SVs exhibiting evidence of selection in blackcaps with different migratory strategies. Insertions and deletions dominated the SVs we identified and were associated with genes that are either directly (e.g., regulatory motifs that maintain circadian rhythms) or indirectly (e.g., through immune response) related to migration. We also broke migration down into individual traits (direction, distance, and timing) using existing tracking data and tested if genetic variation at the SVs we identified could account for phenotypic variation at these traits. This was only the case for 1 trait (direction), and 1 specific SV (a deletion on chromosome 27) accounted for much of this variation. Our results highlight the evolutionary importance of SVs in natural populations and provide insight into the genetic basis of seasonal migration.
  5. Purugganan, Michael (Ed.)
    Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type (which included inversions, duplications, deletions, translocations, and mobile element insertions) was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest.