Title: An integrated peach genome structural variation map uncovers genes associated with fruit traits
Abstract

Background: Genome structural variations (SVs) have been associated with key traits in a wide range of agronomically important species; however, SV profiles of peach and their functional impacts remain largely unexplored.
Results: Here, we present an integrated map of 202,273 SVs from 336 peach genomes. A substantial number of SVs have been selected during peach domestication and improvement, which together affect 2268 genes. Genome-wide association studies of 26 agronomic traits using these SVs identify a number of candidate causal variants. A 9-bp insertion in Prupe.4G186800, which encodes a NAC transcription factor, is shown to be associated with early fruit maturity, and a 487-bp deletion in the promoter of PpMYB10.1 is associated with flesh color around the stone. In addition, a 1.67 Mb inversion is highly associated with fruit shape, and a gene adjacent to the inversion breakpoint, PpOFP1, regulates flat shape formation.
Conclusions: The integrated peach SV map and the identified candidate genes and variants represent valuable resources for future genomic research and breeding in peach.
Award ID(s):
1855585
NSF-PAR ID:
10215594
Journal Name:
Genome Biology
Volume:
21
Issue:
1
ISSN:
1474-760X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Structural variants (SVs) are a major source of genetic variation, and descriptions of SVs in natural populations and of their connections with phenotypic traits are only beginning to accumulate in the literature. We integrated advances in genomic sequencing and animal tracking to begin filling this knowledge gap in the Eurasian blackcap. Specifically, we (a) characterized the genome-wide distribution, frequency, and overall fitness effects of SVs using haplotype-resolved assemblies for 79 birds, and (b) used these SVs to study the genetics of seasonal migration. We detected >15,000 SVs. Many SVs overlapped repetitive regions and exhibited evidence of purifying selection, suggesting they have overall deleterious effects on fitness. We used estimates of genomic differentiation to identify SVs exhibiting evidence of selection in blackcaps with different migratory strategies. Insertions and deletions dominated the SVs we identified and were associated with genes that are either directly (e.g., regulatory motifs that maintain circadian rhythms) or indirectly (e.g., through immune response) related to migration. We also broke migration down into individual traits (direction, distance, and timing) using existing tracking data and tested whether genetic variation at the SVs we identified could account for phenotypic variation in these traits. This was the case for only one trait, direction, for which a single SV (a deletion on chromosome 27) accounted for much of the variation. Our results highlight the evolutionary importance of SVs in natural populations and provide insight into the genetic basis of seasonal migration.

     
  2. Purugganan, Michael (Ed.)
    Abstract Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type—which included inversions, duplications, deletions, translocations, and mobile element insertions—was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest. 
  3. Abstract

    Structural variants (SVs) can promote speciation by directly causing reproductive isolation or by suppressing recombination across large genomic regions. Whereas examples of each mechanism have been documented, systematic tests of the role of SVs in speciation are lacking. Here, we take advantage of long-read (Oxford Nanopore) whole-genome sequencing and a hybrid zone between two Lycaeides butterfly taxa (L. melissa and Jackson Hole Lycaeides) to comprehensively evaluate genome-wide patterns of introgression for SVs and relate these patterns to hypotheses about speciation. We found >100,000 SVs segregating within or between the two hybridizing species. SVs and SNPs exhibited similar levels of genetic differentiation between species, with the exception of inversions, which were more differentiated. We detected credible variation in patterns of introgression among SV loci in the hybrid zone, with 562 of 1419 ancestry-informative SVs exhibiting genomic clines that deviated from null expectations based on genome-average ancestry. Overall, hybrids exhibited a directional shift towards Jackson Hole Lycaeides ancestry at SV loci, consistent with the hypothesis that these loci experienced more selection on average than SNP loci. Surprisingly, we found that deletions, rather than inversions, showed the highest skew towards excess ancestry from Jackson Hole Lycaeides. Excess Jackson Hole Lycaeides ancestry in hybrids was also especially pronounced for Z-linked SVs and inversions containing many genes. In conclusion, our results show that SVs are ubiquitous and suggest that SVs in general, but especially deletions, might disproportionately affect hybrid fitness and thus contribute to reproductive isolation.

     
  4. INTRODUCTION Genome-wide association studies (GWASs) have identified thousands of human genetic variants associated with diverse diseases and traits, and most of these variants map to noncoding loci with unknown target genes and function. Current approaches to understand which GWAS loci harbor causal variants and to map these noncoding regulators to target genes suffer from low throughput. With newer multiancestry GWASs from individuals of diverse ancestries, there is a pressing and growing need to scale experimental assays to connect GWAS variants with molecular mechanisms. Here, we combined biobank-scale GWASs, massively parallel CRISPR screens, and single-cell sequencing to discover target genes of noncoding variants for blood trait loci with systematic targeting and inhibition of noncoding GWAS loci with single-cell sequencing (STING-seq). RATIONALE Blood traits are highly polygenic, and GWASs have identified thousands of noncoding loci that map to candidate cis-regulatory elements (CREs). By combining CRE-silencing CRISPR perturbations and single-cell readouts, we targeted hundreds of GWAS loci in a single assay, revealing target genes in cis and in trans. For select CREs that regulate target genes, we performed direct variant insertion. Although silencing the CRE can identify the target gene, direct variant insertion can identify the magnitude and direction of effect on gene expression for the GWAS variant. In select cases in which the target gene was a transcription factor or microRNA, we also investigated the gene-regulatory networks altered upon CRE perturbation and how these networks differ across blood cell types. RESULTS We inhibited candidate CREs from fine-mapped blood trait GWAS variants (from ~750,000 individuals of diverse ancestries) in human erythroid progenitors. In total, we targeted 543 variants (254 loci) mapping to candidate CREs, generating multimodal single-cell data including transcriptome, direct CRISPR gRNA capture, and cell surface proteins. We identified target genes in cis (within 500 kb) for 134 CREs. In most cases, we found that the target gene was the closest gene and that specific enhancer-associated biochemical hallmarks (H3K27ac and accessible chromatin) are essential for CRE function. Using multiple perturbations at the same locus, we were able to distinguish causal variants from noncausal variants in linkage disequilibrium. For a subset of validated CREs, we also inserted specific GWAS variants using base-editing STING-seq (beeSTING-seq) and quantified the effect size and direction of GWAS variants on gene expression. Given our transcriptome-wide data, we examined dosage effects in cis and trans in cases in which the cis target is a transcription factor or microRNA. We found that trans target genes are also enriched for GWAS loci, and identified gene clusters within trans gene networks with distinct biological functions and expression patterns in primary human blood cells. CONCLUSION In this work, we investigated noncoding GWAS variants at scale, identifying target genes in single cells. These methods can help to address the variant-to-function challenges that are a barrier for translation of GWAS findings (e.g., drug targets for diseases with a genetic basis) and greatly expand our ability to understand mechanisms underlying GWAS loci. Identifying causal variants and their target genes with STING-seq. Uncovering causal variants and their target genes or function is a major challenge for GWASs. STING-seq combines perturbation of noncoding loci with multimodal single-cell sequencing to profile hundreds of GWAS loci in parallel. This approach can identify target genes in cis and trans, measure dosage effects, and decipher gene-regulatory networks.
  5. The environment has constantly shaped plant genomes, but the genetic bases underlying how plants adapt to environmental influences remain largely unknown. We constructed a high-density genomic variation map of 263 geographically representative peach landraces and wild relatives. A combination of whole-genome selection scans and genome-wide environmental association studies (GWEAS) was performed to reveal the genomic bases of peach adaptation to diverse climates. A total of 2092 selective sweeps that underlie local adaptation to both mild and extreme climates were identified, including 339 sweeps conferring a genomic pattern of adaptation to high altitudes. Using GWEAS, a total of 2755 genomic loci strongly associated with 51 specific environmental variables were detected. The molecular mechanisms underlying the adaptive evolution of traits related to high drought, strong UVB, cold hardiness, sugar content, flesh color, and bloom date were revealed. Finally, based on 30 yr of observation, a candidate gene associated with bloom date advance, representing peach responses to global warming, was identified. Collectively, our study provides insights into the molecular bases of how environments have shaped peach genomes by natural selection and adds candidate genes for future studies on evolutionary genetics, adaptation to climate change, and breeding.