Title: A super-pangenome of the North American wild grape species
Abstract:
Background: Capturing the genetic diversity of wild relatives is crucial for improving crops because wild species are valuable sources of agronomic traits that are essential to enhance the sustainability and adaptability of domesticated cultivars. Genetic diversity across a genus can be captured in super-pangenomes, which provide a framework for interpreting genomic variations.
Results: Here we report the sequencing, assembly, and annotation of nine wild North American grape genomes, which are phased and scaffolded at chromosome scale. We generate a reference-unbiased super-pangenome using pairwise whole-genome alignment methods, revealing the extent of the genomic diversity among wild grape species from sequence to gene level. The pangenome graph captures genomic variation between haplotypes within a species and across the different species, and it accurately assesses the similarity of hybrids to their parents. The species selected to build the pangenome are a great representation of the genus, as illustrated by capturing known allelic variants in the sex-determining region and for Pierce’s disease resistance loci. Using pangenome-wide association analysis, we demonstrate the utility of the super-pangenome by effectively mapping short reads from genus-wide samples and identifying loci associated with salt tolerance in natural populations of grapes.
Conclusions: This study highlights how a reference-unbiased super-pangenome can reveal the genetic basis of adaptive traits from wild relatives and accelerate crop breeding research.
Award ID(s):
1741627
PAR ID:
10480490
Publisher / Repository:
Springer Science + Business Media
Journal Name:
Genome Biology
Volume:
24
Issue:
1
ISSN:
1474-760X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: Effective utilization of wild relatives is key to overcoming challenges in genetic improvement of cultivated tomato, which has a narrow genetic basis; however, current efforts to decipher high-quality genomes for tomato wild species are insufficient. Here, we report chromosome-scale tomato genomes from nine wild species and two cultivated accessions, representative of Solanum section Lycopersicon, the tomato clade. Together with two previously released genomes, we elucidate the phylogeny of Lycopersicon and construct a section-wide gene repertoire. We reveal the landscape of structural variants and provide entry to the genomic diversity among tomato wild relatives, enabling the discovery of a wild tomato gene with the potential to increase yields of modern cultivated tomatoes. Construction of a graph-based genome enables structural-variant-based genome-wide association studies, identifying numerous signals associated with tomato flavor-related traits and fruit metabolites. The tomato super-pangenome resources will expedite biological studies and breeding of this globally important crop.
  2. Alkan, Can (Ed.)
     Abstract:
     Motivation: Pangenome graphs offer a comprehensive way of capturing genomic variability across multiple genomes. However, current construction methods often introduce biases, excluding complex sequences or relying on references. The PanGenome Graph Builder (PGGB) addresses these issues. To date, though, there is no state-of-the-art pipeline allowing for easy deployment, efficient and dynamic use of available resources, and scalable usage at the same time.
     Results: To overcome these limitations, we present nf-core/pangenome, a reference-unbiased approach implemented in Nextflow following nf-core’s best practices. Leveraging biocontainers ensures portability and seamless deployment in High-Performance Computing (HPC) environments. Unlike PGGB, nf-core/pangenome distributes alignments across cluster nodes, enabling scalability. Demonstrating its efficiency, we constructed pangenome graphs for 1000 human chromosome 19 haplotypes and 2146 Escherichia coli sequences, achieving a two to threefold speedup compared to PGGB without increasing greenhouse gas emissions.
     Availability and implementation: nf-core/pangenome is released under the MIT open-source license, available on GitHub and Zenodo, with documentation accessible at https://nf-co.re/pangenome/docs/usage.
  3. Abstract:
     Background: Prioritizing wild relative diversity for improving crop adaptation to emerging drought-prone environments is challenging. Here, we combine the genome-wide environmental scans (GWES) in wheat diploid ancestor Aegilops tauschii (Ae. tauschii) with allele testing in the genetic backgrounds of adapted cultivars to identify diversity for improving wheat adaptation to water-limiting conditions.
     Results: We evaluate the adaptive allele effects in Ae. tauschii-wheat introgression lines phenotyped for multiple traits under irrigated and water-limiting conditions using both unmanned aerial system-based imaging and conventional approaches. The GWES show that climatic gradients alone explain more than half of genomic variation in Ae. tauschii, with many alleles associated with climatic factors in Ae. tauschii being linked with improved performance of introgression lines under water-limiting conditions. We find that the most significant GWES signals associated with temperature annual range in the wild relative are linked with reduced canopy temperature in introgression lines and increased yield.
     Conclusions: Our results suggest that introgression of climate-adaptive alleles from Ae. tauschii has the potential to improve wheat performance under water-limiting conditions, and that variants controlling physiological processes responsible for maintaining leaf temperature are likely among the targets of adaptive selection in a wild relative. Adaptive variation uncovered by GWES in wild relatives has the potential to improve climate resilience of crop varieties.
  4. Abstract: Missing heritability in genome-wide association studies defines a major problem in genetic analyses of complex biological traits [1,2]. The solution to this problem is to identify all causal genetic variants and to measure their individual contributions [3,4]. Here we report a graph pangenome of tomato constructed by precisely cataloguing more than 19 million variants from 838 genomes, including 32 new reference-level genome assemblies. This graph pangenome was used for genome-wide association study analyses and heritability estimation of 20,323 gene-expression and metabolite traits. The average estimated trait heritability is 0.41 compared with 0.33 when using the single linear reference genome. This 24% increase in estimated heritability is largely due to resolving incomplete linkage disequilibrium through the inclusion of additional causal structural variants identified using the graph pangenome. Moreover, by resolving allelic and locus heterogeneity, structural variants improve the power to identify genetic factors underlying agronomically important traits leading to, for example, the identification of two new genes potentially contributing to soluble solid content. The newly identified structural variants will facilitate genetic improvement of tomato through both marker-assisted selection and genomic selection. Our study advances the understanding of the heritability of complex traits and demonstrates the power of the graph pangenome in crop breeding.
  5. Introduction: Throughout domestication, crop plants have gone through strong genetic bottlenecks, dramatically reducing the genetic diversity in today’s available germplasm. This has also reduced the diversity in traits necessary for breeders to develop improved varieties. Many strategies have been developed to improve both genetic and trait diversity in crops, from backcrossing with wild relatives, to chemical/radiation mutagenesis, to genetic engineering. However, even with recent advances in genetic engineering we still face the rate limiting step of identifying which genes and mutations we should target to generate diversity in specific traits.
     Methods: Here, we apply a comparative evolutionary approach, pairing phylogenetic and expression analyses to identify potential candidate genes for diversifying soybean (Glycine max) canopy cover development via the nuclear auxin signaling gene families, while minimizing pleiotropic effects in other tissues. In soybean, rapid canopy cover development is correlated with yield and also suppresses weeds in organic cultivation.
     Results and discussion: We identified genes most specifically expressed during early canopy development from the TIR1/AFB auxin receptor, Aux/IAA auxin co-receptor, and ARF auxin response factor gene families in soybean, using principal component analysis. We defined Arabidopsis thaliana and model legume species orthologs for each soybean gene in these families allowing us to speculate potential soybean phenotypes based on well-characterized mutants in these model species. In future work, we aim to connect genetic and functional diversity in these candidate genes with phenotypic diversity in planta allowing for improvements in soybean rapid canopy cover, yield, and weed suppression. Further development of this and similar algorithms for defining and quantifying tissue- and phenotype-specificity in gene expression may allow expansion of diversity in valuable phenotypes in important crops.