Title: A super-pangenome of the North American wild grape species
Abstract

Background

Capturing the genetic diversity of wild relatives is crucial for improving crops because wild species are valuable sources of agronomic traits that are essential to enhance the sustainability and adaptability of domesticated cultivars. Genetic diversity across a genus can be captured in super-pangenomes, which provide a framework for interpreting genomic variations.

Results

Here we report the sequencing, assembly, and annotation of nine wild North American grape genomes, which are phased and scaffolded at chromosome scale. We generate a reference-unbiased super-pangenome using pairwise whole-genome alignment methods, revealing the extent of the genomic diversity among wild grape species from sequence to gene level. The pangenome graph captures genomic variation between haplotypes within a species and across the different species, and it accurately assesses the similarity of hybrids to their parents. The species selected to build the pangenome are a great representation of the genus, as illustrated by capturing known allelic variants in the sex-determining region and for Pierce’s disease resistance loci. Using pangenome-wide association analysis, we demonstrate the utility of the super-pangenome by effectively mapping short reads from genus-wide samples and identifying loci associated with salt tolerance in natural populations of grapes.

Conclusions

This study highlights how a reference-unbiased super-pangenome can reveal the genetic basis of adaptive traits from wild relatives and accelerate crop breeding research.

 
NSF-PAR ID:
10480490
Publisher / Repository:
Springer Science + Business Media
Journal Name:
Genome Biology
Volume:
24
Issue:
1
ISSN:
1474-760X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Effective utilization of wild relatives is key to overcoming challenges in genetic improvement of cultivated tomato, which has a narrow genetic basis; however, current efforts to decipher high-quality genomes for tomato wild species are insufficient. Here, we report chromosome-scale tomato genomes from nine wild species and two cultivated accessions, representative of Solanum section Lycopersicon, the tomato clade. Together with two previously released genomes, we elucidate the phylogeny of Lycopersicon and construct a section-wide gene repertoire. We reveal the landscape of structural variants and provide entry to the genomic diversity among tomato wild relatives, enabling the discovery of a wild tomato gene with the potential to increase yields of modern cultivated tomatoes. Construction of a graph-based genome enables structural-variant-based genome-wide association studies, identifying numerous signals associated with tomato flavor-related traits and fruit metabolites. The tomato super-pangenome resources will expedite biological studies and breeding of this globally important crop.
  2. Abstract

    Background

    Genome-wide association (GWA) studies demonstrate linkages between genetic variants and traits of interest. Here, we tested associations between single nucleotide polymorphisms (SNPs) in rice (Oryza sativa) and two root hair traits, root hair length (RHL) and root hair density (RHD). Root hairs are outgrowths of single cells on the root epidermis that aid in nutrient and water acquisition and have also served as a model system to study cell differentiation and tip growth. Using lines from the Rice Diversity Panel-1, we explored the diversity of root hair length and density across four subpopulations of rice (aus, indica, temperate japonica, and tropical japonica). GWA analysis was completed using the high-density rice array (HDRA) and the rice reference panel (RICE-RP) SNP sets.

    Results

    We identified 18 genomic regions related to root hair traits, 14 of which related to RHD and four to RHL. No genomic regions were significantly associated with both traits. Two regions overlapped with previously identified quantitative trait loci (QTL) associated with root hair density in rice. We identified candidate genes in these regions and present those with previously published expression data relevant to root hair development. We re-phenotyped a subset of lines with extreme RHD phenotypes and found that the variation in RHD was due to differences in cell differentiation, not cell size, indicating genes in an associated genomic region may influence root hair cell fate. The candidate genes that we identified showed little overlap with previously characterized genes in rice and Arabidopsis.

    Conclusions

    Root hair length and density are quantitative traits with complex and independent genetic control in rice. The genomic regions described here could be used as the basis for QTL development and further analysis of the genetic control of root hair length and density. We present a list of candidate genes involved in root hair formation and growth in rice, many of which have not been previously identified as having a relation to root hair growth. Since little is known about root hair growth in grasses, these provide a guide for further research and crop improvement.

     
  3. Abstract

    Agricultural weeds serve as productive models for studying the genetic basis of rapid adaptation, with weed‐adaptive traits potentially evolving multiple times independently in geographically distinct but environmentally similar agroecosystems. Weedy relatives of domesticated crops can be especially interesting systems because of the potential for weed‐adaptive alleles to originate through multiple mechanisms, including introgression from cultivated and/or wild relatives, standing genetic variation, and de novo mutations. Weedy rice populations have evolved multiple times through dedomestication from cultivated rice. Much of the genomic work to date in weedy rice has focused on populations that exist outside the range of the wild crop progenitor. In this study, we use genome‐wide SNPs generated through genotyping‐by‐sequencing to compare the evolution of weedy rice in regions outside the range of wild rice (North America, South Korea) and populations in Southeast Asia, where wild rice populations are present. We find evidence for adaptive introgression of wild rice alleles into weedy rice populations in Southeast Asia, with the relative contributions of wild and cultivated rice alleles varying across the genome. In addition, gene regions underlying several weed‐adaptive traits are dominated by genomic contributions from wild rice. Genome‐wide nucleotide diversity is also much higher in Southeast Asian weeds than in North American and South Korean weeds. Besides reflecting introgression from wild rice, this difference in diversity likely reflects genetic contributions from diverse cultivated landraces that may have served as the progenitors of these weedy populations. These important differences in weedy rice evolution in regions with and without wild rice could inform region‐specific management strategies for weed control.

     
  4. Abstract

    Background

    The Aldabra giant tortoise (Aldabrachelys gigantea) is one of only two giant tortoise species left in the world. The species is endemic to Aldabra Atoll in Seychelles and is listed as Vulnerable on the International Union for Conservation of Nature Red List (v2.3) due to its limited distribution and threats posed by climate change. Genomic resources for A. gigantea are lacking, hampering conservation efforts for both wild and ex situ populations. A high-quality genome would also open avenues to investigate the genetic basis of the species’ exceptionally long life span.

    Findings

    We produced the first chromosome-level de novo genome assembly of A. gigantea using PacBio High-Fidelity sequencing and high-throughput chromosome conformation capture. We produced a 2.37-Gbp assembly with a scaffold N50 of 148.6 Mbp and a resolution into 26 chromosomes. RNA sequencing–assisted gene model prediction identified 23,953 protein-coding genes and 1.1 Gbp of repetitive sequences. Synteny analyses among turtle genomes revealed high levels of chromosomal collinearity even among distantly related taxa. To assess the utility of the high-quality assembly for species conservation, we performed a low-coverage resequencing of 30 individuals from wild populations and two zoo individuals. Our genome-wide population structure analyses detected genetic population structure in the wild and identified the most likely origin of the zoo-housed individuals. We further identified putatively deleterious mutations to be monitored.

    Conclusions

    We establish a high-quality chromosome-level reference genome for A. gigantea and one of the most complete turtle genomes available. We show that low-coverage whole-genome resequencing, for which alignment to the reference genome is a necessity, is a powerful tool to assess the population structure of the wild population and reveal the geographic origins of ex situ individuals relevant for genetic diversity management and rewilding efforts.

     
  5. SUMMARY

    Wild relatives of tomato are a valuable source of natural variation in tomato breeding, as many can be hybridized to the cultivated species (Solanum lycopersicum). Several, including Solanum lycopersicoides, have been crossed to S. lycopersicum for the development of ordered introgression lines (ILs), facilitating breeding for desirable traits. Despite the utility of these wild relatives and their associated ILs, few finished genome sequences have been produced to aid genetic and genomic studies. Here we report a chromosome‐scale genome assembly for S. lycopersicoides LA2951, which contains 37 938 predicted protein‐coding genes. With the aid of this genome assembly, we have precisely delimited the boundaries of the S. lycopersicoides introgressions in a set of S. lycopersicum cv. VF36 × LA2951 ILs. We demonstrate the usefulness of the LA2951 genome by identifying several quantitative trait loci for phenolics and carotenoids, including underlying candidate genes, and by investigating the genome organization and immunity‐associated function of the clustered Pto gene family. In addition, syntenic analysis of R2R3MYB genes sheds light on the identity of the Aubergine locus underlying anthocyanin production. The genome sequence and IL map provide valuable resources for studying fruit nutrient/quality traits, pathogen resistance, and environmental stress tolerance. We present a new genome resource for the wild species S. lycopersicoides, which we use to shed light on the Aubergine locus responsible for anthocyanin production. We also provide IL boundary mappings, which facilitated identifying novel carotenoid quantitative trait loci of which one was likely driven by an uncharacterized lycopene β‐cyclase whose function we demonstrate.

     