Title: Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species
Abstract: Effective utilization of wild relatives is key to overcoming challenges in genetic improvement of cultivated tomato, which has a narrow genetic basis; however, current efforts to decipher high-quality genomes for tomato wild species are insufficient. Here, we report chromosome-scale tomato genomes from nine wild species and two cultivated accessions, representative of Solanum section Lycopersicon, the tomato clade. Together with two previously released genomes, we elucidate the phylogeny of Lycopersicon and construct a section-wide gene repertoire. We reveal the landscape of structural variants and provide entry to the genomic diversity among tomato wild relatives, enabling the discovery of a wild tomato gene with the potential to increase yields of modern cultivated tomatoes. Construction of a graph-based genome enables structural-variant-based genome-wide association studies, identifying numerous signals associated with tomato flavor-related traits and fruit metabolites. The tomato super-pangenome resources will expedite biological studies and breeding of this globally important crop.
Award ID(s):
1855585
PAR ID:
10410288
Journal Name:
Nature Genetics
ISSN:
1061-4036
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary: Wild relatives of tomato are a valuable source of natural variation in tomato breeding, as many can be hybridized to the cultivated species (Solanum lycopersicum). Several, including Solanum lycopersicoides, have been crossed to S. lycopersicum for the development of ordered introgression lines (ILs), facilitating breeding for desirable traits. Despite the utility of these wild relatives and their associated ILs, few finished genome sequences have been produced to aid genetic and genomic studies. Here we report a chromosome‐scale genome assembly for S. lycopersicoides LA2951, which contains 37 938 predicted protein‐coding genes. With the aid of this genome assembly, we have precisely delimited the boundaries of the S. lycopersicoides introgressions in a set of S. lycopersicum cv. VF36 × LA2951 ILs. We demonstrate the usefulness of the LA2951 genome by identifying several quantitative trait loci for phenolics and carotenoids, including underlying candidate genes, and by investigating the genome organization and immunity‐associated function of the clustered Pto gene family. In addition, syntenic analysis of R2R3MYB genes sheds light on the identity of the Aubergine locus underlying anthocyanin production. The genome sequence and IL map provide valuable resources for studying fruit nutrient/quality traits, pathogen resistance, and environmental stress tolerance. We present a new genome resource for the wild species S. lycopersicoides, which we use to shed light on the Aubergine locus responsible for anthocyanin production. We also provide IL boundary mappings, which facilitated identifying novel carotenoid quantitative trait loci, of which one was likely driven by an uncharacterized lycopene β‐cyclase whose function we demonstrate.
  2. Abstract: Rye (Secale cereale L.) is an exceptionally climate-resilient cereal crop, used extensively to produce improved wheat varieties via introgressive hybridization and possessing the entire repertoire of genes necessary to enable hybrid breeding. Rye is allogamous and only recently domesticated, thus giving cultivated ryes access to a diverse and exploitable wild gene pool. To further enhance the agronomic potential of rye, we produced a chromosome-scale annotated assembly of the 7.9-gigabase rye genome and extensively validated its quality by using a suite of molecular genetic resources. We demonstrate applications of this resource with a broad range of investigations. We present findings on cultivated rye’s incomplete genetic isolation from wild relatives, mechanisms of genome structural evolution, pathogen resistance, low-temperature tolerance, fertility control systems for hybrid breeding and the yield benefits of rye–wheat introgressions.
  3. Abstract: Missing heritability in genome-wide association studies defines a major problem in genetic analyses of complex biological traits 1,2. The solution to this problem is to identify all causal genetic variants and to measure their individual contributions 3,4. Here we report a graph pangenome of tomato constructed by precisely cataloguing more than 19 million variants from 838 genomes, including 32 new reference-level genome assemblies. This graph pangenome was used for genome-wide association study analyses and heritability estimation of 20,323 gene-expression and metabolite traits. The average estimated trait heritability is 0.41 compared with 0.33 when using the single linear reference genome. This 24% increase in estimated heritability is largely due to resolving incomplete linkage disequilibrium through the inclusion of additional causal structural variants identified using the graph pangenome. Moreover, by resolving allelic and locus heterogeneity, structural variants improve the power to identify genetic factors underlying agronomically important traits leading to, for example, the identification of two new genes potentially contributing to soluble solid content. The newly identified structural variants will facilitate genetic improvement of tomato through both marker-assisted selection and genomic selection. Our study advances the understanding of the heritability of complex traits and demonstrates the power of the graph pangenome in crop breeding.
  4. Premise: Common taxonomic practices, which condition species' descriptions on diagnostic morphological traits, may systematically lump outcrossing species and unduly split selfing species. Specifically, higher effective population sizes and genetic diversity of obligate outcrossers are expected to result in less reliable phenotypic diagnoses. Wild tomatoes, members of Solanum sect. Lycopersicon, are commonly used as a source of exotic germplasm for improvement of the cultivated tomato, and are increasingly employed in basic research. Although the section experienced significant early work, which continues presently, the taxonomic status of many wild species has undergone a number of significant revisions and remains uncertain. Species in this section vary in their breeding systems, notably the expression of self‐incompatibility, which determines individual propensity for outcrossing. Methods: Here, we examine the taxonomic status of obligately outcrossing Chilean wild tomato (Solanum chilense) using reduced‐representation sequencing (RAD‐seq), a range of phylogenetic and population genetic analyses, as well as analyses of crossing and morphological data. Results: Overall, each of our analyses provides a considerable weight of evidence that the Pacific coastal populations and Andean inland populations of the currently described Solanum chilense represent separately evolving populations, and conceal at least one undescribed cryptic species. Conclusions: Despite its vast economic importance, Solanum sect. Lycopersicon still exhibits considerable taxonomic instability. A pattern of under‐recognition of outcrossing species may be common, not only in tomatoes, but across flowering plants. We discuss the possible causes and implications of this observation, with a focus on macroevolutionary inference.
  5. Abstract: Background: Capturing the genetic diversity of wild relatives is crucial for improving crops because wild species are valuable sources of agronomic traits that are essential to enhance the sustainability and adaptability of domesticated cultivars. Genetic diversity across a genus can be captured in super-pangenomes, which provide a framework for interpreting genomic variations. Results: Here we report the sequencing, assembly, and annotation of nine wild North American grape genomes, which are phased and scaffolded at chromosome scale. We generate a reference-unbiased super-pangenome using pairwise whole-genome alignment methods, revealing the extent of the genomic diversity among wild grape species from sequence to gene level. The pangenome graph captures genomic variation between haplotypes within a species and across the different species, and it accurately assesses the similarity of hybrids to their parents. The species selected to build the pangenome are a great representation of the genus, as illustrated by capturing known allelic variants in the sex-determining region and for Pierce’s disease resistance loci. Using pangenome-wide association analysis, we demonstrate the utility of the super-pangenome by effectively mapping short reads from genus-wide samples and identifying loci associated with salt tolerance in natural populations of grapes. Conclusions: This study highlights how a reference-unbiased super-pangenome can reveal the genetic basis of adaptive traits from wild relatives and accelerate crop breeding research.