
Title: GWAS Based on RNA-Seq SNPs and High-Throughput Phenotyping Combined with Climatic Data Highlights the Reservoir of Valuable Genetic Diversity in Regional Tomato Landraces
Tomato (Solanum lycopersicum L.) is a widely used model plant species for dissecting the genomic bases of complex traits, providing an optimal platform for modern “-omics” studies and genome-guided breeding. Genome-wide association studies (GWAS) have become a preferred approach for screening large diverse populations and many traits. Here, we present GWAS analysis of a collection of 115 landraces and 11 vintage and modern cultivars. A total of 26 conventional descriptors, 40 traits obtained by digital phenotyping, the fruit content of six carotenoids recorded at the early ripening (breaker) and red-ripe stages, and 21 climate-related variables were analyzed in the context of genetic diversity monitored in the 126 accessions. The data obtained from thorough phenotyping and the SNP diversity revealed by sequencing of ripe fruit transcripts of 120 of the tomato accessions were jointly analyzed to determine which genomic regions are implicated in the expressed phenotypic variation. This study reveals that fruit RNA-Seq SNP diversity is effective not only for identifying genomic regions that underlie variation in fruit traits, but also for identifying variation related to additional plant traits and adaptive responses to climate variation. These results allowed validation of our approach because different marker-trait associations mapped to chromosomal regions where other candidate genes for the same traits were previously reported. In addition, previously uncharacterized chromosomal regions were targeted as potentially involved in the expression of variable phenotypes, thus demonstrating that our tomato collection is a precious reservoir of diversity and an excellent tool for gene discovery.
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    A collection of 163 accessions, including Solanum pimpinellifolium, Solanum lycopersicum var. cerasiforme and Solanum lycopersicum var. lycopersicum, was selected to represent the genetic and morphological variability of tomato at its centers of origin and domestication: Andean regions of Peru and Ecuador and Mesoamerica. The collection is enriched with S. lycopersicum var. cerasiforme from the Amazonian region that has not been analyzed previously nor used extensively. The collection has been morphologically characterized, showing diversity for fruit, flower and vegetative traits. Their genomes were sequenced in the Varitome project and are publicly available. The identified SNPs have been annotated with respect to their impact, and a total of 37,974 out of 19,364,146 SNPs have been described as high impact by the SnpEff analysis. GWAS has shown associations for different traits, demonstrating the potential of this collection for this kind of analysis. We have not only identified known QTLs and genes, but also new regions associated with traits such as fruit color, number of flowers per inflorescence or inflorescence architecture. To speed up and facilitate the use of this information, F2 populations were constructed by crossing the whole collection with three different parents. This F2 collection is useful for testing SNPs identified by GWAS, selection sweeps or any other candidate gene. All data are available on the Solanaceae Genomics Network, and the accession and F2 seeds are freely available at the COMAV and TGRC genebanks. All these resources together make this collection a good candidate for genetic studies.

  2. Nathan Springer (Ed.)
    Methyl salicylate is an important inter- and intra-plant signaling molecule, but is deemed undesirable by humans when it accumulates to high levels in ripe fruits. Balancing the tradeoff between consumer satisfaction and overall plant health is challenging as the mechanisms regulating volatile levels have not yet been fully elucidated. In this study, we investigated the accumulation of methyl salicylate in ripe fruits of tomatoes that belong to the red-fruited clade. We determined the genetic diversity and the interaction of four known loci controlling methyl salicylate levels in ripe fruits. In addition to Non-Smoky Glucosyl Transferase 1 (NSGT1), we uncovered extensive genome structural variation (SV) at the Methylesterase (MES) locus. This locus contains four tandemly duplicated Methylesterase genes, and genome sequence investigations at the locus identified nine distinct haplotypes. Based on gene expression and results from biparental crosses, functional and non-functional haplotypes for MES were identified. The combination of the non-functional MES haplotype 2 and the non-functional NSGT1 haplotype IV or V in a GWAS panel showed high methyl salicylate levels in ripe fruits, particularly in accessions from Ecuador, demonstrating a strong interaction between these two loci and suggesting an ecological advantage. The genetic variation at the other two known loci, Salicylic Acid Methyl Transferase 1 (SAMT1) and tomato UDP Glycosyl Transferase 5 (SlUGT5), did not explain volatile variation in the red-fruited tomato germplasm, suggesting a minor role in methyl salicylate production in red-fruited tomato. Lastly, we found that most heirloom and modern tomato accessions carried a functional MES and a non-functional NSGT1 haplotype, ensuring acceptable levels of methyl salicylate in fruits. Yet, future selection of the functional NSGT1 allele could potentially improve flavor in the modern germplasm.
  3. Using a reliable and accurate method to phenotype disease incidence and severity is essential to unravel the complex genetic architecture of disease resistance in plants, and to develop disease-resistant cultivars. Genome-wide association studies (GWAS) involve phenotyping large numbers of accessions, and have been used for a myriad of traits. In field studies, genetic accessions are phenotyped across multiple environments and replications, which takes a significant amount of labor and resources. Deep Learning (DL) techniques can be effective for analyzing image-based tasks; thus DL methods are becoming more routine for phenotyping traits to save time and effort. This research aims to conduct GWAS on sudden death syndrome (SDS) of soybean [Glycine max (L.) Merr.] using disease severity from both visual field ratings and DL-based (using images) severity ratings collected from 473 accessions. Images were processed through a DL framework that identified soybean leaflets with SDS symptoms, and then quantified the disease severity on those leaflets into a few classes with mean Average Precision of 0.34 on unseen test data. Both visual field ratings and image-based ratings identified significant single nucleotide polymorphism (SNP) markers associated with disease resistance. These significant SNP markers are either in the proximity of previously reported candidate genes for SDS or near potentially novel candidate genes. Four previously reported SDS QTL contained significant SNPs from this study, identified with both a visual field rating and an image-based rating. The results of this study provide an exciting avenue for using DL to capture complex phenotypic traits from images, yielding results that are comparable to or more insightful than subjective visual field phenotyping of disease symptoms.
  4. Abstract Effective utilization of wild relatives is key to overcoming challenges in genetic improvement of cultivated tomato, which has a narrow genetic basis; however, current efforts to decipher high-quality genomes for tomato wild species are insufficient. Here, we report chromosome-scale tomato genomes from nine wild species and two cultivated accessions, representative of Solanum section Lycopersicon, the tomato clade. Together with two previously released genomes, we elucidate the phylogeny of Lycopersicon and construct a section-wide gene repertoire. We reveal the landscape of structural variants and provide entry to the genomic diversity among tomato wild relatives, enabling the discovery of a wild tomato gene with the potential to increase yields of modern cultivated tomatoes. Construction of a graph-based genome enables structural-variant-based genome-wide association studies, identifying numerous signals associated with tomato flavor-related traits and fruit metabolites. The tomato super-pangenome resources will expedite biological studies and breeding of this globally important crop.
  5. Abstract

    Advancements in the use of genome‐wide markers have provided unprecedented opportunities for dissecting the genetic components that control phenotypic trait variation. However, cost‐effectively characterizing agronomically important phenotypic traits on a large scale remains a bottleneck. Unmanned aerial vehicle (UAV)‐based high‐throughput phenotyping has recently become a prominent method, as it allows large numbers of plants to be analyzed in a time‐series manner. In this experiment, 233 inbred lines from the maize (Zea mays L.) diversity panel were grown in the field under different nitrogen treatments. UAV images were collected during different plant developmental stages throughout the growing season. A workflow for extracting plot‐level images, filtering images to remove nonfoliage elements, and calculating canopy coverage and greenness ratings based on vegetation indices (VIs) was developed. After applying the workflow, about 100,000 plot‐level image clips were obtained for 12 different time points. High correlations were detected between VIs and ground truth physiological and yield‐related traits. A genome‐wide association study was performed, resulting in n = 29 unique genomic regions associated with image‐extracted traits from two or more of the 12 total time points. A candidate gene, Zm00001d031997, a maize homolog of the Arabidopsis HCF244 (high chlorophyll fluorescence 244), located beneath the leading single nucleotide polymorphisms of the canopy coverage associated signals, was repeatedly detected under both nitrogen conditions. The plot‐level time‐series phenotypic data and the trait‐associated genes provide great opportunities to advance plant science and to facilitate plant breeding.
