Title: Deep learning-based phenotyping for genome wide association studies of sudden death syndrome in soybean
Using a reliable and accurate method to phenotype disease incidence and severity is essential to unravel the complex genetic architecture of disease resistance in plants and to develop disease-resistant cultivars. Genome-wide association studies (GWAS) involve phenotyping large numbers of accessions and have been used for a myriad of traits. In field studies, genetic accessions are phenotyped across multiple environments and replications, which takes a significant amount of labor and resources. Deep learning (DL) techniques can be effective for image-based tasks; thus, DL methods are becoming more routine for phenotyping traits to save time and effort. This research aims to conduct GWAS on sudden death syndrome (SDS) of soybean [Glycine max (L.) Merr.] using disease severity from both visual field ratings and DL-based severity ratings derived from images, collected from 473 accessions. Images were processed through a DL framework that identified soybean leaflets with SDS symptoms and then quantified the disease severity on those leaflets in a few classes, with a mean Average Precision of 0.34 on unseen test data. Both visual field ratings and image-based ratings identified significant single nucleotide polymorphism (SNP) markers associated with disease resistance. These significant SNP markers are either in the proximity of previously reported candidate genes for SDS or near potentially novel candidate genes. Four previously reported SDS QTL were identified that contained significant SNPs from this study, detected with both a visual field rating and an image-based rating. The results of this study provide an exciting avenue for using DL to capture complex phenotypic traits from images and to obtain results that are comparable to, or more insightful than, subjective visual field phenotyping of disease symptoms.
Award ID(s):
1954556
NSF-PAR ID:
10403616
Journal Name:
Frontiers in Plant Science
Volume:
13
ISSN:
1664-462X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Advancements in the use of genome‐wide markers have provided unprecedented opportunities for dissecting the genetic components that control phenotypic trait variation. However, cost‐effectively characterizing agronomically important phenotypic traits on a large scale remains a bottleneck. Unmanned aerial vehicle (UAV)‐based high‐throughput phenotyping has recently become a prominent method, as it allows large numbers of plants to be analyzed in a time‐series manner. In this experiment, 233 inbred lines from the maize (Zea mays L.) diversity panel were grown in the field under different nitrogen treatments. UAV images were collected during different plant developmental stages throughout the growing season. A workflow was developed for extracting plot‐level images, filtering images to remove nonfoliage elements, and calculating canopy coverage and greenness ratings based on vegetation indices (VIs). After applying the workflow, about 100,000 plot‐level image clips were obtained for 12 different time points. High correlations were detected between VIs and ground truth physiological and yield‐related traits. A genome‐wide association study was performed, resulting in n = 29 unique genomic regions associated with image‐extracted traits from two or more of the 12 total time points. A candidate gene, Zm00001d031997, a maize homolog of the Arabidopsis HCF244 (high chlorophyll fluorescence 244), located underneath the leading single nucleotide polymorphisms of the canopy coverage associated signals, was repeatedly detected under both nitrogen conditions. The plot‐level time‐series phenotypic data and the trait‐associated genes provide great opportunities to advance plant science and to facilitate plant breeding.

     
  2.
    Tomato (Solanum lycopersicum L.) is a widely used model plant species for dissecting the genomic bases of complex traits, which makes it an optimal platform for modern “-omics” studies and genome-guided breeding. Genome-wide association studies (GWAS) have become a preferred approach for screening large diverse populations and many traits. Here, we present a GWAS analysis of a collection of 115 landraces and 11 vintage and modern cultivars. A total of 26 conventional descriptors, 40 traits obtained by digital phenotyping, the fruit content of six carotenoids recorded at the early ripening (breaker) and red-ripe stages, and 21 climate-related variables were analyzed in the context of the genetic diversity monitored in the 126 accessions. The data obtained from thorough phenotyping and the SNP diversity revealed by sequencing ripe fruit transcripts of 120 of the tomato accessions were jointly analyzed to determine which genomic regions are implicated in the expressed phenotypic variation. This study reveals that fruit RNA-Seq SNP diversity is effective not only for identifying genomic regions that underlie variation in fruit traits, but also for identifying variation related to additional plant traits and adaptive responses to climate variation. These results allowed validation of our approach because different marker-trait associations mapped to chromosomal regions where other candidate genes for the same traits were previously reported. In addition, previously uncharacterized chromosomal regions were targeted as potentially involved in the expression of variable phenotypes, demonstrating that our tomato collection is a precious reservoir of diversity and an excellent tool for gene discovery.
  3. Abstract

    Mungbean (Vigna radiata (L.) Wilczek) is an important pulse crop, increasingly used as a source of protein, fiber, low fat, carbohydrates, minerals, and bioactive compounds in human diets. Mungbean is a dicot plant with trifoliate leaves. Leaves are the primary component of many plant functions, including photosynthesis, light interception, and canopy structure. The objectives were to investigate leaf morphological attributes, use image analysis to extract leaf morphological traits from photos of the Iowa Mungbean Diversity (IMD) panel, create a regression model to predict leaflet area, and undertake association mapping. We collected over 5000 leaf images of the IMD panel, consisting of 484 accessions, over 2 years (2020 and 2021) with two replications per experiment. Leaf traits were extracted using image analysis, analyzed, and used for association mapping. Morphological diversity included leaflet type (oval or lobed), leaflet size (small, medium, large), lobed angle (shallow, deep), and vein coloration (green, purple). A regression model was developed to predict each ovate leaflet's area (adjusted R² = 0.97; residual standard error ≤ 1.10). The candidate genes Vradi01g07560, Vradi05g01240, Vradi02g05730, and Vradi03g00440 are associated with multiple traits (length, width, perimeter, and area) across the leaflets (left, terminal, and right). These are suitable candidate genes for further investigation of their role in leaf development, growth, and function. Future studies will be needed to correlate the traits observed here with yield or other important agronomic traits for use as phenotypic or genotypic markers in marker‐aided selection methods for mungbean crop improvement.

     
  4. Abstract
Background Alzheimer’s disease (AD) is a complex neurodegenerative disorder and the most common type of dementia. AD is characterized by a decline of cognitive function and brain atrophy, and is highly heritable, with estimated heritability ranging from 60% to 80%. The most straightforward and widely used strategy to identify the genetic basis of AD is to perform a genome-wide association study (GWAS) of the case-control diagnostic status. These GWAS have identified over 50 AD-related susceptibility loci. Recently, imaging genetics has emerged as a new field in which brain imaging measures are studied as quantitative traits to detect genetic factors. Given that many imaging genetics studies did not involve the diagnostic outcome in the analysis, the identified imaging or genetic markers may not be related or specific to the disease outcome.
Results We propose a novel method to identify disease-related genetic variants enriched by imaging endophenotypes, which are the imaging traits associated with both genetic factors and disease status. Our analysis consists of three steps: (1) map the effects of a genetic variant (e.g., single nucleotide polymorphism or SNP) onto imaging traits across the brain using a linear regression model, (2) map the effects of a diagnosis phenotype onto imaging traits across the brain using a linear regression model, and (3) detect SNP-diagnosis association by correlating the SNP effects with the diagnostic effects on the brain-wide imaging traits. We demonstrate the promise of our approach by applying it to the Alzheimer’s Disease Neuroimaging Initiative database. Among 54 AD-related susceptibility loci reported in prior large-scale AD GWAS, our approach identifies 41 from a much smaller study cohort, while the standard association approaches identify only two. The proposed imaging endophenotype enriched approach can thus reveal promising AD genetic variants that are undetectable using the traditional method.
Conclusion We have proposed a novel method to identify AD genetic variants enriched by brain-wide imaging endophenotypes. This approach can not only boost detection power, but also reveal interesting biological pathways from genetic determinants to intermediate brain traits and to phenotypic AD outcomes. 
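The three-step analysis described in the Results can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes simple per-trait linear regression with no covariates, and the function name `snp_diagnosis_association` and its arguments are hypothetical. How the resulting correlation is tested for significance is not stated in the abstract and is left out.

```python
import numpy as np

def snp_diagnosis_association(snp, diagnosis, imaging):
    """Correlate SNP effects with diagnosis effects across imaging traits.

    snp:       (n,) allele counts per subject (hypothetical coding 0/1/2)
    diagnosis: (n,) diagnostic status per subject (hypothetical coding 0/1)
    imaging:   (n, p) matrix of p imaging traits per subject
    """
    def slopes(x, Y):
        # Slope of a simple linear regression of each column of Y on x.
        xc = x - x.mean()
        Yc = Y - Y.mean(axis=0)
        return xc @ Yc / (xc @ xc)

    beta_snp = slopes(snp, imaging)       # step 1: SNP effect on each trait
    beta_dx = slopes(diagnosis, imaging)  # step 2: diagnosis effect on each trait
    # Step 3: correlation between the two brain-wide effect maps.
    r = np.corrcoef(beta_snp, beta_dx)[0, 1]
    return r, beta_snp, beta_dx
```

A correlation near +1 or -1 would indicate that the SNP and the diagnosis affect the imaging traits in a similar spatial pattern; a real analysis would also adjust for covariates such as age and sex, which this sketch omits.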
  5. Abstract

    Cotton bacterial blight (CBB), caused by the pathogen Xanthomonas citri subsp. malvacearum (Xcm), can inflict significant damage to cotton (Gossypium hirsutum L.) production. Previously, we identified and mapped the broad‐spectrum CBB‐resistant locus BB‐13 on the long arm of chromosome D02 using array‐based single nucleotide polymorphisms (SNPs). In the current study, linked SNPs were converted into easily assayable Kompetitive Allele‐Specific PCR (KASP) markers to enable efficient detection and marker‐assisted selection of alleles at the BB‐13 locus. The KASP marker's efficiency in detecting the BB‐13 resistant gene was validated using an Upland cotton diversity panel of 72 accessions phenotyped with Xcm race 18. The KASP marker NCBB‐KASP4, derived from the CottonSNP63K array‐based marker i25755Gh that is closely associated with BB‐13, predicted the CBB response phenotypes with an error rate of 4.17% in the diversity panel. Additionally, two independent biparental recombinant inbred line populations segregating for resistance to Xcm race 18 were used for KASP marker validation and to test their utility in detecting the presence of the BB‐13 locus in the resistant accession CABD3CABCH‐1‐89. NCBB‐KASP4, validated across breeding populations and broad germplasm, is a reliable KASP marker for detection and testing of the BB‐13 locus in cotton. Further, the allele pattern of the diagnostic array‐based SNP marker i25755Gh and the potential CBB response are described for 875 Gossypium accessions. These SNP‐based phenotypic predictions for 875 accessions, along with disease response phenotypes to Xcm race 18 for 253 accessions, provide a reference for CBB resistance in diverse cotton germplasm in the United States.

     