This content will become publicly available on October 21, 2023

Title: Deep learning-based phenotyping for genome wide association studies of sudden death syndrome in soybean
Using a reliable and accurate method to phenotype disease incidence and severity is essential to unravel the complex genetic architecture of disease resistance in plants, and to develop disease-resistant cultivars. Genome-wide association studies (GWAS) involve phenotyping large numbers of accessions, and have been used for a myriad of traits. In field studies, genetic accessions are phenotyped across multiple environments and replications, which takes a significant amount of labor and resources. Deep learning (DL) techniques can be effective for analyzing image-based tasks; thus, DL methods are becoming more routine for phenotyping traits to save time and effort. This research aims to conduct GWAS on sudden death syndrome (SDS) of soybean [Glycine max L. (Merr.)] using disease severity from both visual field ratings and DL-based (image-derived) severity ratings collected from 473 accessions. Images were processed through a DL framework that identified soybean leaflets with SDS symptoms, and then quantified the disease severity on those leaflets into a few classes with a mean Average Precision of 0.34 on unseen test data. Both visual field ratings and image-based ratings identified significant single nucleotide polymorphism (SNP) markers associated with disease resistance. These significant SNP markers are either in the proximity of previously reported candidate genes for SDS or near potentially novel candidate genes. Four previously reported SDS QTL were identified that contained significant SNPs from this study, detected with both a visual field rating and an image-based rating. The results of this study provide an exciting avenue for using DL to capture complex phenotypic traits from images and obtain results that are comparable to, or more insightful than, those from subjective visual field phenotyping of disease symptoms.
Journal Name: Frontiers in Plant Science
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Advancements in the use of genome‐wide markers have provided unprecedented opportunities for dissecting the genetic components that control phenotypic trait variation. However, cost‐effectively characterizing agronomically important phenotypic traits on a large scale remains a bottleneck. Unmanned aerial vehicle (UAV)‐based high‐throughput phenotyping has recently become a prominent method, as it allows large numbers of plants to be analyzed in a time‐series manner. In this experiment, 233 inbred lines from the maize (Zea mays L.) diversity panel were grown in the field under different nitrogen treatments. UAV images were collected during different plant developmental stages throughout the growing season. A workflow for extracting plot‐level images, filtering images to remove nonfoliage elements, and calculating canopy coverage and greenness ratings based on vegetation indices (VIs) was developed. After applying the workflow, about 100,000 plot‐level image clips were obtained for 12 different time points. High correlations were detected between VIs and ground truth physiological and yield‐related traits. The genome‐wide association study was performed, resulting in n = 29 unique genomic regions associated with image‐extracted traits from two or more of the 12 total time points. A candidate gene, Zm00001d031997, a maize homolog of the Arabidopsis HCF244 (high chlorophyll fluorescence 244), located underneath the leading single nucleotide polymorphisms of the canopy coverage associated signals, was repeatedly detected under both nitrogen conditions. The plot‐level time‐series phenotypic data and the trait‐associated genes provide great opportunities to advance plant science and to facilitate plant breeding.
  2. Tomato (Solanum lycopersicum L.) is a widely used model plant species for dissecting the genomic bases of complex traits, and thus provides an optimal platform for modern “-omics” studies and genome-guided breeding. Genome-wide association studies (GWAS) have become a preferred approach for screening large diverse populations and many traits. Here, we present GWAS analysis of a collection of 115 landraces and 11 vintage and modern cultivars. A total of 26 conventional descriptors, 40 traits obtained by digital phenotyping, the fruit content of six carotenoids recorded at the early ripening (breaker) and red-ripe stages, and 21 climate-related variables were analyzed in the context of genetic diversity monitored in the 126 accessions. The data obtained from thorough phenotyping and the SNP diversity revealed by sequencing of ripe fruit transcripts of 120 of the tomato accessions were jointly analyzed to determine which genomic regions are implicated in the expressed phenotypic variation. This study reveals that the use of fruit RNA-Seq SNP diversity is effective not only for identification of genomic regions that underlie variation in fruit traits, but also of variation related to additional plant traits and adaptive responses to climate variation. These results allowed validation of our approach because different marker-trait associations mapped on chromosomal regions where other candidate genes for the same traits were previously reported. In addition, previously uncharacterized chromosomal regions were targeted as potentially involved in the expression of variable phenotypes, thus demonstrating that our tomato collection is a precious reservoir of diversity and an excellent tool for gene discovery.
  3. Abstract

    Background: Alzheimer’s disease (AD) is a complex neurodegenerative disorder and the most common type of dementia. AD is characterized by a decline of cognitive function and brain atrophy, and is highly heritable, with estimated heritability ranging from 60 to 80%. The most straightforward and widely used strategy to identify the genetic basis of AD is to perform a genome-wide association study (GWAS) of the case-control diagnostic status. Such GWAS have identified over 50 AD-related susceptibility loci. Recently, imaging genetics has emerged as a new field where brain imaging measures are studied as quantitative traits to detect genetic factors. Given that many imaging genetics studies did not involve the diagnostic outcome in the analysis, the identified imaging or genetic markers may not be related or specific to the disease outcome. Results: We propose a novel method to identify disease-related genetic variants enriched by imaging endophenotypes, which are the imaging traits associated with both genetic factors and disease status. Our analysis consists of three steps: (1) map the effects of a genetic variant (e.g., single nucleotide polymorphism or SNP) onto imaging traits across the brain using a linear regression model, (2) map the effects of a diagnosis phenotype onto imaging traits across the brain using a linear regression model, and (3) detect SNP-diagnosis association via correlating the SNP effects with the diagnostic effects on the brain-wide imaging traits. We demonstrate the promise of our approach by applying it to the Alzheimer’s Disease Neuroimaging Initiative database. Among 54 AD-related susceptibility loci reported in prior large-scale AD GWAS, our approach identifies 41 of those from a much smaller study cohort, while the standard association approaches identify only two of those. Clearly, the proposed imaging endophenotype enriched approach can reveal promising AD genetic variants undetectable using the traditional method. Conclusion: We have proposed a novel method to identify AD genetic variants enriched by brain-wide imaging endophenotypes. This approach can not only boost detection power, but also reveal interesting biological pathways from genetic determinants to intermediate brain traits and to phenotypic AD outcomes.
  4. Abstract

    Cotton bacterial blight (CBB), caused by the pathogen Xanthomonas citri subsp. malvacearum (Xcm), can inflict significant damage to cotton (Gossypium hirsutum L.) production. Previously, we identified and mapped the broad‐spectrum CBB‐resistant locus BB‐13 on the long arm of chromosome D02 using array‐based single nucleotide polymorphisms (SNPs). In the current study, linked SNPs were converted into easily assayable Kompetitive Allele‐Specific PCR (KASP) markers to enable efficient detection and marker‐assisted selection of alleles at the BB‐13 locus. The KASP marker's efficiency in detecting the BB‐13 resistant gene was validated using an Upland cotton diversity panel of 72 accessions phenotyped with Xcm race 18. The KASP marker NCBB‐KASP4, derived from the CottonSNP63K array‐based marker i25755Gh that is closely associated with BB‐13, predicted the CBB response phenotypes with an error rate of 4.17% in the diversity panel. Additionally, two independent biparental recombinant inbred line populations segregating for resistance to Xcm race 18 were used for KASP marker validation and to test their utility in detecting the presence of the BB‐13 locus in the resistant accession CABD3CABCH‐1‐89. NCBB‐KASP4, validated across breeding populations and broad germplasm, is a reliable KASP marker for detection and testing of the BB‐13 locus in cotton. Further, the allele pattern of the diagnostic array‐based SNP marker i25755Gh and the potential CBB response are described for 875 Gossypium accessions. These SNP‐based phenotypic predictions for 875 accessions, along with disease response phenotypes to Xcm race 18 for 253 accessions, provide a reference for CBB resistance in diverse cotton germplasm in the United States.
  5. Abstract

    Heritability is well documented for psychiatric disorders and cognitive abilities which are, however, complex, involving both genetic and environmental factors. Hence, it remains challenging to discover which genetic variations contribute to such complex traits, and how. In this article, we propose to use mediation analysis to bridge this gap, where neuroimaging phenotypes were utilized as intermediate variables. The Philadelphia Neurodevelopmental Cohort was investigated using genome‐wide association studies (GWAS) and mediation analyses. Specifically, 951 participants were included, with age ranging from 8 to 21 years. Two hundred and four neuroimaging measures were extracted from structural magnetic resonance imaging scans. GWAS were conducted for each measure to evaluate the SNP‐based heritability. Furthermore, mediation analyses were employed to understand the mechanisms by which genetic variants influence pathological behaviors implicitly through neuroimaging phenotypes, and identified SNPs that would not be detected otherwise. Our analyses found that rs10494561, located in the intron region within NMNAT2, was associated with the severity of the prodromal symptoms of psychosis implicitly, mediated through the volume of the left hemisphere of the superior frontal region. The gene NMNAT2 is known to be associated with brainstem degeneration, and to produce a cytoplasmic enzyme which is mainly expressed in the brain. Another SNP, rs2285351, was found in the intron region of the gene IFT122, which may be potentially associated with human spatial orientation ability through the area of the left hemisphere of the isthmus cingulate region. Hum Brain Mapp 38:4088–4097, 2017. ©2017 Wiley Periodicals, Inc.
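The three-step procedure summarized in the third related record above (the imaging-endophenotype method for Alzheimer's disease) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `ols_slope`, `pearson`, and `snp_diagnosis_score` are hypothetical, the data are made up, and the sketch assumes simple one-predictor regressions with no covariates and no significance testing, which a real analysis would presumably need.

```python
from math import sqrt


def ols_slope(x, y):
    """Slope of a simple linear regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def snp_diagnosis_score(snp, diagnosis, traits):
    """Hypothetical helper following the three steps in the abstract.

    snp       : genotype dosage per subject (0, 1, or 2)
    diagnosis : diagnostic status per subject (0 = control, 1 = case)
    traits    : list of imaging traits, each a list with one value per subject
    """
    # Step 1: effect of the SNP on each imaging trait.
    snp_effects = [ols_slope(snp, trait) for trait in traits]
    # Step 2: effect of the diagnosis on each imaging trait.
    dx_effects = [ols_slope(diagnosis, trait) for trait in traits]
    # Step 3: correlate the two effect maps across traits.
    return pearson(snp_effects, dx_effects)


# Made-up toy data: six subjects, four imaging traits driven only by diagnosis.
snp = [0, 1, 2, 0, 1, 2]
diagnosis = [0, 0, 1, 0, 1, 1]
weights = [0.2, -1.0, 0.7, 1.5]  # assumed per-trait diagnosis effects
traits = [[w * d for d in diagnosis] for w in weights]

score = snp_diagnosis_score(snp, diagnosis, traits)
print(f"SNP-diagnosis effect correlation: {score:.3f}")
```

In this noiseless toy case the SNP dosage tracks diagnosis, so the SNP effect map is a scaled copy of the diagnosis effect map and the correlation is high. With real data the two maps would be noisy, and the strength of the correlation would have to be assessed statistically.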