Title: Soybean maturity prediction using two‐dimensional contour plots from drone‐based time series imagery
Abstract Plant breeding programs need to assess days to maturity in order to select entries accurately and place them in appropriate tests. In the early stages of the breeding pipeline, soybean [Glycine max (L.) Merr.] breeding programs assign relative maturity ratings to experimental varieties to indicate their suitable maturity zones. Traditionally, breeders have estimated maturity ratings by manually inspecting fields and assessing maturity visually. This approach relies heavily on expert judgment, making it subjective and demanding considerable time and effort. This study aimed to develop a machine learning (ML) model for evaluating soybean maturity using uncrewed aerial system (UAS)–based time series imagery. Images were captured at 3‐day intervals, beginning when the earliest varieties started maturing and continuing until the last varieties fully matured. The dataset comprised 22,043 plots collected across 3 years, representing relative maturity groups 1.6–3.9. We used contour plot images extracted from the time series UAS imagery as input for a neural network model. This contour plot approach encoded the temporal and spatial variation within each plot into a single image. A deep learning model was trained on these contour plots to predict maturity ratings. The model demonstrated a significant improvement in accuracy and robustness, achieving up to 85% accuracy. The predictive model offers a scalable, objective, and efficient means of assessing crop maturity, enabling phenomics and ML approaches to reduce reliance on manual inspection and subjective assessment, thereby saving time and resources in a breeding program.
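The idea of encoding temporal and spatial variation within a plot into a single image can be pictured with a minimal sketch. The abstract does not say how the contour plots are built, so everything below is an assumption for illustration only: the per-flight "greenness" maps, the averaging across plot width, the threshold levels, and the function names (`spatial_profile`, `time_space_grid`, `contour_bands`) are all hypothetical and are not the authors' pipeline.

```python
from statistics import mean


def spatial_profile(greenness):
    """Collapse one flight's 2D greenness map to a 1D profile.

    Assumption: each inner list holds the pixels across the plot width
    at one position along the plot length.
    """
    return [mean(row) for row in greenness]


def time_space_grid(flights):
    """Stack one profile per flight so that grid[t][p] is the mean
    greenness at flight t and position p along the plot."""
    profiles = [spatial_profile(g) for g in flights]
    if len({len(p) for p in profiles}) != 1:
        raise ValueError("all flights must cover the same number of positions")
    return profiles


def contour_bands(grid, levels):
    """Quantize the grid into filled-contour bands.

    Each cell becomes the count of thresholds it meets or exceeds,
    so band 0 is below every level (hypothetically, fully senesced).
    """
    return [[sum(v >= lv for lv in levels) for v in row] for row in grid]


# Synthetic toy data: three flights (e.g. 3 days apart) over a plot with
# two positions along its length and two pixels across its width.
flights = [
    [[1.0, 0.75], [0.75, 0.75]],
    [[0.5, 0.75], [0.5, 0.5]],
    [[0.0, 0.25], [0.25, 0.25]],
]
grid = time_space_grid(flights)
bands = contour_bands(grid, levels=[0.25, 0.5, 0.75])
for row in bands:
    print(row)
```

In this toy layout the vertical axis is time and the horizontal axis is position within the plot, so a plot that matures earlier would show low bands appearing in earlier rows. A real pipeline would render such a grid as an image before passing it to a neural network.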
Award ID(s):
1954556
PAR ID:
10673221
Publisher / Repository:
The Plant Phenome Journal
Journal Name:
The Plant Phenome Journal
Volume:
8
Issue:
1
ISSN:
2578-2703
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Reliable seed yield estimation is an indispensable step in plant breeding programs geared towards cultivar development in major row crops. The objective of this study is to develop a machine learning (ML) approach to soybean (Glycine max (L.) Merr.) pod counting that enables genotype seed yield rank prediction from in-field video data collected by a ground robot. To meet this goal, we developed a multiview image-based yield estimation framework utilizing deep learning architectures. Plant images captured from different angles were fused to estimate the yield and subsequently to rank soybean genotypes for application in breeding decisions. We used data from a controlled imaging environment in the field, as well as from plant breeding test plots in the field, to demonstrate the efficacy of our framework by comparing its performance with manual pod counting and yield estimation. Our results demonstrate the promise of ML models in making breeding decisions with a significant reduction in time and human effort, opening new avenues for breeding methods to develop cultivars.
  2. Abstract Developments in genomics and phenomics have provided valuable tools for use in cultivar development. Genomic prediction (GP) has been used in commercial soybean [Glycine max (L.) Merr.] breeding programs to predict grain yield and seed composition traits. Phenomic prediction (PP) is a rapidly developing field that holds the potential to be used for the selection of genotypes early in the growing season. The objectives of this study were to compare the performance of GP and PP for predicting soybean seed yield, protein, and oil. We additionally conducted genome‐wide association studies (GWAS) to identify significant single‐nucleotide polymorphisms (SNPs) associated with the traits of interest. The GWAS panel of 292 diverse accessions was grown in six environments in replicated trials. Spectral data were collected at two time points during the growing season. A genomic best linear unbiased prediction (GBLUP) model was trained on 269 accessions, while three separate machine learning (ML) models were trained on vegetation indices (VIs) and canopy traits. We observed that PP had a higher correlation coefficient than GP for seed yield, while GP had higher correlation coefficients for seed protein and oil contents. VIs with high feature importance were used as covariates in a new GBLUP model, and a new random forest model was trained with the inclusion of selected SNPs. These models did not outperform the original GP and PP models. These results show the capability of using ML for in‐season predictions for specific traits in soybean breeding and provide insights on PP and GP inclusions in breeding programs.
  3. Accurate in-season prediction of seed yield and seed composition traits such as oil and protein is useful for gaining accuracy and efficiency in soybean breeding. These predictions can also inform farmers, enabling them to improve their field management practices and guiding their market decisions. We report a Transformer-based deep learning framework built on 30 years of multi-environment performance data from the Northern and Southern Uniform Soybean Tests (UST) across North America. Unlike earlier studies on seed yield, oil, and protein prediction that focus on limited years, regions, or single modalities, we utilized a comprehensive dataset that includes weather, genotype, and management factors, ensuring a more holistic approach to soybean yield, oil, and protein prediction. Our model integrates multivariate time-series weather data with genotypic relationship information, maturity group, and geographic location to predict variety performance in diverse environments. Our model captures complex temporal patterns associated with trait variability, showing high predictive accuracy (R²) of 77.6 ± 0.2%, 63.9 ± 4.7%, and 79.3 ± 2.3% for seed yield, oil, and protein, respectively. Additionally, for seed yield, we evaluated multiple interpretability methods to assess feature importance for predictor variables and critical growing timepoints; solar radiation and temperature were noted as the key predictors. Overall, these results demonstrate the usefulness of a Transformer-based model in trait predictions, and the utility of large cooperative datasets from breeding programs.
  4. Cost-effective phenotyping methods are urgently needed to advance crop genetics in order to meet the food, fuel, and fiber demands of the coming decades. Concretely, characterizing plot-level traits in fields is of particular interest. Recent developments in high-resolution imaging sensors for UAS (unmanned aerial systems) focused on collecting detailed phenotypic measurements are a potential solution. We introduce canopy roughness as a new plant plot-level trait. We tested its usability for estimating soybean biomass from optical data collected by UAS. We validate canopy roughness on a panel of 108 soybean [Glycine max (L.) Merr.] recombinant inbred lines in a multienvironment trial during the R2 growth stage. A senseFly eBee UAS platform obtained aerial images with a senseFly S.O.D.A. compact digital camera. Using a structure from motion (SfM) technique, we reconstructed 3D point clouds of the soybean experiment. A novel pipeline for feature extraction was developed to compute canopy roughness from point clouds. We used regression analysis to correlate canopy roughness with field-measured aboveground biomass (AGB) with a leave-one-out cross-validation. Overall, our models achieved a coefficient of determination (R²) greater than 0.5 in all trials. Moreover, we found that canopy roughness has the ability to discern AGB variations among different genotypes. Our test trials demonstrate the potential of canopy roughness as a reliable trait for high-throughput phenotyping to estimate AGB. As such, canopy roughness provides practical information to breeders in order to select phenotypes on the basis of UAS data.
  5. We present a novel method for soybean [Glycine max (L.) Merr.] yield estimation leveraging high-throughput seed counting via computer vision and deep learning techniques. Traditional methods for collecting yield data are labor-intensive, costly, prone to equipment failures at critical data collection times, and require transportation of equipment across field sites. Computer vision, the field of teaching computers to interpret visual data, allows us to extract detailed yield information directly from images. By treating yield estimation as a computer vision task, we report a more efficient alternative, employing a ground robot equipped with fisheye cameras to capture comprehensive videos of soybean plots from which images are extracted in a variety of development programs. These images are processed through the P2PNet-Yield model, a deep learning framework in which we combined a feature extraction module (the backbone of the P2PNet-Soy) and a yield regression module to estimate seed yields of soybean plots. Our results are built on 2 years of yield testing plot data: 8,500 plots in 2021 and 650 plots in 2023. With these datasets, our approach incorporates several innovations to further improve the accuracy and generalizability of the seed counting and yield estimation architecture, such as fisheye image correction and data augmentation with random sensor effects. The P2PNet-Yield model achieved a genotype ranking accuracy score of up to 83%. It demonstrates up to a 32% reduction in the time needed to collect yield data, as well as in the costs associated with traditional yield estimation, offering a scalable solution for breeding programs and agricultural productivity enhancement.
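Item 4 above reports regression of aboveground biomass on canopy roughness evaluated with leave-one-out cross-validation. As a rough illustration of that evaluation scheme only, the sketch below fits a simple least-squares line on all plots but one, predicts the held-out plot, and computes R² from the held-out predictions. The single-predictor linear model, the function names (`fit_line`, `loocv_r2`), and the numbers are assumptions and synthetic toy values, not the study's model or data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_r2(xs, ys):
    """R² computed from leave-one-out predictions.

    Each observation is predicted by a line fitted to all the others,
    so the score reflects out-of-sample error rather than fit quality.
    """
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic values standing in for canopy roughness (x) and biomass (y).
roughness = [1.0, 2.0, 3.0, 4.0, 5.0]
biomass = [1.0, 2.5, 2.0, 4.5, 4.0]
print(round(loocv_r2(roughness, biomass), 3))
```

Because every prediction comes from a model that never saw the held-out plot, a cross-validated R² is typically lower than the in-sample R² and can even be negative for a poor predictor.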