
Title: KAT4IA: K-Means Assisted Training for Image Analysis of Field-Grown Plant Phenotypes
High-throughput phenotyping enables the efficient collection of plant trait data at scale. One example involves using imaging systems over key phases of a crop growing season. Although the resulting images provide rich data for statistical analyses of plant phenotypes, image processing for trait extraction is a prerequisite. Current methods for trait extraction are mainly based on supervised learning with human-labeled data or semi-supervised learning with a mixture of human-labeled and unlabeled data. Unfortunately, preparing a sufficiently large training data set is both time- and labor-intensive. We describe a self-supervised pipeline (KAT4IA) that uses K-means clustering on greenhouse images to construct training data for extracting and analyzing plant traits from an image-based field phenotyping system. The KAT4IA pipeline includes these main steps: self-supervised training set construction, plant segmentation from images of field-grown plants, automatic separation of target plants, calculation of plant traits, and functional curve fitting of the extracted traits. To deal with the challenge of separating target plants from noisy backgrounds in field images, we describe a novel approach using row-cuts and column-cuts on images segmented by transform domain neural network learning, which utilizes plant pixels identified from greenhouse images to train a segmentation model for field images. This approach is efficient and does not require human intervention. Our results show that KAT4IA is able to accurately extract plant pixels and estimate plant heights.
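As a toy illustration of the K-means step described above, the sketch below clusters the pixel colors of a small synthetic "greenhouse" image into two groups and labels the greener cluster as plant. The excess-green heuristic, the two-cluster assumption, and all color values are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def kmeans(pixels, k=2, iters=20, seed=0):
    """Plain NumPy K-means over pixel feature vectors (assumes k distinct colors exist)."""
    rng = np.random.default_rng(seed)
    uniq = np.unique(pixels, axis=0)
    centers = uniq[rng.choice(len(uniq), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each pixel to its nearest center, then recompute centers
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers

def plant_mask(rgb_image):
    """Cluster pixels into two groups; call the cluster with the higher
    excess-green index (2G - R - B) the 'plant' cluster."""
    h, w, _ = rgb_image.shape
    pixels = rgb_image.reshape(-1, 3).astype(float)
    labels, centers = kmeans(pixels, k=2)
    exg = 2 * centers[:, 1] - centers[:, 0] - centers[:, 2]
    plant_cluster = int(exg.argmax())
    return (labels == plant_cluster).reshape(h, w)

# toy image: a green plant patch on a brown soil background
img = np.zeros((20, 20, 3), dtype=np.uint8)
img[...] = (120, 80, 40)          # soil
img[5:15, 5:15] = (40, 180, 60)   # plant (10 x 10 patch)
mask = plant_mask(img)
print(int(mask.sum()))  # → 100
```

The resulting binary mask is the kind of automatically generated label that could then supervise a segmentation model, with no manual annotation in the loop.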
Journal Name: Plant Phenomics
Page Range or eLocation-ID: 1 to 12
Sponsoring Org: National Science Foundation
More Like this
  1. This study describes the evaluation of a range of approaches to semantic segmentation of hyperspectral images of sorghum plants, classifying each pixel as either nonplant or as belonging to one of three organ types (leaf, stalk, panicle). While many current methods for segmentation focus on separating plant pixels from background, organ-specific segmentation makes it feasible to measure a wider range of plant properties. Manually scored training data for a set of hyperspectral images collected from a sorghum association population was used to train and evaluate a set of supervised classification models. Many algorithms show acceptable accuracy for this classification task. Algorithms trained on sorghum data are able to accurately classify maize leaves and stalks, but fail to accurately classify maize reproductive organs, which are not directly equivalent to sorghum panicles. Trait measurements extracted from semantic segmentation of sorghum organs can be used both to identify genes known to control variation in previously measured phenotypes (e.g., panicle size and plant height) and to identify signals for genes controlling traits not previously quantified in this population (e.g., stalk/leaf ratio). Organ-level semantic segmentation provides opportunities to identify genes controlling variation in a wide range of morphological phenotypes in sorghum, maize, and other related grain crops.
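To make the per-pixel organ classification concrete, here is a minimal nearest-centroid sketch over a hyperspectral cube. The four-band spectra and class centroids are made-up illustrative values, and the study itself used more sophisticated supervised models; this only shows the shape of the task (each pixel's spectrum mapped to an organ label).

```python
import numpy as np

# Hypothetical per-class mean spectra over 4 bands (illustrative values only)
classes = ["background", "leaf", "stalk", "panicle"]
centroids = np.array([
    [0.9, 0.9, 0.9, 0.9],   # background
    [0.1, 0.5, 0.1, 0.7],   # leaf
    [0.3, 0.4, 0.3, 0.5],   # stalk
    [0.6, 0.5, 0.4, 0.4],   # panicle
])

def classify_pixels(cube):
    """Nearest-centroid semantic segmentation of an H x W x B hyperspectral cube."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b)
    d = np.linalg.norm(flat[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1).reshape(h, w)

cube = np.tile(centroids[1], (4, 4, 1))  # a patch of pure 'leaf' spectra
seg = classify_pixels(cube)
print(classes[seg[0, 0]])  # → leaf
```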
  2. Brassinosteroids (BRs) are a group of plant steroid hormones involved in regulating growth, development, and stress responses. Many components of the BR pathway have previously been identified and characterized. However, BR phenotyping experiments are typically performed on petri plates and/or in a low-throughput manner. Additionally, the BR pathway has extensive crosstalk with drought responses, but drought experiments are time-consuming and difficult to control. Thus, we developed Robotic Assay for Drought (RoAD) to perform BR and drought response experiments in soil-grown Arabidopsis plants. RoAD is equipped with a bench scale, a precisely controlled watering system, an RGB camera, and a laser profilometer. It performs daily weighing, watering, and imaging tasks and is capable of administering BR response assays by watering plants with Propiconazole (PCZ), a BR biosynthesis inhibitor. We developed image processing algorithms for both plant segmentation and phenotypic trait extraction in order to accurately measure traits in 2-dimensional (2D) and 3-dimensional (3D) spaces including plant surface area, leaf length, and leaf width. We then applied machine learning algorithms that utilized the extracted phenotypic parameters to identify image-derived traits that can distinguish control, drought, and PCZ-treated plants. We carried out PCZ and drought experiments on a set of BR mutants and Arabidopsis accessions with altered BR responses. Finally, we extended the RoAD assays to perform BR response assays using PCZ in Zea mays (maize) plants. This study establishes an automated and non-invasive robotic imaging system as a tool to accurately measure morphological and growth-related traits of Arabidopsis and maize plants, providing insights into the BR-mediated control of plant growth and stress responses.
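The 2D trait extraction mentioned above (leaf length and width from a segmented mask) can be sketched with a principal-axes measurement. This is a deliberate simplification under assumed units; RoAD's actual 2D/3D pipeline is more involved.

```python
import numpy as np

def leaf_length_width(mask, mm_per_px=1.0):
    """Rough 2D trait extraction from a binary leaf mask: length and width
    taken as the extents along the mask's principal axes (a simplification)."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(float)
    pts -= pts.mean(axis=0)
    # principal axes via eigenvectors of the 2x2 covariance matrix
    _, vecs = np.linalg.eigh(np.cov(pts.T))
    proj = pts @ vecs
    extents = proj.max(axis=0) - proj.min(axis=0)
    width, length = sorted(extents * mm_per_px)
    return length, width

mask = np.zeros((20, 20), dtype=bool)
mask[5:15, 8:12] = True  # a 10 x 4 pixel "leaf"
length, width = leaf_length_width(mask)
print(round(length), round(width))  # → 9 3
```

Because extents are measured between pixel centers, a 10-pixel-long region reports a length of 9; a real pipeline would correct for pixel footprint.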
  3. Cost-effective phenotyping methods are urgently needed to advance crop genetics in order to meet the food, fuel, and fiber demands of the coming decades. Concretely, characterizing plot-level traits in fields is of particular interest. Recent developments in high-resolution imaging sensors for UAS (unmanned aerial systems) focused on collecting detailed phenotypic measurements are a potential solution. We introduce canopy roughness as a new plant plot-level trait. We tested its usability for biomass estimation in soybean using optical data collected from a UAS. We validate canopy roughness on a panel of 108 soybean [Glycine max (L.) Merr.] recombinant inbred lines in a multienvironment trial during the R2 growth stage. A senseFly eBee UAS platform obtained aerial images with a senseFly S.O.D.A. compact digital camera. Using a structure from motion (SfM) technique, we reconstructed 3D point clouds of the soybean experiment. A novel pipeline for feature extraction was developed to compute canopy roughness from point clouds. We used regression analysis to correlate canopy roughness with field-measured aboveground biomass (AGB) with a leave-one-out cross-validation. Overall, our models achieved a coefficient of determination (R²) greater than 0.5 in all trials. Moreover, we found that canopy roughness has the ability to discern AGB variations among different genotypes. Our test trials demonstrate the potential of canopy roughness as a reliable trait for high-throughput phenotyping to estimate AGB. As such, canopy roughness provides practical information to breeders in order to select phenotypes on the basis of UAS data.
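One plausible way to turn a point cloud into a single roughness number is the RMS of vertical residuals after fitting a least-squares plane, sketched below. This is an illustrative proxy only; it is not necessarily the paper's definition of canopy roughness.

```python
import numpy as np

def canopy_roughness(points):
    """Illustrative roughness proxy (not the study's exact definition):
    RMS of vertical residuals after fitting a plane z = a*x + b*y + c."""
    xy1 = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    coef, *_ = np.linalg.lstsq(xy1, points[:, 2], rcond=None)
    residuals = points[:, 2] - xy1 @ coef
    return float(np.sqrt(np.mean(residuals ** 2)))

# a flat canopy should score near zero; a bumpy one should score higher
rng = np.random.default_rng(1)
xy = rng.uniform(0, 1, (200, 2))
flat = np.column_stack([xy, np.full(200, 0.5)])
bumpy = np.column_stack([xy, 0.5 + 0.2 * np.sin(20 * xy[:, 0])])
print(canopy_roughness(flat) < canopy_roughness(bumpy))  # → True
```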
  4. The need for manual and detailed annotations limits the applicability of supervised deep learning algorithms in medical image analyses, specifically in the field of pathology. Semi-supervised learning (SSL) provides an effective way of leveraging unlabeled data to relieve the heavy reliance on the amount of labeled samples when training a model. Although SSL has shown good performance, the performance of recent state-of-the-art SSL methods on pathology images is still under study. The problem of selecting the optimal data to label for SSL is also not fully explored. To tackle this challenge, we propose a semi-supervised active learning framework with a region-based selection criterion. This framework iteratively selects regions for annotation query to quickly expand the diversity and volume of the labeled set. We evaluate our framework on a grey-matter/white-matter segmentation problem using gigapixel pathology images from autopsied human brain tissues. With only 0.1% of regions labeled, our proposed algorithm can reach a competitive IoU score compared to fully supervised learning and outperform the current state-of-the-art SSL by more than 10% in IoU score and Dice coefficient.
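A region-based selection criterion for active learning could, for example, rank regions by mean prediction entropy and query the most uncertain ones first. The entropy score below is an assumed stand-in; the paper's actual criterion may differ.

```python
import numpy as np

def select_regions(region_probs, budget=2):
    """Pick the regions whose mean pixel-wise prediction entropy is highest
    (an assumed uncertainty criterion, not necessarily the paper's score)."""
    scores = []
    for probs in region_probs:  # probs: (n_pixels, n_classes) softmax outputs
        p = np.clip(probs, 1e-12, 1.0)
        entropy = -(p * np.log(p)).sum(axis=1).mean()
        scores.append(entropy)
    order = np.argsort(scores)[::-1]   # most uncertain first
    return order[:budget].tolist()

regions = [
    np.array([[0.99, 0.01]] * 10),   # confident region
    np.array([[0.50, 0.50]] * 10),   # maximally uncertain region
    np.array([[0.80, 0.20]] * 10),
]
print(select_regions(regions, budget=1))  # → [1]
```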
  5. One significant challenge in the field of supervised deep learning is the lack of large-scale labeled datasets for many problems. In this paper, we propose Consensus Spectral Clustering (CSC), which leverages the strengths of convolutional autoencoders and spectral clustering to provide pseudo labels for image data. This data can be used as weakly labeled data for training and evaluating classifiers which require supervision. The primary weakness of previous works lies in their inability to isolate the object of interest in an image and cluster similar images together. We address these issues by denoising input images to remove pixels which do not contain data pertinent to the target. Additionally, we introduce a voting method for label selection to improve the clustering results. Our extensive experimentation on several benchmark datasets demonstrates that the proposed CSC method achieves competitive performance with state-of-the-art methods.
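The voting step for label selection can be pictured as a majority vote across several clustering runs, as in the toy sketch below. It assumes cluster labels are already aligned across runs (a nontrivial step in practice) and is a stand-in for the consensus mechanism, not the full CSC pipeline.

```python
import numpy as np

def consensus_labels(label_runs):
    """Majority-vote consensus over several clustering runs.
    Assumes labels are aligned across runs; toy stand-in for CSC's voting step."""
    runs = np.asarray(label_runs)          # shape: (n_runs, n_samples)
    n_labels = runs.max() + 1
    # count votes per sample for each label, then take the most-voted label
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_labels), 0, runs
    )
    return votes.argmax(axis=0)

runs = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]
print(consensus_labels(runs).tolist())  # → [0, 0, 1, 1]
```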