Title: Artificial Intelligence and Satellite Based Remote Sensing can be used to Predict Soybean (Glycine max) Yield
Abstract: Because the manual counting of soybean (Glycine max) plants, pods, and seeds/pods is unsuitable for soybean yield predictions, alternative methods are desired. Therefore, the objective was to determine if satellite remote sensing-based artificial intelligence (AI) models could be used to predict soybean yield. In the study, multiple remote sensing-based AI models were developed for soybean growth stages ranging from VE/VC (plant emergence) to R6/R7 (full seed to beginning maturity). The ability of the Deep Neural Network (DNN), Support Vector Machine (SVM), Random Forest (RF), Least Absolute Shrinkage and Selection Operator (LASSO), and AdaBoost models to predict soybean yield, based on blue, green, red, and near-infrared reflectance data collected by the PlanetScope satellite at 6 growth stages, was determined. Remote sensing and soybean yield monitor data from 3 different fields in two years (2019 and 2021) were aggregated into 24,282 grid cells that had dimensions of 10 by 10 m. A comparison across models showed that the DNN outperformed the other models. Moreover, as crops matured from VE/VC to R4/R5, the R² value of the models increased from 0.26 to over 0.70. These findings indicate that remote sensing data collected at different growth stages can be combined for soybean yield predictions. Moreover, additional work needs to be conducted to assess the model's ability to predict soybean yield with vegetation indices (VI) data for fields not used to train the model.
Journal Name:
Agronomy Journal
Medium: X
Sponsoring Org:
National Science Foundation
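
The abstract describes two steps that can be illustrated in code: aggregating point observations into 10 by 10 m grid cells and scoring predictions with R². The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the tuple layout `(x, y, value)`, and the use of a simple per-cell mean are all hypothetical choices made for illustration, and the coordinates are assumed to be in metres in a projected system.

```python
from collections import defaultdict
from math import floor


def aggregate_to_grid(points, cell_size=10.0):
    """Average point values within square grid cells.

    points: iterable of (x, y, value) tuples, x and y in metres
    (hypothetical layout; the source does not specify one).
    Returns a dict mapping (col, row) cell indices to the mean value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        cell = (floor(x / cell_size), floor(y / cell_size))
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In such a sketch, yield monitor points and reflectance pixels would each be passed through `aggregate_to_grid` so that both share the same cell keys, and `r_squared` would then compare observed and model-predicted yield per cell. A mean is only one possible aggregation; the abstract does not state which statistic was used.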
More Like this
  1. With population growth and resource depletion, maximizing the efficiency of soybean (Glycine max [L.] Merr.) and rice (Oryza sativa L.) cropping systems is urgently needed. The goal of this study was to shed light on precise irrigation amounts and optimal agronomic practices via simulating rice–rice and soybean–rice crop rotations in the Agricultural Policy/Environmental eXtender (APEX) model. The APEX model was calibrated using observations from five fields under soybean–rice rotation in Arkansas from 2017 to 2019 and remote sensing leaf area index (LAI) values to assess modeled vegetation growth. Different irrigation practices were assessed, including conventional flooding (CVF), known as cascade, multiple inlet rice irrigation with polypipe (MIRI), and furrow irrigation (FIR). The amount of water used differed between fields, following each field’s measured or estimated input. Moreover, fields were managed with either continuous flooding (CF) or alternate wetting and drying (AWD) irrigation. Two 20-year scenarios were simulated to test yield changes: (1) between rice–rice and soybean–rice rotation and (2) under reduced irrigation amounts. After calibration with crop yield and LAI, the modeled LAI correlated to the observations with R² values greater than 0.66, and the percent bias (PBIAS) values were within 32%. The PBIAS and percent difference for modeled versus observed yield were within 2.5% for rice and 15% for soybean. Contrary to expectation, the rice–rice and soybean–rice rotation yields were not significantly different. The results of the reduced irrigation scenario differed by field, but reducing irrigation beyond 20% from the original amount input by the farmers significantly reduced yields in all fields, except for one field that was over-irrigated.
  2. Accurate, precise, and timely estimation of crop yield is key to a grower’s ability to proactively manage crop growth and predict harvest logistics. Such yield predictions typically are based on multi-parametric models and in-situ sampling. Here we investigate the extension of a greenhouse study to low-altitude unmanned aerial systems (UAS). Our principal objective was to investigate snap bean crop (Phaseolus vulgaris) yield using imaging spectroscopy (hyperspectral imaging) in the visible to near-infrared (VNIR; 400–1000 nm) region via UAS. We aimed to solve the problem of crop yield modelling by identifying spectral features explaining yield and evaluating the best time period for accurate yield prediction, early in time. We introduced a Python library, named Jostar, for spectral feature selection. Embedded in Jostar, we proposed a new ranking method for selected features that reaches an agreement between multiple optimization models. Moreover, we implemented a well-known denoising algorithm for the spectral data used in this study. This study benefited from two years of remotely sensed data, captured at multiple instances over the summers of 2019 and 2020, with 24 plots and 18 plots, respectively. Two harvest stage models, early and late harvest, were assessed at two different locations in upstate New York, USA. Six varieties of snap bean were quantified using two components of yield, pod weight and seed length. We used two different vegetation detection algorithms, the Red-Edge Normalized Difference Vegetation Index (RENDVI) and Spectral Angle Mapper (SAM), to subset the fields into vegetation vs. non-vegetation pixels. Partial least squares regression (PLSR) was used as the regression model. Among nine different optimization models embedded in Jostar, we selected the Genetic Algorithm (GA), Ant Colony Optimization (ACO), Simulated Annealing (SA), and Particle Swarm Optimization (PSO) and their resulting joint ranking.
    The findings show that pod weight can be explained with a high coefficient of determination (R² = 0.78–0.93) and low root-mean-square error (RMSE = 940–1369 kg/ha) for two years of data. Seed length yield assessment resulted in higher accuracies (R² = 0.83–0.98) and lower errors (RMSE = 4.245–6.018 mm). Among the optimization models used, ACO and SA outperformed the others, and the SAM vegetation detection approach showed improved results compared to the RENDVI approach when dense canopies were being examined. Wavelengths at 450, 500, 520, 650, 700, and 760 nm were identified in almost all data sets and harvest stage models used. The period of 44–55 days after planting (DAP) was the optimal time period for yield assessment. Future work should involve transferring the learned concepts to a multispectral system, for eventual operational use; further attention should also be paid to seed length as a ground truth data collection technique, since this yield indicator is far more rapid and straightforward.
  3. Measuring soil health indicators (SHIs), particularly soil total nitrogen (TN), is an important and challenging task that affects farmers’ decisions on the timing, placement, and quantity of fertilizers applied on farms. Most existing methods to measure SHIs are in-lab wet chemistry or spectroscopy-based methods, which require significant human input and effort, are time-consuming and costly, and are low-throughput in nature. To address this challenge, we develop an artificial intelligence (AI)-driven near real-time unmanned aerial vehicle (UAV)-based multispectral sensing solution (UMS) to estimate soil TN in an agricultural farm. TN is an important macro-nutrient or SHI that directly affects crop health. Accurate prediction of soil TN can significantly increase crop yield through informed decision making on the timing of seed planting, and fertilizer quantity and timing. The ground-truth data required to train the AI approaches is generated via laser-induced breakdown spectroscopy (LIBS), which can be readily used to characterize soil samples, providing rapid chemical analysis of the samples and their constituents (e.g., nitrogen, potassium, phosphorus, calcium). Although LIBS was previously applied for soil nutrient detection, there is no existing study on the integration of LIBS with UAV multispectral imaging and AI. We train two machine learning (ML) models, multi-layer perceptron regression and support vector regression, to predict soil nitrogen using a suite of data classes including multispectral characteristics of the soil and crops in red (R), near-infrared, and green (G) spectral bands, computed vegetation indices (NDVI), and environmental variables including air temperature and relative humidity (RH).
    To generate the ground-truth data or the training data for the machine learning models, we determine the N spectrum of the soil samples (collected from a farm) using LIBS and develop a calibration model using the correlation between actual TN of the soil samples and the maximum intensity of N spectrum. In addition, we extract the features from the multispectral images captured while the UAV follows an autonomous flight plan, at different growth stages of the crops. The ML model’s performance is tested on a fixed configuration space for the hyper-parameters using various hyper-parameter optimization techniques at three different wavelengths of the N spectrum.
  4. Crop yield is related to household food security and community resilience, especially in smallholder agricultural systems. As such, it is crucial to accurately estimate within-season yield in order to provide critical information for farm management and decision making. Therefore, the primary objective of this paper is to assess the most appropriate method, indices, and growth stage for predicting the groundnut yield in smallholder agricultural systems in northern Malawi. We have estimated the yield of groundnut in two smallholder farms using the observed yield and vegetation indices (VIs), which were derived from multitemporal PlanetScope satellite data. Simple linear, multiple linear (MLR), and random forest (RF) regressions were applied for the prediction. The leave-one-out cross-validation method was used to validate the models. The results showed that (i) of the modelling approaches, the RF model using the five most important variables (RF5) was the best approach for predicting the groundnut yield, with a coefficient of determination (R2) of 0.96 and a root mean square error (RMSE) of 0.29 kg/ha, followed by the MLR model (R2 = 0.84, RMSE = 0.84 kg/ha); in addition, (ii) the best within-season stage to accurately predict groundnut yield is during the R5/beginning seed stage. The RF5 model was used to estimate the yield for four different farms. The estimated yields were compared with the total reported yields from the farms. The results revealed that the RF5 model generally accurately estimated the groundnut yields, with the margins of error ranging between 0.85% and 11%. The errors are within the post-harvest loss margins in Malawi. The results indicate that the observed yield and VIs, which were derived from open-source remote sensing data, can be applied to estimate yield in order to facilitate farming and food security planning. 
  5. Cost-effective phenotyping methods are urgently needed to advance crop genetics in order to meet the food, fuel, and fiber demands of the coming decades. Concretely, characterizing plot-level traits in fields is of particular interest. Recent developments in high-resolution imaging sensors for UAS (unmanned aerial systems) focused on collecting detailed phenotypic measurements are a potential solution. We introduce canopy roughness as a new plant plot-level trait. We tested its usability for estimating soybean biomass from optical data collected by UAS. We validate canopy roughness on a panel of 108 soybean [Glycine max (L.) Merr.] recombinant inbred lines in a multienvironment trial during the R2 growth stage. A senseFly eBee UAS platform obtained aerial images with a senseFly S.O.D.A. compact digital camera. Using a structure from motion (SfM) technique, we reconstructed 3D point clouds of the soybean experiment. A novel pipeline for feature extraction was developed to compute canopy roughness from point clouds. We used regression analysis to correlate canopy roughness with field-measured aboveground biomass (AGB) with a leave-one-out cross-validation. Overall, our models achieved a coefficient of determination (R²) greater than 0.5 in all trials. Moreover, we found that canopy roughness has the ability to discern AGB variations among different genotypes. Our test trials demonstrate the potential of canopy roughness as a reliable trait for high-throughput phenotyping to estimate AGB. As such, canopy roughness provides practical information to breeders in order to select phenotypes on the basis of UAS data.