Title: Applying Machine Learning to Cropland Data Layer for Agro-Geoinformation Discovery
The Cropland Data Layer (CDL) is currently the only subfield-level, high-resolution, crop-specific land cover data product covering the entire conterminous United States (CONUS). It has been widely used in the agricultural industry, business decision support, research, and education worldwide. However, CDL data has limitations. It is an end-of-season land cover map that is not available within the growing season. Moreover, CDLs from early years contain many misclassified pixels (relatively low accuracy) due to cloud cover and a lack of satellite images. This paper presents studies that use machine learning techniques to address these issues in CDL data. Specifically, we present the design and implementation of a machine learning model for agro-geoinformation discovery from CDL. Several application scenarios of the proposed model, including prediction of crop cover, crop acreage estimation, in-season crop mapping, and refinement of the early-year CDL data, are demonstrated and discussed.
Award ID(s):
1739705
PAR ID:
10376793
Journal Name:
2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS
Page Range / eLocation ID:
1149 to 1152
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
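
The crop acreage estimation scenario named in the abstract above can be illustrated with the simplest possible estimator: counting pixels per class and multiplying by the pixel area. This is a minimal sketch and not the paper's model. It assumes a 30 m pixel (the CDL resolution cited elsewhere on this page) and the CDL legend codes 1 for corn and 5 for soybeans; the function name `acreage_by_class` and the toy raster are hypothetical.

```python
import numpy as np

# Assumption: 30 m x 30 m pixels, as stated for CDL elsewhere on this page.
PIXEL_AREA_M2 = 30.0 * 30.0
M2_PER_ACRE = 4046.8564224  # square metres in one international acre


def acreage_by_class(cdl, class_names):
    """Return {crop name: acres} by counting the pixels of each class code.

    cdl         -- 2-D integer array of land cover codes
    class_names -- dict mapping the codes of interest to readable names
    """
    codes, counts = np.unique(cdl, return_counts=True)
    return {
        class_names[int(code)]: float(count) * PIXEL_AREA_M2 / M2_PER_ACRE
        for code, count in zip(codes, counts)
        if int(code) in class_names  # ignore background and other classes
    }


# Toy 4x4 raster (hypothetical data): 1 = corn, 5 = soybeans, 0 = background.
toy = np.array([
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [1, 5, 5, 0],
    [1, 1, 0, 0],
])
acres = acreage_by_class(toy, {1: "corn", 5: "soybeans"})
print(acres)
```

Pixel counting inherits every misclassification in the map, so it is best read as a first approximation rather than a statistically adjusted estimate.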
More Like this
  1. Generating timely crop cover maps over large geographic areas remains a challenge because reliable ground truth is lacking early in the growing season. This paper introduces an efficient method to extract “trusted pixels” from historical Cropland Data Layer (CDL) data using crop rotation patterns; these pixels can replace actual ground truth in crop mapping and other agricultural applications. A case study in the state of Nebraska, USA, is demonstrated. The common crop rotation patterns of four major crop types (corn, soybeans, winter wheat, and alfalfa) are compared and analyzed. The experimental results show that a considerable number of CDL pixels followed a consistent crop sequence during the past decade. Each observed crop type has at least one reliable crop rotation pattern. Based on the reliable crop rotation patterns, a large proportion of pixels can be correctly mapped a year ahead of the release of the current-year CDL product. These trusted pixels can potentially be used to label training samples for crop type classification early in the growing season.
  2. Crop type information at the field level is vital for many types of research and applications. The United States Department of Agriculture (USDA) provides information on crop types for US cropland as the Cropland Data Layer (CDL). However, CDL is only available at the end of the year, after the crop growing season. Therefore, CDL is unable to support in-season research and decision-making regarding crop loss estimation, yield estimation, and grain pricing. The USDA mostly relies on field surveys and farmers’ reports for the ground truth used to train image classification models, which is one of the major reasons for the delayed release of CDL. This research aims to use trusted pixels as ground truth to train classification models. Trusted pixels are pixels that follow a specific crop rotation pattern. These trusted pixels are used to train image classification models for the classification of in-season Landsat images to identify major crop types. Six different classification algorithms are investigated and tested to select the best algorithm for this study. The Random Forest algorithm stands out among the selected algorithms. This study classified Landsat scenes between May and mid-August for Iowa. The overall agreements of the classification results with CDL in 2017 are 84%, 94%, and 96% for May, June, and July, respectively. The classification accuracies have been assessed with 683 ground truth data points collected in the field. The overall accuracies of single-date multi-band image classification are 84%, 89%, and 92% for May, June, and July, respectively. The results also show that higher accuracy (94–95%) can be achieved through multi-date image classification than through single-date image classification.
  3. Mapping crop types and land cover in smallholder farming systems in sub-Saharan Africa remains a challenge due to data costs, high cloud cover, and poor temporal resolution of satellite data. With improvement in satellite technology and image processing techniques, there is a potential for integrating data from sensors with different spectral characteristics and temporal resolutions to effectively map crop types and land cover. In our Malawi study area, it is common that there are no cloud-free images available for the entire crop growth season. The goal of this experiment is to produce detailed crop type and land cover maps in agricultural landscapes using the Sentinel-1 (S-1) radar data, Sentinel-2 (S-2) optical data, S-2 and PlanetScope data fusion, and S-1 C2 matrix and S-1 H/α polarimetric decomposition. We evaluated the ability to combine these data to map crop types and land cover in two smallholder farming locations. The random forest algorithm, trained with crop and land cover type data collected in the field, complemented with samples digitized from Google Earth Pro and DigitalGlobe, was used for the classification experiments. The results show that the S-2 and PlanetScope fused image + S-1 covariance (C2) matrix + H/α polarimetric decomposition (an entropy-based decomposition method) fusion outperformed all other image combinations, producing higher overall accuracies (OAs) (>85%) and Kappa coefficients (>0.80). These OAs represent a 13.53% and 11.7% improvement on the Sentinel-2-only (OAs < 80%) experiment for Thimalala and Edundu, respectively. The experiment also provided accurate insights into the distribution of crop and land cover types in the area. The findings suggest that in cloud-dense and resource-poor locations, fusing high temporal resolution radar data with available optical data presents an opportunity for operational mapping of crop types and land cover to support food security and environmental management decision-making. 
  4. The National Agricultural Statistics Service, the statistical arm of the US Department of Agriculture, and the Multi-Resolution Land Characteristics Consortium, a group of US federal agencies, collect and publish several land-use and land-cover data sets. The aim of this study is to analyze the consistency of forestland estimates based on two widely used, publicly available products: the National Land-Cover Database (NLCD) and the Cropland Data Layer (CDL). Both remote-sensing-based products provide raster-formatted land-cover categorization at a spatial resolution of 30 m. Although the processing of the yearly published CDL non-agricultural land-cover data is based on the less frequently updated NLCD, the consistency of large-area forestland mapping between these two datasets has not been assessed. To assess the similarities and differences between CDL- and NLCD-based forestland mappings for the state of North Carolina, we overlay the two data products for the years 2011 and 2016 in ArcMap 10.5.1 and analyze the location and attributes of the matched and mismatched forestland. We find that the mismatch is relatively smaller in areas of the state where forests occupy larger shares of the total land, and that the relative mismatch is smaller in 2011 than in 2016. We also find that a large portion of the forestland mismatch is attributable to the dynamics of regrowth of periodically harvested and otherwise disturbed forests. Our results underscore the need for a holistic approach to data preparation, data attribution, and data accuracy when performing high-scale map-based analyses using each of these products.
  5. Abstract. Agriculture plays a major role in eradicating poverty, promoting prosperity, and nourishing a projected 10 billion people by 2050 globally. In a changing climate, achieving optimal agricultural yields requires a deeper understanding of available natural resources and crops. This is especially important for places like the Navajo Nation, which faces significant challenges in food supply chain management due to factors such as water demand, water quality, and insufficient information about land fertility and crop timings and seasons. The Navajo Nation is the largest Native American reservation in the U.S. It covers 27,425 square miles across Arizona, Utah, and New Mexico and has a population of 165,158 people, according to the 2020 census. Agriculture has been a key part of life in the Navajo Nation since the late 19th and early 20th centuries, playing a big role in the region’s development and stability. However, the lack of knowledge about decisions and actions during the crop growing season has resulted in lower crop productivity, as evidenced by the USDA statistical reports for the Navajo Nation in 2012 and 2017. To support farmers with better decision-making and actionable insights, high-resolution, open-source Sentinel-2 satellite images are being used to develop advanced crop mapping techniques for identifying the spatial extent of various agricultural crops in the Navajo Nation. To this end, a collection of research papers was reviewed, leading to the development of a new methodology for analysing Sentinel-2 data from the 2017 and 2023 growing seasons within the Navajo Nation. The collected data was pre-processed by creating monthly median composites of surface reflectance to reduce noise and improve the accuracy of the results.
After preprocessing, spectral indices were calculated from the spectral bands, including NDVI (Normalized Difference Vegetation Index), EVI (Enhanced Vegetation Index), GCVI (Green Chlorophyll Vegetation Index), and LSWI (Land Surface Water Index), to differentiate the crops more precisely. The training datasets were obtained from the USDA’s Cropland Data Layer (CDL) and split into 80% for training and 20% for validating the Random Forest supervised classification algorithm. The classification resulted in an accuracy of 80%. Finally, the accuracy of the results was compared with independent ground truth data. This research identifies notable discrepancies between the CDL data and the Navajo Nation agricultural census statistical report, particularly in estimating corn acreage for the Chinle and Fort Defiance agencies. Ultimately, the information from this approach is used to provide actionable insights to Navajo Nation farmers.
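
The four spectral indices named in this abstract have widely used band-ratio definitions, sketched below. This is an illustration under stated assumptions, not the study's code: inputs are taken to be surface reflectance scaled to 0–1, the EVI coefficients are the commonly published ones (2.5, 6, 7.5, 1), and the Sentinel-2 band mapping in the comments (B2, B3, B4, B8, B11) is an assumed choice. The function name `spectral_indices` and the sample values are hypothetical.

```python
import numpy as np


def spectral_indices(blue, green, red, nir, swir):
    """Compute NDVI, EVI, GCVI and LSWI from reflectance arrays (0-1 scale).

    Assumed Sentinel-2 mapping: blue=B2, green=B3, red=B4, nir=B8, swir=B11.
    Zero denominators are not handled in this sketch.
    """
    blue, green, red, nir, swir = (
        np.asarray(b, dtype=float) for b in (blue, green, red, nir, swir)
    )
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    gcvi = nir / green - 1.0
    lswi = (nir - swir) / (nir + swir)
    return {"NDVI": ndvi, "EVI": evi, "GCVI": gcvi, "LSWI": lswi}


# One hypothetical vegetated pixel.
idx = spectral_indices(blue=0.04, green=0.08, red=0.05, nir=0.45, swir=0.20)
print({name: round(float(value), 4) for name, value in idx.items()})
```

In a workflow like the one described, each index would be computed per monthly composite and stacked as input features for the classifier.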
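
The "trusted pixel" idea described in items 1 and 2 above can be sketched as a check of each pixel's CDL history against a rotation pattern. The version below is a simplified illustration, not the authors' method: it only tests a strict two-crop alternation (corn and soybeans, CDL legend codes 1 and 5) across every year in the stack, and the function name `trusted_pixels` and the toy history are hypothetical.

```python
import numpy as np

CORN, SOY = 1, 5  # CDL legend codes for corn and soybeans


def trusted_pixels(history, crop_a=CORN, crop_b=SOY):
    """Find pixels that strictly alternate between two crops.

    history -- integer array of shape (years, rows, cols), oldest year first
    Returns (mask, predicted): mask is True where the pixel alternated in
    every year; predicted holds the crop expected next year (0 elsewhere).
    """
    years = history.shape[0]
    even_year = (np.arange(years) % 2) == 0
    # The two possible alternating sequences: a,b,a,... and b,a,b,...
    seq_a = np.where(even_year, crop_a, crop_b)[:, None, None]
    seq_b = np.where(even_year, crop_b, crop_a)[:, None, None]
    mask = np.all(history == seq_a, axis=0) | np.all(history == seq_b, axis=0)

    last = history[-1]
    predicted = np.zeros_like(last)
    predicted[mask & (last == crop_a)] = crop_b  # rotation flips the crop
    predicted[mask & (last == crop_b)] = crop_a
    return mask, predicted


# Four years of a 1x3 toy raster (hypothetical data).
history = np.array([
    [[1, 5, 1]],
    [[5, 1, 1]],
    [[1, 5, 5]],
    [[5, 1, 1]],
])
mask, predicted = trusted_pixels(history)
print(mask, predicted)
```

Pixels flagged by the mask could then serve as labels for training an in-season classifier, which is the use the abstracts describe; longer or multi-crop rotation patterns would need a more general matcher.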