skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on December 1, 2025

Title: Regional-Scale Detection of Palms Using VHR Satellite Imagery and Deep Learning in the Guyanese Rainforest
Arecaceae (palms) play a crucial role for native communities and wildlife in the Amazon region. This study presents a first-of-its-kind regional-scale spatial cataloging of palms using remotely sensed data for the country of Guyana. Using very high-resolution satellite images from the GeoEye-1 and WorldView-2 sensor platforms, which collectively cover an area of 985 km2, a total of 472,753 individual palm crowns are detected with F1 scores of 0.76 and 0.79, respectively, using a convolutional neural network (CNN) instance segmentation model. An example of CNN model transference between images is presented, emphasizing the limitation and practical application of this approach. A method is presented to optimize precision and recall using the confidence of the detection features; this results in a decrease of 45% and 31% in false positive detections, with a moderate increase in false negative detections. The sensitivity of the CNN model to the size of the training set is evaluated, showing that comparable metrics could be achieved with approximately 50% of the samples used in this study. Finally, the diameter of the palm crown is calculated based on the polygon identified by mask detection, resulting in an average of 7.83 m, a standard deviation of 1.05 m, and a range of {4.62, 13.90} m for the GeoEye-1 image. Similarly, for the WorldView-2 image, the average diameter is 8.08 m, with a standard deviation of 0.70 m and a range of {4.82, 15.80} m.  more » « less
Award ID(s):
2501699
PAR ID:
10597736
Author(s) / Creator(s):
;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Remote Sensing
Volume:
16
Issue:
24
ISSN:
2072-4292
Page Range / eLocation ID:
4642
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Palms play an outsized role in tropical forests and are important resources for humans and wildlife. A central question in tropical ecosystems is understanding palm distribution and abundance. However, accurately identifying and localizing palms in geospatial imagery presents significant challenges due to dense vegetation, overlapping canopies, and variable lighting conditions in mixed-forest landscapes. Addressing this, we introduce PalmProbNet, a probabilistic approach utilizing transfer learning to analyze high-resolution UAV-derived orthomosaic imagery, enabling the detection of palm trees within the dense canopy of the Ecuadorian Rainforest. This approach represents a substantial advancement in automated palm detection, effectively pinpointing palm presence and locality in mixed tropical rainforests. Our process begins by generating an orthomosaic image from UAV images, from which we extract and label palm and non-palm image patches in two distinct sizes. These patches are then used to train models with an identical architecture, consisting of an unaltered pre-trained ResNet-18 and a Multilayer Perceptron (MLP) with specifically trained parameters. Subsequently, PalmProbNet employs a sliding window technique on the landscape orthomosaic, using both small and large window sizes to generate a probability heatmap. This heatmap effectively visualizes the distribution of palms, showcasing the scalability and adaptability of our approach in various forest densities. Despite the challenging terrain, our method demonstrated remarkable performance, achieving an accuracy of 97.32% and a Cohen's κ of 94.59% in testing. 
    more » « less
  2. State-of-the-art deep learning technology has been successfully applied to relatively small selected areas of very high spatial resolution (0.15 and 0.25 m) optical aerial imagery acquired by a fixed-wing aircraft to automatically characterize ice-wedge polygons (IWPs) in the Arctic tundra. However, any mapping of IWPs at regional to continental scales requires images acquired on different sensor platforms (particularly satellite) and a refined understanding of the performance stability of the method across sensor platforms through reliable evaluation assessments. In this study, we examined the transferability of a deep learning Mask Region-Based Convolutional Neural Network (R-CNN) model for mapping IWPs in satellite remote sensing imagery (~0.5 m) covering 272 km2 and unmanned aerial vehicle (UAV) (0.02 m) imagery covering 0.32 km2. Multi-spectral images were obtained from the WorldView-2 satellite sensor and pan-sharpened to ~0.5 m, and a 20 mp CMOS sensor camera onboard a UAV, respectively. The training dataset included 25,489 and 6022 manually delineated IWPs from satellite and fixed-wing aircraft aerial imagery near the Arctic Coastal Plain, northern Alaska. Quantitative assessments showed that individual IWPs were correctly detected at up to 72% and 70%, and delineated at up to 73% and 68% F1 score accuracy levels for satellite and UAV images, respectively. Expert-based qualitative assessments showed that IWPs were correctly detected at good (40–60%) and excellent (80–100%) accuracy levels for satellite and UAV images, respectively, and delineated at excellent (80–100%) level for both images. We found that (1) regardless of spatial resolution and spectral bands, the deep learning Mask R-CNN model effectively mapped IWPs in both remote sensing satellite and UAV images; (2) the model achieved a better accuracy in detection with finer image resolution, such as UAV imagery, yet a better accuracy in delineation with coarser image resolution, such as satellite imagery; (3) increasing the number of training data with different resolutions between the training and actual application imagery does not necessarily result in better performance of the Mask R-CNN in IWPs mapping; (4) and overall, the model underestimates the total number of IWPs particularly in terms of disjoint/incomplete IWPs. 
    more » « less
  3. Vision graph neural networks (ViG) have demonstrated promise in vision tasks as a competitive alternative to conventional convolutional neural nets (CNN) and transformers (ViTs); however, common graph construction methods, such as k-nearest neighbor (KNN), can be expensive on larger images. While methods such as Sparse Vision Graph Attention (SVGA) have shown promise, SVGA’s fixed step scale can lead to over-squashing and missing multiple connections to gain the same information that could be gained from a long-range link. Through this observation, we propose a new graph construction method, Logarithmic Scalable Graph Construction (LSGC) to enhance performance by limiting the number of long-range links. To this end, we propose LogViG, a novel hybrid CNN-GNN model that utilizes LSGC. Furthermore, inspired by the successes of multiscale and high-resolution architectures, we introduce and apply a high-resolution branch and fuse features between our high-resolution and low-resolution branches for a multi-scale high-resolution Vision GNN network. Extensive experiments show that LogViG beats existing ViG, CNN, and ViT architectures in terms of accuracy, GMACs, and parameters on image classification and semantic segmentation tasks. Our smallest model, Ti-LogViG, achieves an average top-1 accuracy on ImageNet-1K of 79.9% with a standard deviation of ± 0.2%, 1.7% higher average accuracy than Vision GNN with a 24.3% reduction in parameters and 35.3% reduction in GMACs. Our work shows that leveraging long-range links in graph construction for ViGs through our proposed LSGC can exceed the performance of current state-of-the-art ViGs. 
    more » « less
  4. Deep learning models have demonstrated significant advantages over traditional algorithms in image processing tasks like object detection. However, a large amount of data are needed to train such deep networks, which limits their application to tasks such as biometric recognition that require more training samples for each class (i.e., each individual). Researchers developing such complex systems rely on real biometric data, which raises privacy concerns and is restricted by the availability of extensive, varied datasets. This paper proposes a generative adversarial network (GAN)-based solution to produce training data (palm images) for improved biometric (palmprint-based) recognition systems. We investigate the performance of the most recent StyleGAN models in generating a thorough contactless palm image dataset for application in biometric research. Training on publicly available H-PolyU and IIDT palmprint databases, a total of 4839 images were generated using StyleGAN models. SIFT (Scale-Invariant Feature Transform) was used to find uniqueness and features at different sizes and angles, which showed a similarity score of 16.12% with the most recent StyleGAN3-based model. For the regions of interest (ROIs) in both the palm and finger, the average similarity scores were 17.85%. We present the Frechet Inception Distance (FID) of the proposed model, which achieved a 16.1 score, demonstrating significant performance. These results demonstrated StyleGAN as effective in producing unique synthetic biometric images. 
    more » « less
  5. Cloud detection is an inextricable pre-processing step in remote sensing image analysis workflows. Most of the traditional rule-based and machine-learning-based algorithms utilize low-level features of the clouds and classify individual cloud pixels based on their spectral signatures. Cloud detection using such approaches can be challenging due to a multitude of factors including harsh lighting conditions, the presence of thin clouds, the context of surrounding pixels, and complex spatial patterns. In recent studies, deep convolutional neural networks (CNNs) have shown outstanding results in the computer vision domain. These methods are practiced for better capturing the texture, shape as well as context of images. In this study, we propose a deep learning CNN approach to detect cloud pixels from medium-resolution satellite imagery. The proposed CNN accounts for both the low-level features, such as color and texture information as well as high-level features extracted from successive convolutions of the input image. We prepared a cloud-pixel dataset of approximately 7273 randomly sampled 320 by 320 pixels image patches taken from a total of 121 Landsat-8 (30m) and Sentinel-2 (20m) image scenes. These satellite images come with cloud masks. From the available data channels, only blue, green, red, and NIR bands are fed into the model. The CNN model was trained on 5300 image patches and validated on 1973 independent image patches. As the final output from our model, we extract a binary mask of cloud pixels and non-cloud pixels. The results are benchmarked against established cloud detection methods using standard accuracy metrics. 
    more » « less