

This content will become publicly available on April 11, 2026

Title: Accounting for Spatial Variability with the Histogram of Oriented Gradients Based Masking Improves Performance of Masked Autoencoder over Hyperspectral Satellite Imagery (Student Abstract)
Masked autoencoders employ random masking to effectively reconstruct input images using self-supervised techniques, which allows for efficient training on large datasets. However, the random masking strategy does not adequately tap into the information encapsulated within high-dimensional hyperspectral satellite imagery, which is used in several domains. We propose a novel masking strategy, HOGMAE, based on the Histogram of Oriented Gradients, that incorporates the rich information inherent within satellite images during the mask creation step. Our experiments on a hyperspectral satellite dataset demonstrate the effectiveness of our methodology.
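The abstract does not say how the gradient histograms steer the mask, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes the spectral bands are averaged into a single intensity image, that each non-overlapping patch is scored by the norm of its magnitude-weighted orientation histogram, and that patches are then sampled for masking with probability proportional to that score. The function names `hog_patch_scores` and `hog_guided_mask`, the patch size, the bin count, and the mask ratio are all hypothetical.

```python
import numpy as np


def hog_patch_scores(cube, patch=8, n_bins=9):
    """Score each non-overlapping patch of an (H, W, bands) cube.

    The score is the L2 norm of the patch's orientation histogram,
    weighted by gradient magnitude (a simplified HOG descriptor).
    """
    gray = cube.mean(axis=2)                 # assumption: collapse bands by averaging
    gy, gx = np.gradient(gray)               # derivatives along rows and columns
    mag = np.hypot(gx, gy)                   # gradient magnitude per pixel
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)

    rows, cols = gray.shape[0] // patch, gray.shape[1] // patch
    scores = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * patch, (r + 1) * patch),
                      slice(c * patch, (c + 1) * patch))
            hist = np.bincount(bins[window].ravel(),
                               weights=mag[window].ravel(),
                               minlength=n_bins)
            scores[r, c] = np.linalg.norm(hist)
    return scores


def hog_guided_mask(scores, mask_ratio=0.75, seed=0):
    """Return a boolean patch mask (True = masked).

    Assumption: patches with stronger gradient structure are more
    likely to be masked; the opposite choice is equally easy to code.
    """
    rng = np.random.default_rng(seed)
    flat = scores.ravel() + 1e-8             # keep every probability non-zero
    probs = flat / flat.sum()
    n_mask = int(round(mask_ratio * flat.size))
    chosen = rng.choice(flat.size, size=n_mask, replace=False, p=probs)
    mask = np.zeros(flat.size, dtype=bool)
    mask[chosen] = True
    return mask.reshape(scores.shape)
```

For a 32-by-32 cube with 8-pixel patches this yields a 4-by-4 mask; an encoder in the masked-autoencoder style would then see only the patches where the mask is False. Setting all scores equal recovers ordinary uniform random masking, which makes the sketch convenient as a baseline comparison.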
Award ID(s):
2312319 1931363
PAR ID:
10614762
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
AAAI
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
39
Issue:
28
ISSN:
2159-5399
Page Range / eLocation ID:
29365 to 29367
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we look at how depth data can benefit existing object masking methods applied in occluded scenes. Masking the pixel locations of objects within scenes helps computers get a spatial awareness of where objects are within images. The current state-of-the-art algorithm for masking objects in images is Mask R-CNN, which builds on the Faster R-CNN network to mask object pixels rather than just detecting their bounding boxes. This paper examines the weaknesses Mask R-CNN has in masking people when they are occluded in a frame. It then looks at how depth data gathered from an RGB-D sensor can be used. We provide a case study to show how simply applying thresholding methods on the depth information can aid in distinguishing occluded persons. The intention of our research is to examine how features from depth data can benefit object pixel masking methods in an explainable manner, especially in complex scenes with multiple objects. 
  2. Hyperspectral cameras collect detailed spectral information at each image pixel, contributing to the identification of image features. The rich spectral content of hyperspectral imagery has led to its application in diverse fields of study. This study focused on cloud classification using a dataset of hyperspectral sky images captured by a Resonon PIKA XC2 camera. The camera records images using 462 spectral bands, ranging from 400 to 1000 nm, with a spectral resolution of 1.9 nm. Our preliminary/unlabeled dataset comprised 33 parent hyperspectral images (HSI), each a substantial unlabeled image measuring 4402-by-1600 pixels. With the meteorological expertise within our team, we manually labeled pixels by extracting 10 to 20 sample patches from each parent image, each patch consisting of a 50-by-50 pixel field. This process yielded a collection of 444 patches, each categorically labeled into one of seven cloud and sky condition categories. To embed the inherent data structure while classifying individual pixels, we introduced an innovative technique to boost classification accuracy by incorporating patch-specific information into each pixel’s feature vector. The posterior probabilities generated by these classifiers, which capture the unique attributes of each patch, were subsequently concatenated with the pixel’s original spectral data to form an augmented feature vector. We then applied a final classifier to map the augmented vectors to the seven cloud/sky categories. The results compared favorably to the baseline model devoid of patch-origin embedding, showing that incorporating the spatial context along with the spectral information inherent in hyperspectral images enhances the classification accuracy in hyperspectral cloud classification. The dataset is available on IEEE DataPort. 
  3. Deep Neural Networks are powerful tools for understanding complex patterns and making decisions. However, their black-box nature impedes a complete understanding of their inner workings. Saliency-Guided Training (SGT) methods try to highlight the prominent features in the model's training based on the output to alleviate this problem. These methods use back-propagation and modified gradients to guide the model toward the most relevant features while keeping the impact on the prediction accuracy negligible. SGT makes the model's final result more interpretable by partially masking the input. In this way, considering the model's output, we can infer how each segment of the input affects the output. When the input is an image, masking is applied to the input pixels. However, the masking strategy and the number of masked pixels are treated as hyperparameters, and how they are set can directly affect the model's training. In this paper, we focus on this issue and present our contribution. We propose a novel method to determine the optimal number of masked images based on input, accuracy, and model loss during the training. The strategy prevents information loss, which leads to better accuracy. Also, by integrating the model's performance into the strategy formula, we show that our model represents the salient features more meaningfully. Our experimental results demonstrate a substantial improvement in both model accuracy and the prominence of saliency, thereby affirming the effectiveness of our proposed solution.
  4. Most semantic segmentation approaches for big-data hyperspectral images require preprocessing steps in the form of patching to accurately classify diversified land cover in remotely sensed images. These approaches use patching to incorporate the rich spatial neighborhood information in images and exploit the simplicity and segmentability of the most common datasets. In contrast, most landmasses in the world consist of overlapping and diffused classes, making neighborhood information weaker than what is seen in common datasets. To combat this common issue and generalize the segmentation models to more complex and diverse hyperspectral datasets, in this work, we propose a novel flagship model: Clustering Ensemble U-Net. Our model uses the ensemble method to combine spectral information extracted from convolutional neural network training on a cluster of landscape pixels. Our model outperforms existing state-of-the-art hyperspectral semantic segmentation methods and achieves competitive performance with and without patching when compared to baseline models. We highlight our model’s high performance across six popular hyperspectral datasets, including Kennedy Space Center, Houston, and Indian Pines, and then compare it to current top-performing models.
  5. Understanding the mineralogy and geochemistry of the subsurface is key when assessing and exploring for mineral deposits. To achieve this goal, rapid acquisition and accurate interpretation of drill core data are essential. Hyperspectral shortwave infrared imaging is a rapid and non-destructive analytical method widely used in the minerals industry to map minerals with diagnostic features in core samples. In this paper, we present an automated method to interpret hyperspectral shortwave infrared data on drill core to decipher major felsic rock-forming minerals using supervised machine learning techniques for processing, masking, and extracting mineralogical and textural information. This study utilizes a co-registered training dataset that integrates hyperspectral data with quantitative scanning electron microscopy data instead of spectrum matching using a spectral library. Our methodology overcomes previous limitations in hyperspectral data interpretation for the full mineralogy (i.e., quartz and feldspar) caused by the need to identify spectral features of minerals; in particular, it detects the presence of minerals that are considered invisible in traditional shortwave infrared hyperspectral analysis. 