Title: Collection of a Hyperspectral Atmospheric Cloud Dataset and Enhancing Pixel Classification through Patch-Origin Embedding
Hyperspectral cameras collect detailed spectral information at each image pixel, contributing to the identification of image features. The rich spectral content of hyperspectral imagery has led to its application in diverse fields of study. This study focused on cloud classification using a dataset of hyperspectral sky images captured by a Resonon PIKA XC2 camera. The camera records images using 462 spectral bands, ranging from 400 to 1000 nm, with a spectral resolution of 1.9 nm. Our preliminary/unlabeled dataset comprised 33 parent hyperspectral images (HSI), each a substantial unlabeled image measuring 4402-by-1600 pixels. With the meteorological expertise within our team, we manually labeled pixels by extracting 10 to 20 sample patches from each parent image, each patch consisting of a 50-by-50 pixel field. This process yielded a collection of 444 patches, each categorically labeled into one of seven cloud and sky condition categories. To embed the inherent data structure while classifying individual pixels, we introduced an innovative technique to boost classification accuracy by incorporating patch-specific information into each pixel’s feature vector. The posterior probabilities generated by these classifiers, which capture the unique attributes of each patch, were subsequently concatenated with the pixel’s original spectral data to form an augmented feature vector. We then applied a final classifier to map the augmented vectors to the seven cloud/sky categories. The results compared favorably to the baseline model devoid of patch-origin embedding, showing that incorporating the spatial context along with the spectral information inherent in hyperspectral images enhances the classification accuracy in hyperspectral cloud classification. The dataset is available on IEEE DataPort.
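The augmentation step described in the abstract (posterior probabilities concatenated with each pixel's spectrum, then a final classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the classifiers, so a softmax over distances to patch centroids stands in for the patch-level models, a nearest-centroid rule stands in for the final classifier, and the three-band toy spectra and the names `patch_posteriors`, `augment`, `fit_final`, and `predict` are hypothetical.

```python
import math

def centroid(vectors):
    """Mean vector of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def patch_posteriors(spectrum, patch_centroids):
    """Stand-in for the patch-level classifiers (an assumption, not the
    paper's model): softmax over negative squared distances to each
    patch centroid, so the outputs are positive and sum to 1."""
    scores = [-sq_dist(spectrum, c) for c in patch_centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def augment(spectrum, patch_centroids):
    """Concatenate the original spectrum with the posterior probabilities."""
    return list(spectrum) + patch_posteriors(spectrum, patch_centroids)

def fit_final(aug_vectors, labels):
    """Stand-in final classifier: one centroid per class in augmented space."""
    by_class = {}
    for vec, label in zip(aug_vectors, labels):
        by_class.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_class.items()}

def predict(model, aug_vector):
    """Return the class whose centroid is nearest to the augmented vector."""
    return min(model, key=lambda label: sq_dist(model[label], aug_vector))

# Toy data: two labeled patches, two pixels each, three bands instead of 462.
patches = {
    "patch_a": ([[0.9, 0.8, 0.7], [0.8, 0.8, 0.6]], "dense_bright_cloud"),
    "patch_b": ([[0.2, 0.3, 0.6], [0.1, 0.3, 0.7]], "clear_sky"),
}
patch_centroids = [centroid(pixels) for pixels, _ in patches.values()]
train_x = [augment(p, patch_centroids) for pixels, _ in patches.values() for p in pixels]
train_y = [label for pixels, label in patches.values() for _ in pixels]

model = fit_final(train_x, train_y)
query = augment([0.85, 0.75, 0.65], patch_centroids)
prediction = predict(model, query)
```

In this toy setup the augmented vector has one extra entry per patch-level output, so a three-band pixel with two patches becomes a five-element vector. With 462 bands the same concatenation would simply append the posterior entries after the spectral values.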
Award ID(s):
2003740
PAR ID:
10647624
Author(s) / Creator(s):
Publisher / Repository:
MDPI Remote Sensing
Date Published:
Journal Name:
Remote Sensing
Volume:
16
Issue:
17
ISSN:
2072-4292
Page Range / eLocation ID:
3315
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The CloudPatch-7 Hyperspectral Dataset comprises a manually curated collection of hyperspectral images, focused on pixel classification of atmospheric cloud classes. This labeled dataset features 380 patches, each a 50x50 pixel grid, derived from 28 larger, unlabeled parent images approximately 5000x1500 pixels in size. Captured using the Resonon PIKA XC2 camera, these images span 462 spectral bands from 400 to 1000 nm. Each patch is extracted from a parent image such that all of its pixels fall within one of seven atmospheric conditions: Dense Dark Cumuliform Cloud, Dense Bright Cumuliform Cloud, Semi-transparent Cumuliform Cloud, Dense Cirroform Cloud, Semi-transparent Cirroform Cloud, Clear Sky - Low Aerosol Scattering (dark), and Clear Sky - Moderate to High Aerosol Scattering (bright). Incorporating contextual information from surrounding pixels enhances pixel classification into these seven classes, making this dataset a valuable resource for spectral analysis, environmental monitoring, atmospheric science research, and testing machine learning applications that require contextual data. The parent images are very large, but they can be made available upon request.
  2. This dataset includes 30 hyperspectral cloud images captured during the Summer and Fall of 2022 at Auburn University at Montgomery, Alabama, USA (Latitude N, Longitude W) using a Resonon Pika XC2 Hyperspectral Imaging Camera. Using the Spectronon software, the images were recorded with integration times between 9.0 and 12.0 ms, a frame rate of approximately 45 Hz, and a scan rate of 0.93 degrees per second. The images are calibrated to give spectral radiance in microflicks at 462 spectral bands in the 400 to 1000 nm wavelength region with a spectral resolution of 1.9 nm. A 17 mm focal length objective lens was used, giving a field of view of 30.8 degrees and an instantaneous field of view of 0.71 mrad. These settings enable detailed spectral analysis of both dynamic cloud formations and clear sky conditions. Funded by NSF grant 2003740, this dataset is designed to advance understanding of diffuse solar radiation as influenced by cloud coverage. The dataset is organized into 30 folders, each containing a hyperspectral image file (.bip), a header file (.hdr) with metadata, and an RGB render for visual inspection. Additional metadata, including date, time, central pixel azimuth, and altitude, are cataloged in an accompanying MS Excel file. A custom Python program is also provided to facilitate the reading and display of the HSI files. The images can also be read and analyzed using the free version of the Spectronon software available at https://resonon.com/software. To enrich this dataset, we have added a supplementary ZIP file containing multispectral (4-channel) image versions of the original hyperspectral scenes, together with the corresponding per-pixel photon flux and spectral radiance values computed from the full spectrum. These additions extend the dataset’s utility for machine learning and data fusion research by enabling comparative analysis between reduced-band multispectral imagery and full-spectrum hyperspectral data.
The ExpandAI Challenge task is to develop models capable of predicting photon flux and radiance—derived from all 462 hyperspectral bands—using only the four multispectral channels. This benchmark aims to stimulate innovation in spectral information recovery, spectral-spatial inference, and physically informed deep learning for atmospheric imaging applications. 
  3. A novel hyperspectral image classification algorithm is proposed and demonstrated on benchmark hyperspectral images. We also introduce a hyperspectral sky imaging dataset that we are collecting for detecting the amount and type of cloudiness. An algorithm designed for such imaging systems could improve the spatial and temporal resolution of cloud information vital to understanding Earth’s climate. We discuss the nature of the HSI-Cloud dataset being collected and propose an algorithm for processing it using a categorical-boosting method. The proposed method uses multiple clusterings to augment the dataset and achieves higher pixel classification accuracy. Creating categorical features via clustering enriches the data representation and improves boosting ensembles. For the experimental datasets used in this paper, gradient boosting methods compared favorably with the benchmark algorithms.
  4. Cloud detection is an integral pre-processing step in remote sensing image analysis workflows. Most traditional rule-based and machine-learning-based algorithms use low-level features of clouds and classify individual cloud pixels based on their spectral signatures. Cloud detection using such approaches can be challenging due to a multitude of factors, including harsh lighting conditions, the presence of thin clouds, the context of surrounding pixels, and complex spatial patterns. In recent studies, deep convolutional neural networks (CNNs) have shown outstanding results in the computer vision domain. These methods better capture the texture, shape, and context of images. In this study, we propose a deep learning CNN approach to detect cloud pixels in medium-resolution satellite imagery. The proposed CNN accounts for both low-level features, such as color and texture information, and high-level features extracted from successive convolutions of the input image. We prepared a cloud-pixel dataset of 7273 randomly sampled image patches of 320 by 320 pixels, taken from a total of 121 Landsat-8 (30 m) and Sentinel-2 (20 m) image scenes. These satellite images come with cloud masks. From the available data channels, only the blue, green, red, and NIR bands are fed into the model. The CNN model was trained on 5300 image patches and validated on 1973 independent image patches. As the final output of our model, we extract a binary mask of cloud and non-cloud pixels. The results are benchmarked against established cloud detection methods using standard accuracy metrics.
  5. An algorithm for clustering hyperspectral images (HSI) based on diffusion geometry in the space of high-dimensional image patches is proposed. By using the patch structure of the HSI, robustness to noise is achieved in the clustering process. Results on real hyperspectral data indicate the effectiveness of working in the space of HSI patches, compared to working in the space of HSI pixels.
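The second related record above stores each scene as a .bip file with an accompanying .hdr header. As a rough illustration of what band-interleaved-by-pixel (BIP) layout means, the sketch below reads one pixel's full spectrum from a raw BIP byte stream. It is not the custom Python program shipped with that dataset. The function name `read_pixel_spectrum`, the 32-bit little-endian float encoding, and the zero header offset are assumptions; in practice the image dimensions, data type, and byte order would be taken from the .hdr file.

```python
import io
import struct

def read_pixel_spectrum(stream, row, col, samples, bands,
                        type_char="f", byte_order="<"):
    """Read all band values for one pixel from a BIP-ordered stream.

    In BIP order every pixel's bands are stored contiguously, pixel after
    pixel along each image line, so the pixel at (row, col) starts at
    (row * samples + col) * bands values into the data.
    Assumes no embedded header bytes (hypothetical default).
    """
    itemsize = struct.calcsize(byte_order + type_char)
    stream.seek((row * samples + col) * bands * itemsize)
    raw = stream.read(bands * itemsize)
    return list(struct.unpack(f"{byte_order}{bands}{type_char}", raw))

# Build a tiny synthetic cube (2 lines, 3 samples, 4 bands) in BIP order.
# Each value encodes its own position as 100*row + 10*col + band.
lines, samples, bands = 2, 3, 4
values = [
    float(100 * r + 10 * c + b)
    for r in range(lines)
    for c in range(samples)
    for b in range(bands)
]
cube = io.BytesIO(struct.pack(f"<{len(values)}f", *values))

spectrum = read_pixel_spectrum(cube, row=1, col=2, samples=samples, bands=bands)
```

Because a pixel's bands are contiguous in this layout, fetching one 462-band spectrum needs a single seek and read, which suits per-pixel classification workloads.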