skip to main content


Title: Unsupervised Diffusion and Volume Maximization-Based Clustering of Hyperspectral Images
Hyperspectral images taken from aircraft or satellites contain information from hundreds of spectral bands, within which lie latent lower-dimensional structures that can be exploited for classifying vegetation and other materials. A disadvantage of working with hyperspectral images is that, due to an inherent trade-off between spectral and spatial resolution, they have a relatively coarse spatial scale, meaning that single pixels may correspond to spatial regions containing multiple materials. This article introduces the Diffusion and Volume maximization-based Image Clustering (D-VIC) algorithm for unsupervised material clustering to address this problem. By directly incorporating pixel purity into its labeling procedure, D-VIC gives greater weight to pixels corresponding to a spatial region containing just a single material. D-VIC is shown to outperform comparable state-of-the-art methods in extensive experiments on a range of hyperspectral images, including land-use maps and highly mixed forest health surveys (in the context of ash dieback disease), implying that it is well-equipped for unsupervised material clustering of spectrally-mixed hyperspectral datasets.  more » « less
Award ID(s):
1924513
NSF-PAR ID:
10471334
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Remote Sensing
Volume:
15
Issue:
4
ISSN:
2072-4292
Page Range / eLocation ID:
1053
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We propose a method for the unsupervised clustering of hyperspectral images based on spatially regularized spectral clustering with ultrametric path distances. The proposed method efficiently combines data density and spectral-spatial geometry to distinguish between material classes in the data, without the need for training labels. The proposed method is efficient, with quasilinear scaling in the number of data points, and enjoys robust theoretical performance guarantees. Extensive experiments on synthetic and real HSI data demonstrate its strong performance compared to benchmark and state-of-the-art methods. Indeed, the proposed method not only achieves excellent labeling accuracy, but also efficiently estimates the number of clusters. Thus, unlike almost all existing hyperspectral clustering methods, the proposed algorithm is essentially parameter-free. 
    more » « less
  2. Most applications of multispectral imaging are explicitly or implicitly dependent on the dimensionality and topology of the spectral mixing space. Mixing space characterization refers to the identification of salient properties of the set of pixel reflectance spectra comprising an image (or compilation of images). The underlying premise is that this set of spectra may be described as a low dimensional manifold embedded in a high dimensional vector space. Traditional mixing space characterization uses the linear dimensionality reduction offered by Principal Component Analysis to find projections of pixel spectra onto orthogonal linear subspaces, prioritized by variance. Here, we consider the potential for recent advances in nonlinear dimensionality reduction (specifically, manifold learning) to contribute additional useful information for multispectral mixing space characterization. We integrate linear and nonlinear methods through a novel approach called Joint Characterization (JC). JC is comprised of two components. First, spectral mixture analysis (SMA) linearly projects the high-dimensional reflectance vectors onto a 2D subspace comprising the primary mixing continuum of substrates, vegetation, and dark features (e.g., shadow and water). Second, manifold learning nonlinearly maps the high-dimensional reflectance vectors into a low-D embedding space while preserving manifold topology. The SMA output is physically interpretable in terms of material abundances. The manifold learning output is not generally physically interpretable, but more faithfully preserves high dimensional connectivity and clustering within the mixing space. Used together, the strengths of SMA may compensate for the limitations of manifold learning, and vice versa. Here, we illustrate JC through application to thematic compilations of 90 Sentinel-2 reflectance images selected from a diverse set of biomes and land cover categories. Specifically, we use globally standardized Substrate, Vegetation, and Dark (S, V, D) endmembers (EMs) for SMA, and Uniform Manifold Approximation and Projection (UMAP) for manifold learning. The value of each (SVD and UMAP) model is illustrated, both separately and jointly. JC is shown to successfully characterize both continuous gradations (spectral mixing trends) and discrete clusters (land cover class distinctions) within the spectral mixing space of each land cover category. These features are not clearly identifiable from SVD fractions alone, and not physically interpretable from UMAP alone. Implications are discussed for the design of models which can reliably extract and explainably use high-dimensional spectral information in spatially mixed pixels—a principal challenge in optical remote sensing.

     
    more » « less
  3. Abstract Most semantic segmentation approaches of big data hyperspectral images use and require preprocessing steps in the form of patching to accurately classify diversified land cover in remotely sensed images. These approaches use patching to incorporate the rich spatial neighborhood information in images and exploit the simplicity and segmentability of the most common datasets. In contrast, most landmasses in the world consist of overlapping and diffused classes, making neighborhood information weaker than what is seen in common datasets. To combat this common issue and generalize the segmentation models to more complex and diverse hyperspectral datasets, in this work, we propose a novel flagship model: Clustering Ensemble U-Net. Our model uses the ensemble method to combine spectral information extracted from convolutional neural network training on a cluster of landscape pixels. Our model outperforms existing state-of-the-art hyperspectral semantic segmentation methods and gets competitive performance with and without patching when compared to baseline models. We highlight our model’s high performance across six popular hyperspectral datasets including Kennedy Space Center, Houston, and Indian Pines, then compare them to current top-performing models. 
    more » « less
  4. Messinger, David W. ; Velez-Reyes, Miguel (Ed.)
    Recent advances in data fusion provide the capability to obtain enhanced hyperspectral data with high spatial and spectral information content, thus allowing for an improved classification accuracy. Although hyperspectral image classification is a highly investigated topic in remote sensing, each classification technique presents different advantages and disadvantages. For example; methods based on morphological filtering are particularly good at classifying human-made structures with basic geometrical spatial shape, like houses and buildings. On the other hand, methods based on spectral information tend to perform better classification in natural scenery with more shape diversity such as vegetation and soil areas. Even more, for those classes with mixed pixels, small training data or objects with similar re ectance values present a higher challenge to obtain high classification accuracy. Therefore, it is difficult to find just one technique that provides the highest accuracy of classification for every class present in an image. This work proposes a decision fusion approach aiming to increase classification accuracy of enhanced hyperspectral images by integrating the results of multiple classifiers. Our approach is performed in two-steps: 1) the use of machine learning algorithms such as Support Vector Machines (SVM), Deep Neural Networks (DNN) and Class-dependent Sparse Representation will generate initial classification data, then 2) the decision fusion scheme based on a Convolutional Neural Network (CNN) will integrate all the classification results into a unified classification rule. In particular, the CNN receives as input the different probabilities of pixel values from each implemented classifier, and using a softmax activation function, the final decision is estimated. We present results showing the performance of our method using different hyperspectral image datasets. 
    more » « less
  5. The monitoring of agronomic parameters like biomass, water stress, and plant health can benefit from synergistic use of all available remotely sensed information. Multispectral imagery has been used for this purpose for decades, largely with vegetation indices (VIs). Many multispectral VIs exist, typically relying on a single feature—the spectral red edge—for information. Where hyperspectral imagery is available, spectral mixture models can use the full VSWIR spectrum to yield further insight, simultaneously estimating area fractions of multiple materials within mixed pixels. Here we investigate the relationships between VIs and mixture models by comparing hyperspectral endmember fractions to six common multispectral VIs in California’s diverse crops and soils. In so doing, we isolate spectral effects from sensor- and acquisition-specific variability associated with atmosphere, illumination, and view geometry. Specifically, we compare: (1) fractional area of photosynthetic vegetation (Fv) from 64,000,000 3–5 m resolution AVIRIS-ng reflectance spectra; and (2) six popular VIs (NDVI, NIRv, EVI, EVI2, SR, DVI) computed from simulated Planet SuperDove reflectance spectra derived from the AVIRIS-ng spectra. Hyperspectral Fv and multispectral VIs are compared using both parametric (Pearson correlation, ρ) and nonparametric (Mutual Information, MI) metrics. Four VIs (NIRv, DVI, EVI, EVI2) showed strong linear relationships with Fv (ρ > 0.94; MI > 1.2). NIRv and DVI showed strong interrelation (ρ > 0.99, MI > 2.4), but deviated from a 1:1 correspondence with Fv. EVI and EVI2 were strongly interrelated (ρ > 0.99, MI > 2.3) and more closely approximated a 1:1 relationship with Fv. In contrast, NDVI and SR showed a weaker, nonlinear, heteroskedastic relation to Fv (ρ < 0.84, MI = 0.69). NDVI exhibited both especially severe sensitivity to unvegetated background (–0.05 < NDVI < +0.6) and saturation (0.2 < Fv < 0.8 for NDVI = 0.7). The self-consistent atmospheric correction, radiometry, and sun-sensor geometry allows this simulation approach to be further applied to indices, sensors, and landscapes worldwide.

     
    more » « less