Title: Endmember Extraction on the Grassmannian
Endmember extraction plays a prominent role in a variety of data analysis problems as endmembers often correspond to data representing the purest or best representative of some feature. Identifying endmembers then can be useful for further identification and classification tasks. In settings with high-dimensional data, such as hyperspectral imagery, it can be useful to consider endmembers that are subspaces as they are capable of capturing a wider range of variations of a signature. The endmember extraction problem in this setting thus translates to finding the vertices of the convex hull of a set of points on a Grassmannian. In the presence of noise, it can be less clear whether a point should be considered a vertex. In this paper, we propose an algorithm to extract endmembers on a Grassmannian, identify subspaces of interest that lie near the boundary of a convex hull, and demonstrate the use of the algorithm on a synthetic example and on the 220 spectral band AVIRIS Indian Pines hyperspectral image.
Award ID(s):
1633830
PAR ID:
10064957
Author(s) / Creator(s):
Date Published:
Journal Name:
2018 IEEE Data Science Workshop (DSW 2018)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We extend the self-organizing mapping (SOM) algorithm to the problem of visualizing data on Grassmann manifolds. In this setting, a collection of k points in n dimensions is represented by a k-dimensional subspace, e.g., via the singular value or QR decompositions. Data assembled in this way is challenging to visualize because abstract points on the Grassmannian do not reside in Euclidean space. The extension of the SOM algorithm to this geometric setting only requires that distances between two points can be measured and that any given point can be moved towards a presented pattern. The similarity between two points on the Grassmannian is measured in terms of the principal angles between subspaces, e.g., the chordal distance. Further, we employ a formula for moving one subspace towards another along the shortest path, i.e., the geodesic between two points on the Grassmannian. This enables a faithful implementation of the SOM approach for visualizing data consisting of k-dimensional subspaces of n-dimensional Euclidean space. We illustrate the resulting algorithm on a hyperspectral imaging application.
  2. For a set P of n points in the unit ball b ⊆ R^d, consider the problem of finding a small subset T ⊆ P such that its convex hull ε-approximates the convex hull of the original set. Specifically, the Hausdorff distance between the convex hull of T and the convex hull of P should be at most ε. We present an efficient algorithm to compute such an ε′-approximation of size k_alg, where ε′ is a function of ε, and k_alg is a function of the minimum size k_opt of such an ε-approximation. Surprisingly, there is no dependence on the dimension d in either of the bounds. Furthermore, every point of P can be ε-approximated by a convex combination of points of T that is O(1/ε^2)-sparse. Our result can be viewed as a method for sparse, convex autoencoding: approximately representing the data in a compact way using sparse combinations of a small subset T of the original data. The new algorithm can be kernelized, and it preserves sparsity in the original input.
  3. Finding prototypes (e.g., mean and median) for a dataset is central to a number of common machine learning algorithms. Subspaces have been shown to provide useful, robust representations for datasets of images, videos and more. Since subspaces correspond to points on a Grassmann manifold, one is led to consider the idea of a subspace prototype for a Grassmann-valued dataset. While a number of different subspace prototypes have been described, the calculation of some of these prototypes has proven to be computationally expensive, while other prototypes are affected by outliers and produce highly imperfect clustering on noisy data. This work proposes a new subspace prototype, the flag median, and introduces the FlagIRLS algorithm for its calculation. We provide evidence that the flag median is robust to outliers and can be used effectively in algorithms like Linde-Buzo-Gray (LBG) to produce improved clusterings on Grassmannians. Numerical experiments include a synthetic dataset, the MNIST handwritten digits dataset, the Mind's Eye video dataset and the UCF YouTube action dataset. The flag median is compared to the other leading algorithms for computing prototypes on the Grassmannian, namely the l_2-median and the flag mean. We find that using FlagIRLS to compute the flag median converges in 4 iterations on a synthetic dataset. We also see that Grassmannian LBG with a codebook size of 20 and using the flag median produces at least a 10% improvement in cluster purity over Grassmannian LBG using the flag mean or l_2-median on the Mind's Eye dataset.
  4. We extend the K-means and LBG algorithms to the framework of the Grassmann manifold to perform subspace quantization. For K-means it is possible to move a subspace in the direction of another using Grassmannian geodesics. For LBG the centroid computation is now done using a flag mean algorithm for averaging points on the Grassmannian. The resulting unsupervised algorithms are applied to the MNIST digit data set and the AVIRIS Indian Pines hyperspectral data set. 
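Several of the abstracts above rely on two basic Grassmannian computations: measuring the chordal distance between subspaces through their principal angles, and averaging subspaces with a flag mean. The following is a minimal sketch of both, not the authors' code. It assumes each subspace is given as a matrix with orthonormal columns, and it uses the commonly stated SVD characterization of the flag mean (leading left singular vectors of the concatenated bases). The function names `orthonormal_basis`, `principal_angles`, `chordal_distance`, and `flag_mean` are hypothetical and chosen only for illustration.

```python
import numpy as np


def orthonormal_basis(A):
    """Return an orthonormal basis for the column span of A (assumes full column rank)."""
    Q, _ = np.linalg.qr(A)
    return Q


def principal_angles(X, Y):
    """Principal angles between span(X) and span(Y); X and Y have orthonormal columns."""
    # The singular values of X^T Y are the cosines of the principal angles.
    s = np.linalg.svd(X.T @ Y, compute_uv=False)
    return np.arccos(np.clip(s, 0.0, 1.0))


def chordal_distance(X, Y):
    """Chordal distance: square root of the sum of squared sines of the principal angles."""
    return np.sqrt(np.sum(np.sin(principal_angles(X, Y)) ** 2))


def flag_mean(bases, r):
    """Assumed SVD form of the flag mean: first r left singular vectors of [X_1 | ... | X_N]."""
    U, _, _ = np.linalg.svd(np.hstack(bases), full_matrices=False)
    return U[:, :r]


# Two 2-dimensional subspaces of R^4 that share exactly one direction (e1).
I4 = np.eye(4)
X = I4[:, [0, 1]]
Y = I4[:, [0, 2]]
print(chordal_distance(X, Y))
print(flag_mean([X, Y], 1))
```

In this toy example the two subspaces share one direction and are orthogonal in the other, so the principal angles are 0 and π/2, and the one-dimensional flag mean is spanned by the shared direction (up to sign).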