NSF PAR Search | NSF Public Access Repository


Supervised Dimensionality Reduction and Visualization using Centroid-Encoder

Ghosh, Tomojit; Kirby, Michael (January 2022, Journal of machine learning research)

We propose a new tool for visualizing complex, and potentially large and high-dimensional, data sets called Centroid-Encoder (CE). The architecture of the Centroid-Encoder is similar to the autoencoder neural network but it has a modified target, i.e., the class centroid in the ambient space. As such, CE incorporates label information and performs a supervised data visualization. The training of CE is done in the usual way with a training set whose parameters are tuned using a validation set. The evaluation of the resulting CE visualization is performed on a sequestered test set where the generalization of the model is assessed both visually and quantitatively. We present a detailed comparative analysis of the method using a wide variety of data sets and techniques, both supervised and unsupervised, including NCA, non-linear NCA, t-distributed NCA, t-distributed MCML, supervised UMAP, supervised PCA, Colored Maximum Variance Unfolding, supervised Isomap, Parametric Embedding, supervised Neighbor Retrieval Visualizer, and Multiple Relational Embedding. An analysis of variance using PCA demonstrates that a non-linear preprocessing by the CE transformation of the data captures more variance than PCA by dimension.
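As a rough illustration of the "modified target" described in the abstract, the sketch below computes class centroids in the ambient space and assigns each sample its class centroid as the training target. The function names and the mean-squared-error cost are assumptions made for this example; the abstract does not state the exact cost function, and no network is trained here.

```python
from collections import defaultdict


def class_centroids(samples, labels):
    """Return {label: centroid}, where a centroid is the mean of that class's samples."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def centroid_targets(samples, labels):
    """Target for each sample: the centroid of its class (instead of the sample itself)."""
    cents = class_centroids(samples, labels)
    return [cents[y] for y in labels]


def centroid_loss(outputs, targets):
    """Mean squared Euclidean distance between outputs and targets (assumed cost)."""
    total = 0.0
    for o, t in zip(outputs, targets):
        total += sum((a - b) ** 2 for a, b in zip(o, t))
    return total / len(outputs)


# Toy data: two classes in a 2-dimensional ambient space.
samples = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
labels = ["a", "a", "b", "b"]
targets = centroid_targets(samples, labels)

# Loss of the identity map, i.e. what an untrained "output = input" model would score.
identity_loss = centroid_loss(samples, targets)
```

In an actual model the `outputs` passed to `centroid_loss` would come from an autoencoder-style network with a low-dimensional bottleneck, and the bottleneck activations would supply the visualization.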
Full Text Available
Self-organizing mappings on the flag manifold with applications to hyper-spectral image data analysis

https://doi.org/10.1007/s00521-020-05579-y

Ma, Xiaofeng; Kirby, Michael; Peterson, Chris (March 2021, Neural Computing and Applications)

A flag is a nested sequence of vector spaces. The type of the flag encodes the sequence of dimensions of the vector spaces making up the flag. A flag manifold is a manifold whose points parameterize all flags of a fixed type in a fixed vector space. This paper provides the mathematical framework necessary for implementing self-organizing mappings on flag manifolds. Flags arise implicitly in many data analysis contexts including wavelet, Fourier, and singular value decompositions. The proposed geometric framework in this paper enables the computation of distances between flags, the computation of geodesics between flags, and the ability to move one flag a prescribed distance in the direction of another flag. Using these operations as building blocks, we implement the SOM algorithm on a flag manifold. The basic algorithm is applied to the problem of parameterizing a set of flags of a fixed type.
Full Text Available
Local eigenvalue decomposition for embedded Riemannian manifolds

https://doi.org/10.1016/j.laa.2020.06.006

Álvarez-Vizoso, Javier; Kirby, Michael; Peterson, Chris (November 2020, Linear Algebra and its Applications)

Full Text Available
Manifold curvature learning from hypersurface integral invariants

https://doi.org/10.1016/j.laa.2020.05.020

Álvarez-Vizoso, Javier; Kirby, Michael; Peterson, Chris (October 2020, Linear Algebra and its Applications)

Full Text Available
Exploring Musical Structure Using Tonnetz Lattice Geometry and LSTMs

Aminian, Manuchehr; Kehoe, Eric; Ma, Xiaofeng; Peterson, Amy; Kirby, Michael (June 2020, Lecture Notes in Computer Science)
Full Text Available
Error-adaptive modeling of streaming time-series data using radial basis functions

https://doi.org/10.1016/j.cam.2018.10.056

Ma, Xiaofeng; Aminian, Manuchehr; Kirby, Michael (December 2019, Journal of Computational and Applied Mathematics)
Full Text Available
Geometry of curves in $R^{n}$ from the local singular value decomposition

https://doi.org/10.1016/j.laa.2019.02.006

Álvarez-Vizoso, J.; Arn, Robert; Kirby, Michael; Peterson, Chris; Draper, Bruce (June 2019, Linear Algebra and its Applications)

Full Text Available
Self-organizing mappings on the Grassmannian with applications to data analysis in high dimensions

https://doi.org/10.1007/s00521-019-04444-x

Ma, Xiaofeng; Kirby, Michael; Peterson, Chris; Scharf, Louis (January 2019, Neural Computing and Applications)

Full Text Available
Monitoring the shape of weather, soundscapes, and dynamical systems: a new statistic for dimension-driven data analysis on large datasets

https://doi.org/10.1109/BigData.2018.8622365

Kvinge, Henry; Farnell, Elin; Kirby, Michael; Peterson, Chris (December 2018, 2018 IEEE International Conference on Big Data (Big Data))

Dimensionality-reduction methods are a fundamental tool in the analysis of large datasets. These algorithms work on the assumption that the "intrinsic dimension" of the data is generally much smaller than the ambient dimension in which it is collected. Alongside their usual purpose of mapping data into a smaller-dimensional space with minimal information loss, dimensionality-reduction techniques implicitly or explicitly provide information about the dimension of the dataset. In this paper, we propose a new statistic that we call the kappa-profile for analysis of large datasets. The kappa-profile arises from a dimensionality-reduction optimization problem: namely that of finding a projection that optimally preserves the secants between points in the dataset. From this optimal projection we extract kappa, the norm of the shortest projected secant from among the set of all normalized secants. This kappa can be computed for any dimension k; thus the tuple of kappa values (indexed by dimension) becomes a kappa-profile. Algorithms such as the Secant-Avoidance Projection algorithm and the Hierarchical Secant-Avoidance Projection algorithm provide a computationally feasible means of estimating the kappa-profile for large datasets, and thus a method of understanding and monitoring their behavior. As we demonstrate in this paper, the kappa-profile serves as a useful statistic in several representative settings: weather data, soundscape data, and dynamical systems data.
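To make the definition of kappa concrete, the sketch below evaluates the norm of the shortest projected unit secant for a projection that is already given. It does not search for the optimal projection, which is the job of the Secant-Avoidance Projection algorithms named in the abstract. The function names and the toy data are assumptions made for this example, and the basis vectors are assumed to be orthonormal.

```python
import math
from itertools import combinations


def unit_secants(points):
    """All pairwise differences between distinct points, scaled to unit length."""
    secants = []
    for x, y in combinations(points, 2):
        diff = [a - b for a, b in zip(x, y)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0:  # skip duplicate points, which define no secant direction
            secants.append([d / norm for d in diff])
    return secants


def kappa_for_projection(points, basis):
    """Norm of the shortest projected unit secant.

    `basis` is a list of k orthonormal vectors spanning the target subspace
    (orthonormality is assumed, not checked).
    """
    shortest = float("inf")
    for s in unit_secants(points):
        coords = [sum(b_i * s_i for b_i, s_i in zip(b, s)) for b in basis]
        shortest = min(shortest, math.sqrt(sum(c * c for c in coords)))
    return shortest


# Three points lying in the xy-plane of a 3-dimensional ambient space.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
xy_plane = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # k = 2
x_axis = [[1.0, 0.0, 0.0]]                     # k = 1

kappa_2 = kappa_for_projection(points, xy_plane)  # every secant keeps its full length
kappa_1 = kappa_for_projection(points, x_axis)    # the secant along y collapses to zero
```

Here the value for k = 2 is close to 1 while the value for k = 1 drops to 0, which is the kind of change across dimensions that a kappa-profile is meant to expose. A true kappa-profile would use the optimal projection for each k rather than hand-picked subspaces.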
Full Text Available
Too many secants: a hierarchical approach to secant-based dimensionality reduction on large data sets

https://doi.org/10.1109/HPEC.2018.8547515

Kvinge, Henry; Farnell, Elin; Kirby, Michael; Peterson, Chris (September 2018, 2018 IEEE High Performance extreme Computing Conference (HPEC))

A fundamental question in many data analysis settings is the problem of discerning the “natural” dimension of a data set. That is, when a data set is drawn from a manifold (possibly with noise), a meaningful aspect of the data is the dimension of that manifold. Various approaches exist for estimating this dimension, such as the method of Secant-Avoidance Projection (SAP). Intuitively, the SAP algorithm seeks to determine a projection which best preserves the lengths of all secants between points in a data set; by applying the algorithm to find the best projections to vector spaces of various dimensions, one may infer the dimension of the manifold of origination. That is, one may learn the dimension at which it is possible to construct a diffeomorphic copy of the data in a lower-dimensional Euclidean space. Using Whitney's embedding theorem, we can relate this information to the natural dimension of the data. A drawback of the SAP algorithm is that a data set with T points has $O(T^{2})$ secants, making the computation and storage of all secants infeasible for very large data sets. In this paper, we propose a novel algorithm that generalizes the SAP algorithm with an emphasis on addressing this issue. That is, we propose a hierarchical secant-based dimensionality-reduction method, which can be employed for data sets where explicitly calculating all secants is not feasible.
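The quadratic growth mentioned in the abstract can be checked with a short calculation: T points define T(T-1)/2 unordered pairs, hence that many secants. The sketch below counts them and estimates the storage needed to hold every secant explicitly. The point count, ambient dimension, and 8-byte floating-point size are hypothetical values chosen for illustration and do not come from the paper.

```python
def secant_count(num_points):
    """Number of unordered point pairs, i.e. secants, among `num_points` points."""
    return num_points * (num_points - 1) // 2


def secant_storage_bytes(num_points, ambient_dim, bytes_per_value=8):
    """Bytes needed to store every secant as a dense vector (assumed 8-byte floats)."""
    return secant_count(num_points) * ambient_dim * bytes_per_value


# Hypothetical data set: one million points in a 100-dimensional ambient space.
T = 1_000_000
n = 100
total_secants = secant_count(T)
total_terabytes = secant_storage_bytes(T, n) / 1e12
```

Under these assumed sizes the secant set would occupy roughly 400 terabytes, which illustrates why an approach that avoids materializing all secants is attractive.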
Full Text Available
