Title: Linear and Deep Order-Preserving Wasserstein Discriminant Analysis
Supervised dimensionality reduction for sequence data learns a transformation that maps the observations in sequences onto a low-dimensional subspace by maximizing the separability of sequences in different classes. It is typically more challenging than conventional dimensionality reduction for static data, because measuring the separability of sequences involves non-linear procedures to manipulate the temporal structures. In this paper, we propose a linear method, called Order-preserving Wasserstein Discriminant Analysis (OWDA), and its deep extension, namely DeepOWDA, to learn linear and non-linear discriminative subspaces for sequence data, respectively. We construct novel separability measures between sequence classes based on the order-preserving Wasserstein (OPW) distance to capture the essential differences among their temporal structures. Specifically, for each class, we extract the OPW barycenter and construct the intra-class scatter as the dispersion of the training sequences around the barycenter. The inter-class distance is measured as the OPW distance between the corresponding barycenters. We learn the linear and non-linear transformations by maximizing the inter-class distance and minimizing the intra-class scatter. In this way, the proposed OWDA and DeepOWDA are able to concentrate on the distinctive differences among classes by lifting the geometric relations with temporal constraints. Experiments on four 3D action recognition datasets show the effectiveness of OWDA and DeepOWDA.
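
In symbols, one natural reading of this objective (the notation below is ours; the paper's precise formulation may combine the two terms differently): writing W for the learned transformation, \mathcal{C}_c for the training sequences of class c, and b_c(W) for the OPW barycenter of class c in the projected space, the transformation is chosen as

    \max_{W} \; J(W) \;=\; \frac{\sum_{c < c'} \mathrm{OPW}\big(b_c(W),\, b_{c'}(W)\big)}{\sum_{c} \sum_{X \in \mathcal{C}_c} \mathrm{OPW}\big(XW,\, b_c(W)\big)},

with OWDA restricting W to a linear map and DeepOWDA replacing XW by a non-linear network f_\theta(X).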
Award ID(s):
1815561
NSF-PAR ID:
10279302
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE Transactions on Pattern Analysis and Machine Intelligence
ISSN:
0162-8828
Page Range / eLocation ID:
1 to 1
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Supervised dimensionality reduction for sequence data projects the observations in sequences onto a low-dimensional subspace to better separate different sequence classes. It is typically more challenging than conventional dimensionality reduction for static data, because measuring the separability of sequences involves non-linear procedures to manipulate the temporal structures. This paper presents a linear method, namely Order-preserving Wasserstein Discriminant Analysis (OWDA), which learns the projection by maximizing the inter-class distance and minimizing the intra-class scatter. For each class, OWDA extracts the order-preserving Wasserstein barycenter and constructs the intra-class scatter as the dispersion of the training sequences around the barycenter. The inter-class distance is measured as the order-preserving Wasserstein distance between the corresponding barycenters. OWDA is able to concentrate on the distinctive differences among classes by lifting the geometric relations with temporal constraints. Experiments show that OWDA achieves competitive results on three 3D action recognition datasets. 
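
    A minimal numerical sketch of the core primitive described above (our illustration, not the paper's code): entropic optimal transport between two sequences, with a Gaussian prior that concentrates transport mass near the temporal diagonal i/m ≈ j/n. The published OPW distance uses more specific priors and regularizers; the hyper-parameters lam and sigma here are placeholders.

      import numpy as np

      def opw_distance_sketch(X, Y, lam=1.0, sigma=1.0, n_iter=200):
          # X: (m, d) sequence, Y: (n, d) sequence.
          m, n = len(X), len(Y)
          C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)   # frame-wise cost
          i = (np.arange(m)[:, None] + 1.0) / m                # normalized time in X
          j = (np.arange(n)[None, :] + 1.0) / n                # normalized time in Y
          prior = np.exp(-(i - j) ** 2 / (2 * sigma ** 2))     # favor the temporal diagonal
          K = prior * np.exp(-C / lam)                         # order-aware Gibbs kernel
          a, b = np.full(m, 1.0 / m), np.full(n, 1.0 / n)      # uniform marginals
          u = np.ones(m)
          for _ in range(n_iter):                              # Sinkhorn fixed-point updates
              v = b / (K.T @ u)
              u = a / (K @ v)
          T = u[:, None] * K * v[None, :]                      # transport plan
          return float((T * C).sum())

    The per-class barycenters, the intra-class scatter, and the inter-class distances in OWDA would then be built on top of such a distance.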
  2. Low-dimensional discriminative representations enhance machine learning methods in both performance and complexity. This has motivated supervised dimensionality reduction (DR), which transforms high-dimensional data into a discriminative subspace. Most DR methods require data to be i.i.d. However, in some domains, data naturally appear in sequences, where the observations are temporally correlated. We propose a DR method, namely, latent temporal linear discriminant analysis (LT-LDA), to learn low-dimensional temporal representations. We construct the separability among sequence classes by lifting the holistic temporal structures, which are established based on temporal alignments and may change in different subspaces. We jointly learn the subspace and the associated latent alignments by optimizing an objective that favors easily separable temporal structures. We show that this objective is connected to the inference of alignments and thus allows for an iterative solution. We provide both theoretical insight and empirical evaluations on several real-world sequence datasets to show the applicability of our method. 
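
    A schematic of the iterative solution alluded to above, under our own simplifying assumptions (DTW alignments to a fixed per-class template and a Fisher-style generalized eigenproblem; the paper's actual objective and alignment inference differ in detail):

      import numpy as np

      def dtw_path(A, B):
          # Classic dynamic-time-warping alignment path between frame sequences.
          m, n = len(A), len(B)
          D = np.full((m + 1, n + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, m + 1):
              for j in range(1, n + 1):
                  cost = np.sum((A[i - 1] - B[j - 1]) ** 2)
                  D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          path, i, j = [], m, n
          while i > 1 or j > 1:                     # backtrack from the last cell
              path.append((i - 1, j - 1))
              moves = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
              i, j = min((p for p in moves if p[0] >= 1 and p[1] >= 1),
                         key=lambda p: D[p])
          path.append((0, 0))
          return path[::-1]

      def lt_lda_sketch(seqs, labels, dim, n_outer=5):
          # seqs: list of (T_i, d) arrays; labels: list of class ids.
          d = seqs[0].shape[1]
          W = np.eye(d)[:, :dim]                    # initial projection
          templates = {}
          for X, y in zip(seqs, labels):            # one fixed template per class
              templates.setdefault(y, X)
          for _ in range(n_outer):
              # Step 1: infer latent alignments in the current subspace.
              pairs = [(X[i], templates[y][j], y)
                       for X, y in zip(seqs, labels)
                       for i, j in dtw_path(X @ W, templates[y] @ W)]
              # Step 2: scatter matrices from aligned frames, then eigenproblem.
              Sw = sum(np.outer(x - t, x - t) for x, t, _ in pairs)
              mus = {c: np.mean([x for x, _, y in pairs if y == c], axis=0)
                     for c in templates}
              mu = np.mean(list(mus.values()), axis=0)
              Sb = sum(np.outer(mus[c] - mu, mus[c] - mu) for c in mus)
              evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
              W = np.real(evecs[:, np.argsort(-np.real(evals))[:dim]])
          return W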
  3. Persistent Homology (PH) is a method of Topological Data Analysis that analyzes the topological structure of data to help data scientists infer relationships in the data to assist in informed decision-making. A significant component in the computation of PH is the construction and use of a complex that represents the topological structure of the data. Some complex types are fast to construct but space-inefficient, whereas others are costly to construct and space-efficient. Unfortunately, existing complex types are not both fast to construct and compact. This paper works to increase the scope of PH to support the computation of low-dimensional homologies (H0–H10) in high-dimensional, big data. In particular, this paper exploits the desirable properties of the Vietoris–Rips Complex (VR-Complex) and the Delaunay Complex in order to construct a sparsified complex. The VR-Complex uses a distance matrix to quickly generate a complex up to the desired homology dimension. In contrast, the Delaunay Complex works at the dimensionality of the data to generate a sparsified complex. While construction of the VR-Complex is fast, its size grows exponentially with the size and dimension of the data set; in contrast, the Delaunay Complex is significantly smaller for any given data dimension. However, its construction requires the computation of a Delaunay Triangulation, which has high computational complexity. As a result, it is difficult to construct a Delaunay Complex for data in dimensions d > 6 that contains more than a few hundred points. The techniques in this paper enable the computation of a topology-preserving sparsification of k-simplices (where k ≪ d) to quickly generate a reduced, sparsified complex sufficient to compute homologies up to dimension k, irrespective of the data dimensionality d.
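
    For context, the standard (unsparsified) VR-Complex pipeline that this paper improves on can be run with the GUDHI library; a minimal example on synthetic data, with illustrative parameter values:

      import numpy as np
      import gudhi  # pip install gudhi

      rng = np.random.default_rng(0)
      points = rng.normal(size=(200, 8))   # 200 points in ambient dimension d = 8

      k = 2                                # target homology dimension, k << d
      rips = gudhi.RipsComplex(points=points, max_edge_length=2.5)
      st = rips.create_simplex_tree(max_dimension=k + 1)  # simplices up to dim k+1
      diagram = st.persistence()           # (dimension, (birth, death)) pairs
      print(st.num_simplices(), "simplices in the VR-Complex")

    The combinatorial blow-up the abstract describes is visible directly in num_simplices(); the paper's contribution is a Delaunay-informed sparsification that keeps this count manageable while still supporting homology computation up to dimension k.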
  4. Most applications of multispectral imaging are explicitly or implicitly dependent on the dimensionality and topology of the spectral mixing space. Mixing space characterization refers to the identification of salient properties of the set of pixel reflectance spectra comprising an image (or compilation of images). The underlying premise is that this set of spectra may be described as a low-dimensional manifold embedded in a high-dimensional vector space. Traditional mixing space characterization uses the linear dimensionality reduction offered by Principal Component Analysis to find projections of pixel spectra onto orthogonal linear subspaces, prioritized by variance. Here, we consider the potential for recent advances in nonlinear dimensionality reduction (specifically, manifold learning) to contribute additional useful information for multispectral mixing space characterization. We integrate linear and nonlinear methods through a novel approach called Joint Characterization (JC). JC comprises two components. First, spectral mixture analysis (SMA) linearly projects the high-dimensional reflectance vectors onto a 2D subspace comprising the primary mixing continuum of substrates, vegetation, and dark features (e.g., shadow and water). Second, manifold learning nonlinearly maps the high-dimensional reflectance vectors into a low-dimensional embedding space while preserving manifold topology. The SMA output is physically interpretable in terms of material abundances. The manifold learning output is not generally physically interpretable, but more faithfully preserves high-dimensional connectivity and clustering within the mixing space. Used together, the strengths of SMA may compensate for the limitations of manifold learning, and vice versa. Here, we illustrate JC through application to thematic compilations of 90 Sentinel-2 reflectance images selected from a diverse set of biomes and land cover categories. Specifically, we use globally standardized Substrate, Vegetation, and Dark (S, V, D) endmembers (EMs) for SMA, and Uniform Manifold Approximation and Projection (UMAP) for manifold learning. The value of each (SVD and UMAP) model is illustrated, both separately and jointly. JC is shown to successfully characterize both continuous gradations (spectral mixing trends) and discrete clusters (land cover class distinctions) within the spectral mixing space of each land cover category. These features are not clearly identifiable from SVD fractions alone, and not physically interpretable from UMAP alone. Implications are discussed for the design of models which can reliably extract and explainably use high-dimensional spectral information in spatially mixed pixels, a principal challenge in optical remote sensing.
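
    The two JC components described above reduce to a few library calls; a minimal sketch with synthetic spectra (the study uses Sentinel-2 reflectance and globally standardized S, V, D endmembers, for which the random matrices below are mere stand-ins):

      import numpy as np
      from scipy.optimize import nnls   # non-negative least squares
      import umap                       # umap-learn package

      rng = np.random.default_rng(1)
      R = rng.uniform(0.0, 1.0, size=(500, 10))   # 500 pixels x 10 spectral bands

      # Component 1: linear spectral mixture analysis against 3 endmembers.
      E = rng.uniform(0.0, 1.0, size=(10, 3))     # stand-in for S, V, D endmember spectra
      fractions = np.array([nnls(E, r)[0] for r in R])  # interpretable abundances

      # Component 2: nonlinear manifold learning, preserving mixing-space topology.
      embedding = umap.UMAP(n_components=2, random_state=1).fit_transform(R)

      # Joint characterization: read clusters and gradations in `embedding`
      # against the material `fractions` computed for the same pixels.
      print(fractions.shape, embedding.shape)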
  5. We consider the problem of estimating the Wasserstein distance between the empirical measure and a set of probability measures whose expectations over a class of functions (hypothesis class) are constrained. If this class is sufficiently rich to characterize a particular distribution (e.g., all Lipschitz functions), then our formulation recovers the Wasserstein distance to such a distribution. We establish a strong duality result that generalizes the celebrated Kantorovich-Rubinstein duality. We also show that our formulation can be used to beat the curse of dimensionality, which is well known to affect the rates of statistical convergence of the empirical Wasserstein distance. In particular, examples of infinite-dimensional hypothesis classes are presented, informed by a complex correlation structure, for which it is shown that the empirical Wasserstein distance to such classes converges to zero at the standard parametric rate. Our formulation provides insights that help clarify why, despite the curse of dimensionality, the Wasserstein distance enjoys favorable empirical performance across a wide range of statistical applications.
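
    For reference, the classical Kantorovich-Rubinstein duality being generalized reads, in standard notation,

      W_1(\mu, \nu) \;=\; \sup_{\|f\|_{\mathrm{Lip}} \le 1} \Big\{ \int f \, d\mu - \int f \, d\nu \Big\},

    and the quantity studied in the paper is, schematically (our notation),

      \min_{\nu \,:\, \mathbb{E}_\nu[f] \,=\, c_f \;\forall f \in \mathcal{F}} \; W\big(\hat{\mu}_n, \nu\big),

    i.e., the Wasserstein projection of the empirical measure \hat{\mu}_n onto the set of measures whose expectations over the hypothesis class \mathcal{F} match prescribed values. When \mathcal{F} is rich enough to pin down a single distribution, this reduces to an ordinary Wasserstein distance, as the abstract notes.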