skip to main content


Title: Learning Low-Dimensional Temporal Representations with Latent Alignments
Low-dimensional discriminative representations enhance machine learning methods in both performance and complexity. This has motivated supervised dimensionality reduction (DR), which transforms high-dimensional data into a discriminative subspace. Most DR methods require data to be i.i.d. However, in some domains, data naturally appear in sequences, where the observations are temporally correlated. We propose a DR method, namely, latent temporal linear discriminant analysis (LT-LDA), to learn low-dimensional temporal representations. We construct the separability among sequence classes by lifting the holistic temporal structures, which are established based on temporal alignments and may change in different subspaces. We jointly learn the subspace and the associated latent alignments by optimizing an objective that favors easily separable temporal structures. We show that this objective is connected to the inference of alignments and thus allows for an iterative solution. We provide both theoretical insight and empirical evaluations on several real-world sequence datasets to show the applicability of our method.  more » « less
Award ID(s):
1815561
NSF-PAR ID:
10105065
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE Transactions on Pattern Analysis and Machine Intelligence
ISSN:
0162-8828
Page Range / eLocation ID:
1 to 1
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Supervised dimensionality reduction for sequence data learns a transformation that maps the observations in sequences onto a low-dimensional subspace by maximizing the separability of sequences in different classes. It is typically more challenging than conventional dimensionality reduction for static data, because measuring the separability of sequences involves non-linear procedures to manipulate the temporal structures. In this paper, we propose a linear method, called Order-preserving Wasserstein Discriminant Analysis (OWDA), and its deep extension, namely DeepOWDA, to learn linear and non-linear discriminative subspace for sequence data, respectively. We construct novel separability measures between sequence classes based on the order-preserving Wasserstein (OPW) distance to capture the essential differences among their temporal structures. Specifically, for each class, we extract the OPW barycenter and construct the intra-class scatter as the dispersion of the training sequences around the barycenter. The inter-class distance is measured as the OPW distance between the corresponding barycenters. We learn the linear and non-linear transformations by maximizing the inter-class distance and minimizing the intra-class scatter. In this way, the proposed OWDA and DeepOWDA are able to concentrate on the distinctive differences among classes by lifting the geometric relations with temporal constraints. Experiments on four 3D action recognition datasets show the effectiveness of OWDA and DeepOWDA. 
    more » « less
  2. Abstract

    Objective.Learning dynamical latent state models for multimodal spiking and field potential activity can reveal their collective low-dimensional dynamics and enable better decoding of behavior through multimodal fusion. Toward this goal, developing unsupervised learning methods that are computationally efficient is important, especially for real-time learning applications such as brain–machine interfaces (BMIs). However, efficient learning remains elusive for multimodal spike-field data due to their heterogeneous discrete-continuous distributions and different timescales.Approach.Here, we develop a multiscale subspace identification (multiscale SID) algorithm that enables computationally efficient learning for modeling and dimensionality reduction for multimodal discrete-continuous spike-field data. We describe the spike-field activity as combined Poisson and Gaussian observations, for which we derive a new analytical SID method. Importantly, we also introduce a novel constrained optimization approach to learn valid noise statistics, which is critical for multimodal statistical inference of the latent state, neural activity, and behavior. We validate the method using numerical simulations and with spiking and local field potential population activity recorded during a naturalistic reach and grasp behavior.Main results.We find that multiscale SID accurately learned dynamical models of spike-field signals and extracted low-dimensional dynamics from these multimodal signals. Further, it fused multimodal information, thus better identifying the dynamical modes and predicting behavior compared to using a single modality. Finally, compared to existing multiscale expectation-maximization learning for Poisson–Gaussian observations, multiscale SID had a much lower training time while being better in identifying the dynamical modes and having a better or similar accuracy in predicting neural activity and behavior.Significance.Overall, multiscale SID is an accurate learning method that is particularly beneficial when efficient learning is of interest, such as for online adaptive BMIs to track non-stationary dynamics or for reducing offline training time in neuroscience investigations.

     
    more » « less
  3. To learn intrinsic low-dimensional structures from high-dimensional data that most discriminate between classes, we propose the principle of Maximal Coding Rate Reduction (MCR2), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class. We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive and contrastive learning, and provide theoretical guarantees for learning diverse and discriminative features. The coding rate can be accurately computed from finite samples of degenerate subspace-like distributions and can learn intrinsic representations in supervised, self-supervised, and unsupervised settings in a unified manner. Empirically, the representations learned using this principle alone are significantly more robust to label corruptions in classification than those using cross-entropy, and can lead to state-of-the-art results in clustering mixed data from self-learned invariant features. 
    more » « less
  4. Abstract Background

    In Alzheimer’s Diseases (AD) research, multimodal imaging analysis can unveil complementary information from multiple imaging modalities and further our understanding of the disease. One application is to discover disease subtypes using unsupervised clustering. However, existing clustering methods are often applied to input features directly, and could suffer from the curse of dimensionality with high-dimensional multimodal data. The purpose of our study is to identify multimodal imaging-driven subtypes in Mild Cognitive Impairment (MCI) participants using a multiview learning framework based on Deep Generalized Canonical Correlation Analysis (DGCCA), to learn shared latent representation with low dimensions from 3 neuroimaging modalities.

    Results

    DGCCA applies non-linear transformation to input views using neural networks and is able to learn correlated embeddings with low dimensions that capture more variance than its linear counterpart, generalized CCA (GCCA). We designed experiments to compare DGCCA embeddings with single modality features and GCCA embeddings by generating 2 subtypes from each feature set using unsupervised clustering. In our validation studies, we found that amyloid PET imaging has the most discriminative features compared with structural MRI and FDG PET which DGCCA learns from but not GCCA. DGCCA subtypes show differential measures in 5 cognitive assessments, 6 brain volume measures, and conversion to AD patterns. In addition, DGCCA MCI subtypes confirmed AD genetic markers with strong signals that existing late MCI group did not identify.

    Conclusion

    Overall, DGCCA is able to learn effective low dimensional embeddings from multimodal data by learning non-linear projections. MCI subtypes generated from DGCCA embeddings are different from existing early and late MCI groups and show most similarity with those identified by amyloid PET features. In our validation studies, DGCCA subtypes show distinct patterns in cognitive measures, brain volumes, and are able to identify AD genetic markers. These findings indicate the promise of the imaging-driven subtypes and their power in revealing disease structures beyond early and late stage MCI.

     
    more » « less
  5. Networked data involve complex information from multifaceted channels, including topology structures, node content, and/or node labels etc., where structure and content are often correlated but are not always consistent. A typical scenario is the citation relationships in scholarly publications where a paper is cited by others not because they have the same content, but because they share one or multiple subject matters. To date, while many network embedding methods exist to take the node content into consideration, they all consider node content as simple flat word/attribute set and nodes sharing connections are assumed to have dependency with respect to all words or attributes. In this paper, we argue that considering topic-level semantic interactions between nodes is crucial to learn discriminative node embedding vectors. In order to model pairwise topic relevance between linked text nodes, we propose topical network embedding, where interactions between nodes are built on the shared latent topics. Accordingly, we propose a unified optimization framework to simultaneously learn topic and node representations from the network text contents and structures, respectively. Meanwhile, the structure modeling takes the learned topic representations as conditional context under the principle that two nodes can infer each other contingent on the shared latent topics. Experiments on three real-world datasets demonstrate that our approach can learn significantly better network representations, i.e., 4.1% improvement over the state-of-the-art methods in terms of Micro-F1 on Cora dataset. (The source code of the proposed method is available through the github link: https:// github.com/codeshareabc/TopicalNE.) 
    more » « less