skip to main content

Title: CTRL: Closed-Loop Transcription to an LDR via Minimaxing Rate Reduction
This work proposes a new computational framework for learning a structured generative model for real-world datasets. In particular, we propose to learn a Closed-loop Transcriptionbetween a multi-class, multi-dimensional data distribution and a Linear discriminative representation (CTRL) in the feature space that consists of multiple independent multi-dimensional linear subspaces. In particular, we argue that the optimal encoding and decoding mappings sought can be formulated as a two-player minimax game between the encoder and decoderfor the learned representation. A natural utility function for this game is the so-called rate reduction, a simple information-theoretic measure for distances between mixtures of subspace-like Gaussians in the feature space. Our formulation draws inspiration from closed-loop error feedback from control systems and avoids expensive evaluating and minimizing of approximated distances between arbitrary distributions in either the data space or the feature space. To a large extent, this new formulation unifies the concepts and benefits of Auto-Encoding and GAN and naturally extends them to the settings of learning a both discriminative and generative representation for multi-class and multi-dimensional real-world data. Our extensive experiments on many benchmark imagery datasets demonstrate tremendous potential of this new closed-loop formulation: under fair comparison, visual quality of the learned decoder and classification performance more » of the encoder is competitive and arguably better than existing methods based on GAN, VAE, or a combination of both. Unlike existing generative models, the so-learned features of the multiple classes are structured instead of hidden: different classes are explicitly mapped onto corresponding independent principal subspaces in the feature space, and diverse visual attributes within each class are modeled by the independent principal components within each subspace. « less
; ; ; ; ; ; ; ; ; ;
Award ID(s):
Publication Date:
Journal Name:
Page Range or eLocation-ID:
Sponsoring Org:
National Science Foundation
More Like this
  1. Learning discriminative node representations benefits various downstream tasks in graph analysis such as community detection and node classification. Existing graph representation learning methods (e.g., based on random walk and contrastive learning) are limited to maximizing the local similarity of connected nodes. Such pair-wise learning schemes could fail to capture the global distribution of representations, since it has no explicit constraints on the global geometric properties of representation space. To this end, we propose Geometric Graph Representation Learning (G2R) to learn node representations in an unsupervised manner via maximizing rate reduction. In this way, G2R maps nodes in distinct groups (implicitly stored in the adjacency matrix) into different subspaces, while each subspace is compact and different subspaces are dispersedly distributed. G2R adopts a graph neural network as the encoder and maximizes the rate reduction with the adjacency matrix. Furthermore, we theoretically and empirically demonstrate that rate reduction maximization is equivalent to maximizing the principal angles between different subspaces. Experiments on real-world datasets show that G2R outperforms various baselines on node classification and community detection tasks.
  2. Convolutional neural networks (CNNs) have become a key asset to most of fields in AI. Despite their successful performance, CNNs suffer from a major drawback. They fail to capture the hierarchy of spatial relation among different parts of an entity. As a remedy to this problem, the idea of capsules was proposed by Hinton. In this paper, we propose the SubSpace Capsule Network (SCN) that exploits the idea of capsule networks to model possible variations in the appearance or implicitly-defined properties of an entity through a group of capsule subspaces instead of simply grouping neurons to create capsules. A capsule is created by projecting an input feature vector from a lower layer onto the capsule subspace using a learnable transformation. This transformation finds the degree of alignment of the input with the properties modeled by the capsule subspace.We show that SCN is a general capsule network that can successfully be applied to both discriminative and generative models without incurring computational overhead compared to CNN during test time. Effectiveness of SCN is evaluated through a comprehensive set of experiments on supervised image classification, semi-supervised image classification and high-resolution image generation tasks using the generative adversarial network (GAN) framework. SCN significantly improvesmore »the performance of the baseline models in all 3 tasks.« less
  3. A spatial accelerator’s efficiency depends heavily on both its mapper and cost models to generate optimized mappings for various operators of DNN models. However, existing cost models lack a formal boundary over their input programs (operators) for accurate and tractable cost analysis of the mappings, and this results in adaptability challenges to the cost models for new operators. We consider the recently introduced Maestro Data-Centric (MDC) notation and its analytical cost model to address this challenge because any mapping expressed in the notation is precisely analyzable using the MDC’s cost model. In this article, we characterize the set of input operators and their mappings expressed in the MDC notation by introducing a set of conformability rules . The outcome of these rules is that any loop nest that is perfectly nested with affine tensor subscripts and without conditionals is conformable to the MDC notation. A majority of the primitive operators in deep learning are such loop nests. In addition, our rules enable us to automatically translate a mapping expressed in the loop nest form to MDC notation and use the MDC’s cost model to guide upstream mappers. Our conformability rules over the input operators result in a structured mapping spacemore »of the operators, which enables us to introduce a mapper based on our decoupled off-chip/on-chip approach to accelerate mapping space exploration. Our mapper decomposes the original higher-dimensional mapping space of operators into two lower-dimensional off-chip and on-chip subspaces and then optimizes the off-chip subspace followed by the on-chip subspace. We implemented our overall approach in a tool called Marvel , and a benefit of our approach is that it applies to any operator conformable with the MDC notation. We evaluated Marvel over major DNN operators and compared it with past optimizers.« less
  4. Soft biometric modalities have shown their utility in different applications including reducing the search space significantly. This leads to improved recognition performance, reduced computation time, and faster processing of test samples. Some common soft biometric modalities are ethnicity, gender, age, hair color, iris color, presence of facial hair or moles, and markers. This research focuses on performing ethnicity and gender classification on iris images. We present a novel supervised autoencoder based approach, Deep Class-Encoder, which uses class labels to learn discriminative representation for the given sample by mapping the learned feature vector to its label. The proposed model is evaluated on two datasets each for ethnicity and gender classification. The results obtained using the proposed Deep Class-Encoder demonstrate its effectiveness in comparison to existing approaches and state-of-the-art methods.
  5. We extend the self-organizing mapping algorithm to the problem of visualizing data on Grassmann manifolds. In this setting, a collection of k points in n-dimensions is represented by a k-dimensional subspace, e.g., via the singular value or QR-decompositions. Data assembled in this way is challenging to visualize given abstract points on the Grassmannian do not reside in Euclidean space. The extension of the SOM algorithm to this geometric setting only requires that distances between two points can be measured and that any given point can be moved towards a presented pattern. The similarity between two points on the Grassmannian is measured in terms of the principal angles between subspaces, e.g., the chordal distance. Further, we employ a formula for moving one subspace towards another along the shortest path, i.e., the geodesic between two points on the Grassmannian. This enables a faithful implementation of the SOM approach for visualizing data consisting of k-dimensional subspaces of n-dimensional Euclidean space. We illustrate the resulting algorithm on a hyperspectral imaging application.