Title: Modeling Label Space Interactions in Multi-label Classification using Box Embeddings
Multi-label classification is a challenging structured prediction task in which a set of output class labels is predicted for each input. Real-world datasets often have natural or latent taxonomic relationships between labels, making it desirable for models to employ label representations capable of capturing such taxonomies. Most existing multi-label classification methods do not do so, resulting in label predictions that are inconsistent with the taxonomic constraints and thus fail to accurately represent the fundamentals of the problem setting. In this work, we introduce the multi-label box model (MBM), a multi-label classification method that combines the encoding power of neural networks with the inductive bias and probabilistic semantics of box embeddings (Vilnis et al., 2018). Box embeddings can be understood as trainable Venn diagrams based on hyper-rectangles. By representing labels as boxes rather than vectors, MBM is able to capture taxonomic relations among labels. Furthermore, since box embeddings allow these relations to be learned from data by stochastic gradient descent and to be read as calibrated conditional probabilities, our model is endowed with a high degree of interpretability. This interpretability also facilitates the injection of partial information about label-label relationships into model training to further improve its consistency. We provide theoretical grounding for our method and show experimentally the model's ability to learn the true latent taxonomic structure from data. Through extensive empirical evaluations on both small and large-scale multi-label classification datasets, we show that MBM can significantly improve taxonomic consistency while preserving or surpassing state-of-the-art predictive performance.
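To make the "trainable Venn diagram" reading concrete, the following is a minimal sketch of how a conditional probability between two labels could be read off a pair of boxes as a ratio of volumes. It uses hard, axis-aligned boxes with fixed coordinates, which is a simplifying assumption and not the training formulation of the paper; the class `Box`, the function `conditional_prob`, and the labels `animal`, `dog`, and `cat` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned hyper-rectangle given by its lower and upper corners."""
    lo: Tuple[float, ...]
    hi: Tuple[float, ...]

    def volume(self) -> float:
        # Side lengths are clamped at zero so that an empty box has volume 0.
        return prod(max(0.0, h - l) for l, h in zip(self.lo, self.hi))

    def intersect(self, other: "Box") -> "Box":
        # The intersection of two boxes is again a box (possibly empty).
        lo = tuple(max(a, b) for a, b in zip(self.lo, other.lo))
        hi = tuple(min(a, b) for a, b in zip(self.hi, other.hi))
        return Box(lo, hi)


def conditional_prob(a: Box, b: Box) -> float:
    """Read P(a | b) as vol(a intersect b) / vol(b)."""
    vol_b = b.volume()
    if vol_b == 0.0:
        raise ValueError("conditioning box has zero volume")
    return a.intersect(b).volume() / vol_b


# Hypothetical 2-D label boxes: `dog` and `cat` lie inside `animal`
# and do not overlap each other.
animal = Box((0.0, 0.0), (1.0, 1.0))
dog = Box((0.2, 0.2), (0.6, 0.7))
cat = Box((0.7, 0.1), (0.9, 0.5))

print("P(animal | dog) =", conditional_prob(animal, dog))
print("P(dog | animal) =", conditional_prob(dog, animal))
print("P(dog | cat)    =", conditional_prob(dog, cat))
```

In this toy setup, containment of one box in another corresponds to a taxonomic "is-a" relation, and disjoint boxes correspond to mutually exclusive labels. Hard boxes give zero gradient once two boxes are disjoint, so a model trained by gradient descent would need a smoothed volume in place of the clamped one used here.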
Award ID(s):
2106391
NSF-PAR ID:
10392233
Journal Name:
ICLR 2022 Poster
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Multi-label classification is the challenging task of predicting the presence and absence of multiple targets, involving representation learning and label correlation modeling. We propose a novel framework for multi-label classification, Multivariate Probit Variational AutoEncoder (MPVAE), that effectively learns latent embedding spaces as well as label correlations. MPVAE learns and aligns two probabilistic embedding spaces for labels and features respectively. The decoder of MPVAE takes in the samples from the embedding spaces and models the joint distribution of output targets under a Multivariate Probit model by learning a shared covariance matrix. We show that MPVAE outperforms the existing state-of-the-art methods on important computational sustainability applications as well as on other application domains, using public real-world datasets. MPVAE is further shown to remain robust under noisy settings. Lastly, we demonstrate the interpretability of the learned covariance by a case study on a bird observation dataset.
  2. Understanding how environmental characteristics affect biodiversity patterns, from individual species to communities of species, is critical for mitigating effects of global change. A central goal for conservation planning and monitoring is the ability to accurately predict the occurrence of species communities and how these communities change over space and time. This in turn leads to a challenging and long-standing problem in the field of computer science: how to perform accurate multi-label classification with hundreds of labels? The key challenge of this problem is its exponential-sized output space with regard to the number of labels to be predicted. Therefore, it is essential to facilitate the learning process by exploiting correlations (or dependencies) among labels. Previous methods mostly focus on modelling the correlation on label pairs; however, complex relations between real-world objects often go beyond second order. In this paper, we propose a novel framework for multi-label classification, High-order Tie-in Variational Autoencoder (HOT-VAE), which performs adaptive high-order label correlation learning. We experimentally verify that our model outperforms the existing state-of-the-art approaches on a bird distribution dataset on both conventional F1 scores and a variety of ecological metrics. To show our method is general, we also perform empirical analysis on seven other public real-world datasets in several application domains, and HOT-VAE exhibits superior performance to previous methods.
  3. We explore the effect of auxiliary labels in improving the classification accuracy of wearable sensor-based human activity recognition (HAR) systems, which are primarily trained with the supervision of the activity labels (e.g., running, walking, jumping). Supplemental meta-data are often available during the data collection process, such as body positions of the wearable sensors, subjects' demographic information (e.g., gender, age), and the type of wearable used (e.g., smartphone, smart-watch). This information, while not directly related to the activity classification task, can nonetheless provide auxiliary supervision and has the potential to significantly improve HAR accuracy by providing extra guidance on how to handle the sample heterogeneity introduced by changes in domain (i.e., positions, persons, or sensors), especially when activity labels are limited. However, integrating such meta-data into the classification pipeline is non-trivial: (i) the complex interaction between the activity and domain label spaces is hard to capture with a simple multi-task and/or adversarial learning setup, and (ii) meta-data and activity labels might not be simultaneously available for all collected samples. To address these issues, we propose a novel framework, Conditional Domain Embeddings (CoDEm). From the available unlabeled raw samples and their domain meta-data, we first learn a set of domain embeddings using a contrastive learning methodology to handle inter-domain variability and inter-domain similarity. To classify the activities, CoDEm then learns the label embeddings in a contrastive fashion, conditioned on domain embeddings with a novel attention mechanism, forcing the model to learn the complex domain-activity relationships. We extensively evaluate CoDEm on three benchmark datasets against a number of multi-task and adversarial learning baselines and achieve state-of-the-art performance on each.
  4. Multi-label classification (MLC), which assigns multiple labels to each instance, is crucial to domains from computer vision to text mining. Conventional methods for MLC require huge amounts of labeled data to capture complex dependencies between labels. However, such labeled datasets are expensive, or even impossible, to acquire. Worse yet, these pre-trained MLC models can only be used for the particular label set covered in the training data. Despite this severe limitation, few methods exist for expanding the set of labels predicted by pre-trained models. Instead, we acquire vast amounts of new labeled data and retrain a new model from scratch. Here, we propose combining the knowledge from multiple pre-trained models (teachers) to train a new student model that covers the union of the labels predicted by this set of teachers. This student supports a broader label set than any one of its teachers without using labeled data. We call this new problem knowledge amalgamation for multi-label classification. Our new method, Adaptive KNowledge Transfer (ANT), trains a student by learning from each teacher’s partial knowledge of label dependencies to infer the global dependencies between all labels across the teachers. We show that ANT succeeds in unifying label dependencies among teachers, outperforming five state-of-the-art methods on eight real-world datasets. 
  5. A key assumption in multi-task learning is that, at inference time, the multi-task model only has access to a given data point but not to the data point's labels from other tasks. This presents an opportunity to extend multi-task learning to utilize a data point's labels from other auxiliary tasks, and thereby improve performance on the new task. Here we introduce a novel relational multi-task learning setting where we leverage data point labels from auxiliary tasks to make more accurate predictions on the new task. We develop MetaLink, where our key innovation is to build a knowledge graph that connects data points and tasks and thus allows us to leverage labels from auxiliary tasks. The knowledge graph consists of two types of nodes: (1) data nodes, where node features are data embeddings computed by the neural network, and (2) task nodes, with the last layer's weights for each task as node features. The edges in this knowledge graph capture data-task relationships, and the edge label captures the label of a data point on a particular task. Under MetaLink, we reformulate the new task as a link label prediction problem between a data node and a task node. The MetaLink framework provides flexibility to model knowledge transfer from auxiliary task labels to the task of interest. We evaluate MetaLink on 6 benchmark datasets in both biochemical and vision domains. Experiments demonstrate that MetaLink can successfully utilize the relations among different tasks, outperforming the state-of-the-art methods under the proposed relational multi-task learning setting, with up to 27% improvement in ROC AUC.