skip to main content


Title: Multi-Label Multi-Task Learning with Dynamic Task Weight Balancing
Data collected from real-world environments often contain multiple objects, scenes, and activities. In comparison to single-label problems, where each data sample only defines one concept, multi-label problems allow the co-existence of multiple concepts. To exploit the rich semantic information in real-world data, multi-label classification has seen many applications in a variety of domains. The traditional approaches to multi-label problems tend to have the side effects of increased memory usage, slow model inference speed, and most importantly the under-utilization of the dependency across concepts. In this paper, we adopt multi-task learning to address these challenges. Multi-task learning treats the learning of each concept as a separate job, while at the same time leverages the shared representations among all tasks. We also propose a dynamic task balancing method to automatically adjust the task weight distribution by taking both sample-level and task-level learning complexities into consideration. Our framework is evaluated on a disaster video dataset and the performance is compared with several state-of-the-art multi-label and multi-task learning techniques. The results demonstrate the effectiveness and supremacy of our approach.  more » « less
Award ID(s):
1952089
NSF-PAR ID:
10233987
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI)
Page Range / eLocation ID:
245 to 252
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Predicting the occurrence of a particular event of interest at future time points is the primary goal of survival analysis. The presence of incomplete observations due to time limitations or loss of data traces is known as censoring which brings unique challenges in this domain and differentiates survival analysis from other standard regression methods. The popularly used survival analysis methods such as Cox proportional hazard model and parametric survival regression suffer from some strict assumptions and hypotheses that are not realistic in most of the real-world applications. To overcome the weaknesses of these two types of methods, in this paper, we reformulate the survival analysis problem as a multi-task learning problem and propose a new multi-task learning based formulation to predict the survival time by estimating the survival status at each time interval during the study duration. We propose an indicator matrix to enable the multi-task learning algorithm to handle censored instances and incorporate some of the important characteristics of survival problems such as non-negative non-increasing list structure into our model through max-heap projection. We employ the L2,1-norm penalty which enables the model to learn a shared representation across related tasks and hence select important features and alleviate over-fitting in high-dimensional feature spaces; thus, reducing the prediction error of each task. To efficiently handle the two non-smooth constraints, in this paper, we propose an optimization method which employs Alternating Direction Method of Multipliers (ADMM) algorithm to solve the proposed multi-task learning problem. We demonstrate the performance of the proposed method using real-world microarray gene expression high-dimensional benchmark datasets and show that our method outperforms state-of-the-art methods. 
    more » « less
  2. Machine learning on graph structured data has attracted much research interest due to its ubiquity in real world data. However, how to efficiently represent graph data in a general way is still an open problem. Traditional methods use handcraft graph features in a tabular form but suffer from the defects of domain expertise requirement and information loss. Graph representation learning overcomes these defects by automatically learning the continuous representations from graph structures, but they require abundant training labels, which are often hard to fulfill for graph-level prediction problems. In this work, we demonstrate that, if available, the domain expertise used for designing handcraft graph features can improve the graph-level representation learning when training labels are scarce. Specifically, we proposed a multi-task knowledge distillation method. By incorporating network-theory-based graph metrics as auxiliary tasks, we show on both synthetic and real datasets that the proposed multi-task learning method can improve the prediction performance of the original learning task, especially when the training data size is small. 
    more » « less
  3. null (Ed.)
    Clustering is a machine learning paradigm of dividing sample subjects into a number of groups such that subjects in the same groups are more similar to those in other groups. With advances in information acquisition technologies, samples can frequently be viewed from different angles or in different modalities, generating multi-view data. Multi-view clustering, that clusters subjects into subgroups using multi-view data, has attracted more and more attentions. Although MVC methods have been developed rapidly, there has not been enough survey to summarize and analyze the current progress. Therefore, we propose a novel taxonomy of the MVC approaches. Similar with machine learning methods, we categorize them into generative and discriminative classes. In discriminative class, based on the way to integrate multiple views, we split it further into five groups: Common Eigenvector Matrix, Common Coefficient Matrix, Common Indicator Matrix, Direct Combination and Combination After Projection. Furthermore, we discuss the relationships between MVC and some related topics: multi-view representation, ensemble clustering, multi-task clustering, multi-view supervised and semi-supervised learning. Several representative real-world applications are elaborated for practitioners. Some commonly used multi-view datasets are introduced and several representative MVC algorithms from each group are run to conduct the comparison to analyze how and why they perform on those datasets. To promote future development of MVC approaches, we point out several open problems that may require further investigation and thorough examination. 
    more » « less
  4. Survival analysis aims at predicting time to event of interest along with its probability on longitudinal data. It is commonly used to make predictions for a single specific event of interest at a given time point. However, predicting the occurrence of multiple events simultaneously and dynamically is needed in many applications. An intuitive way to solve this problem is to simply apply the regular survival analysis method independently to each task at each time point. However, it often leads to a suboptimal solution since the underlying dependencies between tasks are ignored, which motivates us to analyze these tasks jointly to select common features shared across all tasks. In this paper, we formulate a temporal Multi-Task learning framework (MTMT) using tensor representation. More specifically, given a survival dataset and a sequence of time points, which are considered as the monitored time points, we model each task at each time point as a regular survival analysis problem and optimize them simultaneously. We demonstrate the performance of MTMT model on two real-world datasets. We show the superior performance of the MTMT model compared to several state-of-the-art models. We also provide the list of important features selected to demonstrate the interpretability of our model. 
    more » « less
  5. Abstract Motivation Detecting cancer gene expression and transcriptome changes with mRNA-sequencing (RNA-Seq) or array-based data are important for understanding the molecular mechanisms underlying carcinogenesis and cellular events during cancer progression. In previous studies, the differentially expressed genes were detected across patients in one cancer type. These studies ignored the role of mRNA expression changes in driving tumorigenic mechanisms that are either universal or specific in different tumor types. To address the problem, we introduce two network-based multi-task learning frameworks, NetML and NetSML, to discover common differentially expressed genes shared across different cancer types as well as differentially expressed genes specific to each cancer type. The proposed frameworks consider the common latent gene co-expression modules and gene-sample biclusters underlying the multiple cancer datasets to learn the knowledge crossing different tumor types. Results Large-scale experiments on simulations and real cancer high-throughput datasets validate that the proposed network-based multi-task learning frameworks perform better sample classification compared with the models without the knowledge sharing across different cancer types. The common and cancer specific molecular signatures detected by multi-task learning frameworks on TCGA ovarian cancer, breast cancer, and prostate cancer datasets are correlated with the known marker genes and enriched in cancer relevant KEGG pathways and Gene Ontology terms. Availability and Implementation Source code is available at: https://github.com/compbiolabucf/NetML Supplementary information Supplementary data are available at Bioinformatics 
    more » « less