Discriminative Topic Mining via Category-Name Guided Text Embedding

Meng, Yu; Huang, Jiaxin; Wang, Guangyuan; Wang, Zihan; Zhang, Chao; Zhang, Yu; Han, Jiawei

doi:10.1145/3366423.3380278

Citation Details

Mining a set of meaningful and distinctive topics automatically from massive text corpora has broad applications. Existing topic models, however, typically work in a purely unsupervised way and often generate topics that do not fit users’ particular needs and yield suboptimal performance on downstream tasks. We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora. This new task not only helps users clearly and distinctively understand the topics they are most interested in, but also directly benefits keyword-driven classification tasks. We develop CatE, a novel category-name guided text embedding method for discriminative topic mining, which effectively leverages minimal user guidance to learn a discriminative embedding space and discover category representative terms in an iterative manner. We conduct a comprehensive set of experiments to show that CatE mines a high-quality set of topics guided by category names only, and benefits a variety of downstream applications including weakly-supervised classification and lexical entailment direction identification.

Award ID(s): 1741317; 1618481; 1704532

PAR ID: 10160118

Author(s) / Creator(s): Meng, Yu; Huang, Jiaxin; Wang, Guangyuan; Wang, Zihan; Zhang, Chao; Zhang, Yu; Han, Jiawei

Date Published: 2020-04-19

Journal Name: WWW '20: The Web Conference 2020

Volume: 1

Issue: 1

Page Range / eLocation ID: 2121 to 2132

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3366423.3380278
