NSF PAR Search | NSF Public Access Repository


Embedding-Driven Multi-Dimensional Topic Mining and Text Analysis

https://doi.org/10.1145/3394486.3406483

Meng, Yu; Huang, Jiaxin; Han, Jiawei (August 2020, KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)
Full Text Available
Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

https://doi.org/10.1145/3394486.3403242

Meng, Yu; Zhang, Yunyi; Huang, Jiaxin; Zhang, Yu; Zhang, Chao; Han, Jiawei (July 2020, KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)
Full Text Available
AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types

https://doi.org/10.1145/3394486.3403323

Dong, Xin Luna; He, Xiang; Kan, Andrey; Li, Xian; Liang, Yan; Ma, Jun; Xu, Yifan Ethan; Zhang, Chenwei; Zhao, Tong; Blanco Saldana, Gabriel; et al. (July 2020, KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)
Full Text Available
CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring

https://doi.org/10.1145/3394486.3403244

Huang, Jiaxin; Xie, Yiqing; Meng, Yu; Zhang, Yunyi; Han, Jiawei (July 2020, KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)
Full Text Available
When Do GNNs Work: Understanding and Improving Neighborhood Aggregation

https://doi.org/10.24963/ijcai.2020/181

Xie, Yiqing; Li, Sha; Yang, Carl; Wong, Raymond Chi-Wing; Han, Jiawei (July 2020, IJCAI'20: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020)
Graph Neural Networks (GNNs) have been shown to be powerful in a wide range of graph-related tasks. While there exist various GNN models, a critical common ingredient is neighborhood aggregation, where the embedding of each node is updated by referring to the embeddings of its neighbors. This paper aims to provide a better understanding of this mechanism by asking the following question: Is neighborhood aggregation always necessary and beneficial? In short, the answer is no. We carve out two conditions under which neighborhood aggregation is not helpful: (1) when a node's neighbors are highly dissimilar and (2) when a node's embedding is already similar to that of its neighbors. We propose novel metrics that quantitatively measure these two circumstances and integrate them into an Adaptive-layer module. Our experiments show that allowing for node-specific aggregation degrees has a significant advantage over current GNNs.
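As a rough illustration of the neighborhood aggregation this abstract refers to, the sketch below averages each node's embedding with those of its neighbors for a single round. It is a generic mean aggregator written for this listing, not the paper's Adaptive-layer module or its proposed metrics; the function name `mean_aggregate`, the toy graph, and the choice to include the node's own embedding in the average are all assumptions made for the example.

```python
from typing import Dict, List


def mean_aggregate(
    embeddings: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """One round of mean neighborhood aggregation (illustrative only)."""
    updated = {}
    for node, vec in embeddings.items():
        # Assumption: the node's own embedding joins the average,
        # so a node with no neighbors keeps its embedding unchanged.
        group = [vec] + [embeddings[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(v[i] for v in group) / len(group) for i in range(len(vec))
        ]
    return updated


# Hypothetical toy graph: "a" sees "b" and "c", "b" sees "a", "c" is isolated.
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": []}
result = mean_aggregate(embeddings, neighbors)
```

In this toy setup the isolated node "c" is left unchanged, while "a" and "b" are pulled toward their neighbors, which is the kind of smoothing whose usefulness the abstract questions.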
Full Text Available
Joint Aspect-Sentiment Analysis with Minimal User Guidance

https://doi.org/10.1145/3397271.3401179

Zhuang, Honglei; Guo, Fang; Zhang, Chao; Liu, Liyuan; Han, Jiawei (July 2020, Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, July 25-30, 2020)
Full Text Available
NetTaxo: Automated Topic Taxonomy Construction from Text-Rich Network

https://doi.org/10.1145/3366423.3380259

Shang, Jingbo; Zhang, Xinyang; Liu, Liyuan; Li, Sha; Han, Jiawei (April 2020, WWW '20: The Web Conference 2020)

The automated construction of topic taxonomies can benefit numerous applications, including web search, recommendation, and knowledge discovery. One of the major advantages of automatic taxonomy construction is the ability to capture corpus-specific information and adapt to different scenarios. To better reflect the characteristics of a corpus, we take the meta-data of documents into consideration and view the corpus as a text-rich network. In this paper, we propose NetTaxo, a novel automatic topic taxonomy construction framework, which goes beyond the existing paradigm and allows text data to collaborate with network structure. Specifically, we learn term embeddings from both text and network as contexts. Network motifs are adopted to capture appropriate network contexts. We conduct an instance-level selection for motifs, which further refines term embedding according to the granularity and semantics of each taxonomy node. Clustering is then applied to obtain sub-topics under a taxonomy node. Extensive experiments on two real-world datasets demonstrate the superiority of our method over the state-of-the-art, and further verify the effectiveness and importance of instance-level motif selection.
Full Text Available
TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network

https://doi.org/10.1145/3366423.3380132

Shen, Jiaming; Shen, Zhihong; Xiong, Chenyan; Wang, Chi; Wang, Kuansan; Han, Jiawei (April 2020, WWW '20: The Web Conference 2020)

Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications. For example, online retailers (e.g., Amazon and eBay) use taxonomies for product recommendation, and web search engines (e.g., Google and Bing) leverage taxonomies to enhance query understanding. Enormous efforts have been made on constructing taxonomies either manually or semi-automatically. However, with the fast-growing volume of web content, existing taxonomies will become outdated and fail to capture emerging knowledge. Therefore, in many applications, dynamic expansions of an existing taxonomy are in great demand. In this paper, we study how to expand an existing taxonomy by adding a set of new concepts. We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of ⟨query concept, anchor concept⟩ pairs from the existing taxonomy as training data. Using such self-supervision data, TaxoExpan learns a model to predict whether a query concept is the direct hyponym of an anchor concept. We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data. Extensive experiments on three large-scale datasets from different domains demonstrate both the effectiveness and the efficiency of TaxoExpan for taxonomy expansion.
Full Text Available
Discriminative Topic Mining via Category-Name Guided Text Embedding

https://doi.org/10.1145/3366423.3380278

Meng, Yu; Huang, Jiaxin; Wang, Guangyuan; Wang, Zihan; Zhang, Chao; Zhang, Yu; Han, Jiawei (April 2020, WWW '20: The Web Conference 2020)

Mining a set of meaningful and distinctive topics automatically from massive text corpora has broad applications. Existing topic models, however, typically work in a purely unsupervised way, which often generates topics that do not fit users’ particular needs and yields suboptimal performance on downstream tasks. We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora. This new task not only helps a user understand clearly and distinctively the topics he/she is most interested in, but also directly benefits keyword-driven classification tasks. We develop CatE, a novel category-name guided text embedding method for discriminative topic mining, which effectively leverages minimal user guidance to learn a discriminative embedding space and discover category representative terms in an iterative manner. We conduct a comprehensive set of experiments to show that CatE mines a high-quality set of topics guided by category names only, and benefits a variety of downstream applications including weakly-supervised classification and lexical entailment direction identification.
Full Text Available
Unsupervised Word Embedding Learning by Incorporating Local and Global Contexts

https://doi.org/10.3389/fdata.2020.00009

Meng, Yu; Huang, Jiaxin; Wang, Guangyuan; Wang, Zihan; Zhang, Chao; Han, Jiawei (March 2020, Frontiers in Big Data)
Full Text Available
