AutoName: A Corpus-Based Set Naming Framework

Huang, Zhiqi; Rahimi, Razieh; Yu, Puxuan; Shang, Jingbo; Allan, James

doi:10.1145/3404835.3463100

Inferring the set name of semantically grouped entities is useful in many tasks related to natural language processing and information retrieval. Previous studies mainly draw names from knowledge bases to ensure high quality, but that limits the candidate scope. We propose an unsupervised framework, AutoName, that exploits large-scale text corpora to name a set of query entities. Specifically, it first extracts hypernym phrases as candidate names from query-related documents via probing a pre-trained language model. A hierarchical density-based clustering is then applied to form potential concepts for these candidate names. Finally, AutoName ranks candidates and picks the top one as the set name based on constituents of the phrase and the semantic similarity of their concepts. We also contribute a new benchmark dataset for this task, consisting of 130 entity sets with name labels. Experimental results show that AutoName generates coherent and meaningful set names and significantly outperforms all compared methods. Further analyses show that AutoName is able to offer explanations for extracted names using the sentences most relevant to the corresponding concept.
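The three stages named in the abstract (candidate extraction, concept formation, ranking) can be illustrated with a toy sketch. This is not the authors' implementation, and every function name in it is hypothetical. It makes three loud simplifications: a "such as" text pattern stands in for probing a pre-trained language model, grouping phrases by their last word stands in for hierarchical density-based clustering, and a support count stands in for the paper's ranking by phrase constituents and concept similarity.

```python
import re
from collections import Counter, defaultdict


def extract_candidates(sentences, entities):
    """Collect (candidate phrase, sentence) pairs using a "such as" pattern.

    Assumption: this pattern replaces the language-model probing
    described in the abstract.
    """
    wanted = {e.lower() for e in entities}
    pairs = []
    for sent in sentences:
        left, sep, right = sent.lower().partition(" such as ")
        if not sep:
            continue  # sentence does not contain the pattern
        mentioned = set(re.findall(r"[a-z0-9+#]+", right))
        if not (mentioned & wanted):
            continue  # no query entity among the listed examples
        words = left.split()
        if not words:
            continue
        # Keep at most the last two words before "such as" as the phrase.
        pairs.append((" ".join(words[-2:]), sent))
    return pairs


def group_by_head(pairs):
    """Group candidates into rough concepts keyed by the phrase's last word.

    Assumption: this replaces hierarchical density-based clustering.
    """
    concepts = defaultdict(list)
    for phrase, sent in pairs:
        concepts[phrase.split()[-1]].append((phrase, sent))
    return concepts


def name_set(sentences, entities):
    """Return the chosen name and the sentences that support it."""
    pairs = extract_candidates(sentences, entities)
    if not pairs:
        return None, []
    concepts = group_by_head(pairs)
    # Pick the concept with the most supporting sentences, then its
    # most frequent phrase (a simple stand-in for the paper's ranking).
    best = max(concepts.values(), key=len)
    phrase, _ = Counter(p for p, _ in best).most_common(1)[0]
    evidence = [s for p, s in best if p == phrase]
    return phrase, evidence
```

Calling `name_set(sentences, ["Python", "Java", "Rust"])` on a handful of sentences returns a candidate name together with the sentences it was drawn from, which mirrors, in a very reduced form, the explanation facility the abstract mentions.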

Award ID(s): 1813662

PAR ID: 10276024

Author(s) / Creator(s): Huang, Zhiqi; Rahimi, Razieh; Yu, Puxuan; Shang, Jingbo; Allan, James

Date Published: 2021-07-11

Journal Name: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '21)

Page Range / eLocation ID: 2101 to 2105

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3404835.3463100