Learning to Rank Entities for Set Expansion from Unstructured Data

Yu, Puxuan; Rahimi, Razieh; Huang, Zhiqi; Allan, James

doi:10.1145/3409256.3409811

Citation Details

Entity set expansion (ESE) refers to mining "siblings" of some user-provided seed entities from unstructured data. It has drawn increasing attention in the IR and NLP communities for its various applications. To the best of our knowledge, there has not been any work towards a supervised neural model for entity set expansion from unstructured data. We suspect that the main reason is the lack of massive annotated entity sets. To address this problem, we propose and implement a toolkit called DBpedia-Sets, which automatically extracts entity sets from any plain text collection and can provide a large amount of distant supervision data for neural model training. We propose a two-channel neural re-ranking model, NESE, that jointly learns exact and semantic matching of entity contexts. The former accepts entity-context co-occurrence information and the latter learns a non-linear transformer from generally pre-trained embeddings to ESE-task specific embeddings for entities. Experiments on real datasets of different scales from different domains show that NESE outperforms state-of-the-art approaches in terms of precision and MAP, where the improvements are statistically significant and are higher when the given corpus is larger.

Award ID(s): 1813662

PAR ID: 10228381

Author(s) / Creator(s): Yu, Puxuan; Rahimi, Razieh; Huang, Zhiqi; Allan, James

Date Published: 2020-09-14

Journal Name: Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2020)

Page Range / eLocation ID: 21 to 28

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3409256.3409811