

Title: SearchIE: A Retrieval Approach for Information Extraction
We address the problem of entity extraction from very few examples and approach it with information retrieval. Existing extraction approaches consider millions of features extracted from a large number of training cases, which are generally generated by distant supervision using entities in a knowledge base; a model is then learned and entities are extracted. However, with extremely limited data, a ranked list of relevant entities can be used to obtain user feedback and thereby gather more training data. As Information Retrieval (IR) is a natural choice for generating ranked lists, we explore its effectiveness in this limited-data setting. To this end, we propose SearchIE, a hybrid IR and NLP approach that indexes documents represented with handcrafted NLP features. At query time, SearchIE samples query terms from a logistic regression model trained on the extremely limited data. We show that SearchIE outperforms state-of-the-art NLP models at finding civilians killed by US police officers, even with a single civilian name as the example.
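As an illustration of the query-construction idea described in the abstract, the following is a minimal sketch (not the authors' implementation): a logistic regression model is fit on a handful of labeled documents, and query terms are then sampled in proportion to their positive model weights before being issued against the index. The tiny document set, the bag-of-words features standing in for the paper's handcrafted NLP features, and the exact sampling scheme are all illustrative assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical, tiny training set: documents containing the seed entity are positive.
docs = [
    "officer shot civilian john doe",
    "city council approves new budget",
    "police fatally shot a man during a traffic stop",
    "weather forecast calls for sunshine",
]
labels = [1, 0, 1, 0]

vec = CountVectorizer()                      # stands in for the handcrafted NLP features
X = vec.fit_transform(docs)
clf = LogisticRegression().fit(X, labels)

# Sample query terms with probability proportional to their positive model weights.
terms = vec.get_feature_names_out()
weights = np.clip(clf.coef_[0], 0.0, None)
probs = weights / weights.sum()
rng = np.random.default_rng(0)
query_terms = rng.choice(terms, size=3, replace=False, p=probs)

print("sampled query:", " ".join(query_terms))
# The sampled terms would then be issued as a query against the document index.
```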
Award ID(s):
1617408
PAR ID:
10175987
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the International Conference on the Theory of Information Retrieval (ICTIR '19)
Page Range / eLocation ID:
249 to 252
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Semantic relationships between a pair of entities in a sentence, such as hyponym–hypernym, cause–effect, and meronym–holonym, are usually reflected through syntactic patterns. Automatic extraction of such patterns benefits several downstream tasks, including entity extraction, ontology building, and question answering. Unfortunately, this extraction task has not yet received much attention from NLP and information retrieval researchers. In this work, we propose an attention-based supervised deep learning model, ASPER, which extracts syntactic patterns between entities exhibiting a given semantic relation in the sentential context. We validate the performance of ASPER on three distinct semantic relations (hyponym–hypernym, cause–effect, and meronym–holonym) across six datasets. Experimental results show that for all these semantic relations, ASPER can automatically identify a collection of syntactic patterns reflecting the existence of such a relation between a pair of entities in a sentence. In comparison to existing methodologies for syntactic pattern extraction, ASPER's performance is substantially superior.
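For intuition about what a syntactic pattern reflecting a semantic relation looks like, here is a deliberately simple sketch using one hand-written lexico-syntactic rule (a classic Hearst pattern for hyponym–hypernym pairs). ASPER learns such patterns with an attention-based neural model rather than with rules like this; the sentence and regular expression below are illustrative only.

```python
import re

# One hand-written lexico-syntactic pattern: "<hypernym> such as <hyponym>".
HEARST = re.compile(r"(\w+) such as (\w+)")

sentence = "Mammals such as dolphins are highly social."
match = HEARST.search(sentence)
if match:
    hypernym, hyponym = match.group(1), match.group(2)
    print(f"hyponym-hypernym pair: ({hyponym!r}, {hypernym!r})")
# ASPER's goal is to discover such patterns automatically, in sentential context,
# rather than relying on a fixed list of hand-written rules.
```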
  2. There is an urgent need for ready access to published data for advances in materials design, and natural language processing (NLP) techniques offer a promising solution for extracting relevant information from scientific publications. In this paper, we present a domain-specific approach utilizing a Transformer-based model, T5, to automate the generation of sample lists in the field of polymer nanocomposites (PNCs). Leveraging large-scale corpora, we employ advanced NLP techniques including named entity recognition and relation extraction to accurately extract sample codes, compositions, group references, and properties from PNC papers. The T5 model demonstrates competitive performance in relation extraction using a TANL framework and an EM-style input sequence. Furthermore, we explore multi-task learning and joint-entity-relation extraction to enhance efficiency and address deployment concerns. Our proposed methodology, from corpora generation to model training, showcases the potential of structured knowledge extraction from publications in PNC research and beyond. 
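As a rough illustration of the TANL-style input/output formatting mentioned above, the sketch below builds an "augmented natural language" target string in which entity spans are bracketed with their types and relations. The markup, entity types, and example sentence are assumptions made for illustration; the paper's exact schema may differ.

```python
def tanl_style_target(sentence, entities, relations):
    """Bracket each entity span with its type and its relations to other spans.

    entities:  list of (span, entity_type)
    relations: list of (head_span, relation, tail_span)
    """
    out = sentence
    for span, etype in entities:
        rels = [f"{rel} = {tail}" for head, rel, tail in relations if head == span]
        markup = " | ".join([etype] + rels)
        out = out.replace(span, f"[ {span} | {markup} ]", 1)
    return out

sentence = "The epoxy matrix was filled with 5 wt% silica nanoparticles."
entities = [("epoxy", "MATRIX"), ("silica nanoparticles", "FILLER"), ("5 wt%", "COMPOSITION")]
relations = [("silica nanoparticles", "amount", "5 wt%")]

print("input :", "extract samples: " + sentence)
print("target:", tanl_style_target(sentence, entities, relations))
# target: The [ epoxy | MATRIX ] matrix was filled with [ 5 wt% | COMPOSITION ]
#         [ silica nanoparticles | FILLER | amount = 5 wt% ].
```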
  3. This work introduces TrialSieve, a novel framework for biomedical information extraction that enhances clinical meta-analysis and drug repurposing. By extending traditional PICO (Patient, Intervention, Comparison, Outcome) methodologies, TrialSieve incorporates hierarchical, treatment-group-based graphs, enabling more comprehensive and quantitative comparisons of clinical outcomes. TrialSieve was used to annotate 1609 PubMed abstracts, yielding 170,557 annotations and 52,638 final spans across 20 unique annotation categories that capture a diverse range of biomedical entities relevant to systematic reviews and meta-analyses. The performance (accuracy, precision, recall, F1-score) of four natural language processing (NLP) models (BioLinkBERT, BioBERT, KRISSBERT, PubMedBERT) and the large language model (LLM) GPT-4o was evaluated on the human-annotated TrialSieve dataset. BioLinkBERT had the best accuracy (0.875) and recall (0.679) for biomedical entity labeling, whereas PubMedBERT had the best precision (0.614) and F1-score (0.639). Error analysis showed that NLP models trained on noisy, human-annotated data can match or, in most cases, surpass human performance. This finding highlights the feasibility of fully automating biomedical information extraction, even when relying on imperfectly annotated datasets. An annotator user study (n = 39) revealed significant (p < 0.05) gains in efficiency and human annotation accuracy with the unique TrialSieve tree-based annotation approach. In summary, TrialSieve provides a foundation for improving automated biomedical information extraction for front-end clinical research.
  4.
    Open-domain keyphrase extraction (KPE) on the Web is a fundamental yet complex NLP task with a wide range of practical applications within the field of Information Retrieval. In contrast to other document types, web page designs are intended for easy navigation and information finding. Effective designs encode, within their layout and formatting, signals that point to where the important information can be found. In this work, we propose a modeling approach that leverages these multi-modal signals to aid the KPE task. In particular, we leverage both lexical and visual features (e.g., size, font, position) at the micro level to enable effective strategy induction, and meta-level features that describe pages at a macro level to aid strategy selection. Our evaluation demonstrates that combining effective strategy induction and strategy selection within this approach outperforms state-of-the-art models on the KPE task. A qualitative post-hoc analysis illustrates how these features function within the model.
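Below is a minimal sketch of the kind of multi-modal candidate representation the abstract describes, combining lexical signals with visual/layout signals such as font size and on-page position. The feature set, weights, and scoring function are hypothetical; the paper induces and selects extraction strategies from such signals rather than hand-setting them.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    phrase: str
    tf: int          # lexical: term frequency on the page
    font_px: float   # visual: rendered font size
    y_pos: float     # visual: vertical position (0.0 = top of page, 1.0 = bottom)
    bold: bool       # visual: bold formatting

def score(c: Candidate) -> float:
    # Hand-set weights for illustration only; a learned model would combine these signals.
    return 1.0 * c.tf + 0.2 * c.font_px - 2.0 * c.y_pos + 1.5 * float(c.bold)

candidates = [
    Candidate("open-domain keyphrase extraction", tf=3, font_px=24.0, y_pos=0.05, bold=True),
    Candidate("privacy policy", tf=1, font_px=10.0, y_pos=0.95, bold=False),
]
for c in sorted(candidates, key=score, reverse=True):
    print(f"{score(c):5.2f}  {c.phrase}")
```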
  5. Abstract. Background: Natural language processing (NLP) tasks in the health domain often deal with limited amounts of labeled data due to high annotation costs and naturally rare observations. To compensate for the lack of training data, health NLP researchers often have to leverage knowledge and resources external to the task at hand. Recently, pretrained large-scale language models such as the Bidirectional Encoder Representations from Transformers (BERT) have proven to be a powerful way of learning rich linguistic knowledge from massive unlabeled text and transferring that knowledge to downstream tasks. However, previous downstream tasks often used training data at a scale that is unlikely to be available in the health domain. In this work, we study whether BERT can still benefit downstream tasks when training data are relatively small in the context of health NLP. Method: We conducted a learning curve analysis to study the behavior of BERT and baseline models as training data size increases. We observed the classification performance of these models on two disease diagnosis data sets, where some diseases are naturally rare and have very limited observations (fewer than 2 out of 10,000). The baselines included commonly used text classification models such as sparse and dense bag-of-words models, long short-term memory networks, and variants of these that leveraged external knowledge. To obtain learning curves, we increased the number of training examples per disease from small to large and measured classification performance in macro-averaged F1 score. Results: On the task of classifying all diseases, the learning curves of BERT were consistently above all baselines, significantly outperforming them across the spectrum of training data sizes. But under extreme situations where only one or two training documents per disease were available, BERT was outperformed by linear classifiers with carefully engineered bag-of-words features. Conclusion: As long as the number of training documents is not extremely small, fine-tuning a pretrained BERT model is a highly effective approach to health NLP tasks such as disease classification. However, in extreme cases where each class has only one or two training documents and no more will be available, simple linear models using bag-of-words features should be considered.
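A minimal sketch of the learning-curve protocol described in the Method section above: sample an increasing number of training documents per class, train a classifier, and record macro-averaged F1 on a fixed test set. The TF-IDF plus logistic regression classifier is only a stand-in for the BERT and bag-of-words models compared in the paper.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def learning_curve(train_texts, train_labels, test_texts, test_labels,
                   sizes=(1, 2, 5, 10, 50, 100), seed=0):
    """Macro-F1 as a function of the number of training documents per class."""
    rng = np.random.default_rng(seed)
    train_labels = np.asarray(train_labels)
    scores = {}
    for n in sizes:
        # Sample up to n training documents for each class.
        idx = []
        for c in np.unique(train_labels):
            pool = np.flatnonzero(train_labels == c)
            idx.extend(rng.choice(pool, size=min(n, pool.size), replace=False))
        # A TF-IDF + logistic regression classifier stands in for the paper's models.
        vec = TfidfVectorizer()
        clf = LogisticRegression(max_iter=1000)
        clf.fit(vec.fit_transform([train_texts[i] for i in idx]), train_labels[idx])
        pred = clf.predict(vec.transform(test_texts))
        scores[n] = f1_score(test_labels, pred, average="macro")
    return scores
```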