COIL: Revisit exact lexical match in information retrieval with contextualized inverted list

Gao, Luyu; Dai, Zhuyun; Callan, Jamie

Citation Details

Classical information retrieval systems such as BM25 rely on exact lexical match and carry out search efficiently with an inverted list index. Recent neural IR models shift towards soft semantic matching of all query-document terms, but they lose the computational efficiency of exact match systems. This paper presents COIL, a contextualized exact match retrieval architecture that brings semantics to lexical matching. COIL scoring is based on the contextualized representations of overlapping query-document tokens. The new architecture stores contextualized token representations in inverted lists, bringing together the efficiency of exact match and the representation power of deep language models. Our experimental results show COIL outperforms classical lexical retrievers and state-of-the-art deep LM retrievers with similar or smaller latency.
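
The idea of scoring only over overlapping tokens, using vectors stored in inverted lists, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`build_index`, `score`), the hand-written 2-dimensional vectors standing in for deep LM outputs, and the aggregation rule (for each query token, take the best dot product among the document's occurrences of that same token, then sum over query tokens), which the abstract does not spell out.

```python
from collections import defaultdict


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_index(docs):
    """Build a toy contextualized inverted list.

    docs: {doc_id: [(token, vector), ...]}
    returns: {token: [(doc_id, vector), ...]}
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for token, vec in tokens:
            index[token].append((doc_id, vec))
    return index


def score(query, index):
    """Score documents using only tokens that exactly match a query token.

    query: [(token, vector), ...]
    returns: {doc_id: score}
    Assumed rule: max similarity per query token, summed over query tokens.
    """
    totals = defaultdict(float)
    for token, q_vec in query:
        best = {}  # best match per document for this query token
        for doc_id, d_vec in index.get(token, []):
            s = dot(q_vec, d_vec)
            if doc_id not in best or s > best[doc_id]:
                best[doc_id] = s
        for doc_id, s in best.items():
            totals[doc_id] += s
    return dict(totals)


# Hypothetical vectors: "bank" gets a different vector in each context.
docs = {
    "d1": [("bank", [1.0, 0.0]), ("river", [0.0, 1.0])],
    "d2": [("bank", [0.0, 1.0]), ("loan", [1.0, 0.0])],
}
index = build_index(docs)
query = [("bank", [0.9, 0.1]), ("loan", [1.0, 0.0])]
scores = score(query, index)
```

Only the inverted lists for the query's own tokens are visited, which is where the efficiency of exact match comes from; the stored vectors let the same surface token score differently depending on its context.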

Award ID(s): 1815528

PAR ID: 10273594

Author(s) / Creator(s): Gao, Luyu; Dai, Zhuyun; Callan, Jamie

Date Published: 2021-06-06

Journal Name: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Page Range / eLocation ID: 3030-3042

Format(s): Medium: X

Sponsoring Org: National Science Foundation

