Efficiency Implications of Term Weighting for Passage Retrieval

Mackenzie, Joel; Dai, Zhuyun; Gallagher, Luke; Callan, Jamie

doi:10.1145/3397271.3401263

Citation Details

Efficiency Implications of Term Weighting for Passage Retrieval

Language model pre-training has spurred a great deal of attention for tasks involving natural language understanding, and has been successfully applied to many downstream tasks with impressive results. Within information retrieval, many of these solutions are too costly to stand on their own, requiring multi-stage ranking architectures. Recent work has begun to consider how to “backport” salient aspects of these computationally expensive models to previous stages of the retrieval pipeline. One such instance is DeepCT, which uses BERT to re-weight term importance in a given context at the passage level. This process, which is computed offline, results in an augmented inverted index with re-weighted term frequency values. In this work,we conduct an investigation of query processing efficiency over DeepCT indexes. Using a number of candidate generation algorithms, we reveal how term re-weighting can impact query processing latency, and explore how DeepCT can be used as a static index pruning technique to accelerate query processing without harming search effectiveness. more »

Award ID(s):: 1815528

NSF-PAR ID:: 10170070

Author(s) / Creator(s):: Mackenzie, Joel; Dai, Zhuyun; Gallagher, Luke; Callan, Jamie

Date Published:: 2020-07-27

Journal Name:: Proceedings of the 43nd International ACM SIGIR Conference on Research & Development in Information Retrieval

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
https://doi.org/10.1145/3397271.3401263

More Like this