skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Efficiency Implications of Term Weighting for Passage Retrieval
Language model pre-training has spurred a great deal of attention for tasks involving natural language understanding, and has been successfully applied to many downstream tasks with impressive results. Within information retrieval, many of these solutions are too costly to stand on their own, requiring multi-stage ranking architectures. Recent work has begun to consider how to “backport” salient aspects of these computationally expensive models to previous stages of the retrieval pipeline. One such instance is DeepCT, which uses BERT to re-weight term importance in a given context at the passage level. This process, which is computed offline, results in an augmented inverted index with re-weighted term frequency values. In this work,we conduct an investigation of query processing efficiency over DeepCT indexes. Using a number of candidate generation algorithms, we reveal how term re-weighting can impact query processing latency, and explore how DeepCT can be used as a static index pruning technique to accelerate query processing without harming search effectiveness.  more » « less
Award ID(s):
1815528
PAR ID:
10170070
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the 43nd International ACM SIGIR Conference on Research & Development in Information Retrieval
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Many content-based image search and instance retrieval systems implement bag-of-visual-words strategies for candidate selection. Visual processing of an image results in hundreds of visual words that make up a document, and these words are used to build an inverted index. Query processing then consists of an initial candidate selection phase that queries the inverted index, followed by more complex reranking of the candidates using various image features. The initial phase typically uses disjunctive top-k query processing algorithms originally proposed for searching text collections. Our objective in this paper is to optimize the performance of disjunctive top-k computation for candidate selection in content-based instance retrieval systems. While there has been extensive previous work on optimizing this phase for textual search engines, we are unaware of any published work that studies this problem for instance retrieval, where both index and query data are quite different from the distributions commonly found and exploited in the textual case. Using data from a commercial large-scale instance retrieval system, we address this challenge in three steps. First, we analyze the quantitative properties of index structures and queries in the system, and discuss how they differ from the case of text retrieval. Second, we describe an optimized term-at-a-time retrieval strategy that significantly outperforms baseline term-at-a-time and document-at-a-time strategies, achieving up to 66% speed-up over the most efficient baseline. Finally, we show that due to the different properties of the data, several common safe and unsafe early termination techniques from the literature fail to provide any significant performance benefits. 
    more » « less
  2. An inverted index is the basic data structure used in most current large-scale information retrieval systems. It can be modeled as a collection of sorted sequences of integers. Many compression techniques for inverted indexes have been studied in the past, with some of them reaching tremendous decompression speeds through the use of SIMD instructions available on modern CPUs. While there has been some work on query processing algorithms for Graphics Processing Units (GPUs), little of it has focused on how to efficiently access compressed index structures, and we see some potential for significant improvements in decompression speed. In this paper, we describe and implement two encoding schemes for index decompression on GPU architectures. Their format and decoding algorithm is adapted from existing CPU-based compression methods to exploit the execution model and memory hierarchy offered by GPUs. We show that our solutions, GPU-BP and GPU-VByte, achieve significant speedups over their already carefully optimized CPU counterparts. 
    more » « less
  3. To accelerate software development, much research has been performed to help people understand and reuse the huge amount of available code resources. Two important tasks have been widely studied: code retrieval, which aims to retrieve code snippets relevant to a given natural language query from a code base, and code annotation, where the goal is to annotate a code snippet with a natural language description. Despite their advancement in recent years, the two tasks are mostly explored separately. In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called “CoaCor”), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others. To this end, we propose an effective framework based on reinforcement learning, which explicitly encourages the code annotation model to generate annotations that can be used for the retrieval task. Through extensive experiments, we show that code annotations generated by our framework are much more detailed and more useful for code retrieval, and they can further improve the performance of existing code retrieval models significantly. 
    more » « less
  4. Retrieval-augmented generation (RAG) systems can effectively address user queries by leveraging indexed document corpora to retrieve the relevant contexts. Ranking techniques have been adopted in RAG systems to sort the retrieved contexts by their relevance to the query so that users can select the most useful contexts for their downstream tasks. While many existing ranking methods rely on the similarity between the embedding vectors of the context and query to measure relevance, it is important to note that similarity does not equate to relevance in all scenarios. Some ranking methods use large language models (LLMs) to rank the contexts by putting the query and the candidate contexts in the prompt and asking LLM about their relevance. The scalability of those methods is contingent on the number of candidate contexts and the context window of those LLMs. Also, those methods require fine-tuning the LLMs, which can be computationally expensive and require domain-related data. In this work, we propose a scalable ranking framework that does not involve LLM training. Our framework uses an off-the-shelf LLM to hypothesize the user's query based on the retrieved contexts and ranks the contexts based on the similarity between the hypothesized queries and the user query. Our framework is efficient at inference time and is compatible with many other context retrieval and ranking techniques. Experimental results show that our method improves the ranking performance of retrieval systems in multiple benchmarks. 
    more » « less
  5. Static index pruning techniques remove postings from inverted index structures in order to decrease index size and query processing cost, while minimizing the resulting loss in result quality. A number of authors have proposed pruning techniques that use basic properties of postings as well as results of past queries to decide what postings should be kept. However, many open questions remain, and our goal is to address some of them using a machine learning based approach that tries to predict the usefulness of a posting. In this paper, we explore the following questions: (1) How much does an approach that learns from a rich set of features outperform previous work that uses heuristic approaches or just a few features? (2) What is the relationship between index size and query processing speed in static index pruning? We show that an approach that prunes postings using a rich set of features including post-hits and doc-hits can significantly outperform previous approaches, and that there is a very pronounced trade-off between index size and query processing speed for static index pruning that has not been previously explored. 
    more » « less