Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer

Jiang, Zhengbao; Gao, Luyu; Wang, Zhiruo; Araki, Jun; Ding, Haibo; Callan, Jamie; Neubig, Graham

doi:10.18653/v1/2022.emnlp-main.149

Systems for knowledge-intensive tasks such as open-domain question answering (QA) usually consist of two stages: efficient retrieval of relevant documents from a large corpus and detailed reading of the selected documents. This is usually done through two separate models, a retriever that encodes the query and finds nearest neighbors, and a reader based on Transformers. These two components are usually modeled separately, which necessitates a cumbersome implementation and is awkward to optimize in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs retrieval as attention (RAA), and end-to-end training solely based on supervision from the end QA task. We demonstrate for the first time that an end-to-end trained single Transformer can achieve both competitive retrieval and QA performance on in-domain datasets, matching or even slightly outperforming state-of-the-art dense retrievers and readers. Moreover, end-to-end adaptation of our model significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings, making our model a simple and adaptable end-to-end solution for knowledge-intensive tasks.

Award ID(s): 1815528

PAR ID: 10479608

Author(s) / Creator(s): Jiang, Zhengbao; Gao, Luyu; Wang, Zhiruo; Araki, Jun; Ding, Haibo; Callan, Jamie; Neubig, Graham

Publisher / Repository: Association for Computational Linguistics

Date Published: 2022-01-01

Page Range / eLocation ID: 2336 to 2349

Format(s): Medium: X

Location: Abu Dhabi, United Arab Emirates

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.18653/v1/2022.emnlp-main.149
