ISD-QA: Iterative Distillation of Commonsense Knowledge from General Language Models for Unsupervised Question Answering

Ramamurthy, Priyadharsini; Aakur, Sathyanarayanan

Commonsense question answering has primarily been tackled through supervised transfer learning, where a language model pre-trained on large amounts of data is used as the starting point. While successful, the approach requires large amounts of labeled question-answer pairs, with increasingly large amounts of data required as the complexity of scenarios or tasks such as commonsense QA increases. In this paper, we hypothesize that large-scale pre-training of language models encodes the necessary commonsense knowledge to answer common questions in context without labeled data. We propose a novel framework called Iterative Self Distillation for QA (ISD-QA), which extracts the “dark knowledge” encoded during large-scale pre-training of language models to provide supervision for commonsense question answering. We show that the approach can be used to train common neural QA models for commonsense question answering by distilling knowledge from language models in an unsupervised manner. With no bells and whistles, we achieve an average of 68% of the performance of fully supervised QA models while requiring no labeled training data. Extensive experiments on three public benchmarks (OpenBookQA, HellaSwag, and CommonsenseQA) show the effectiveness of the proposed approach.
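The abstract does not spell out the training objective. As a rough illustration of the general idea of distilling soft labels ("dark knowledge") from a teacher language model into a student QA model, and not the authors' actual ISD-QA procedure, a minimal sketch might look like the following. The scores, the temperature value, and the function names are all hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(student_logits, teacher_probs):
    """Cross-entropy of the student's distribution against soft teacher labels."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(p) for t, p in zip(teacher_probs, student_probs))


# Hypothetical teacher scores for four answer choices to one question,
# e.g. log-likelihoods assigned by a pre-trained language model (assumed).
teacher_scores = [-2.1, -0.4, -3.0, -1.7]

# Softened pseudo-labels: no human annotation is involved at this step.
soft_labels = softmax(teacher_scores, temperature=2.0)

# Hypothetical logits from a student QA model for the same four choices.
student_logits = [0.2, 1.5, -0.3, 0.1]

# The student would be trained to reduce this loss.
loss = soft_cross_entropy(student_logits, soft_labels)
```

In this sketch the soft labels keep the teacher's relative preferences among the wrong answers, which is the information a hard argmax label would discard.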

Award ID(s): 2143150; 1955230

PAR ID: 10347250

Author(s) / Creator(s): Ramamurthy, Priyadharsini; Aakur, Sathyanarayanan

Date Published: 2022-10-01

Journal Name: International Conference on Pattern Recognition

ISSN: 1051-4651

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
The DOI is not currently available.
