Large language models (LLMs) encode parametric knowledge about world facts and have shown remarkable performance in knowledge-driven NLP tasks. However, their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g., knowledge acquisition tasks). In this paper, we seek to assess and enhance LLMs’ contextual faithfulness in two aspects: knowledge conflict and prediction with abstention. We demonstrate that LLMs’ faithfulness can be significantly improved using carefully designed prompting strategies. In particular, we identify opinion-based prompts and counterfactual demonstrations as the most effective methods. Opinion-based prompts reframe the context as a narrator’s statement and inquire about the narrator’s opinions, while counterfactual demonstrations use instances containing false facts to improve faithfulness in knowledge conflict situations. Neither technique requires additional training. We conduct experiments on three datasets of two standard NLP tasks, machine reading comprehension and relation extraction, and the results demonstrate significant improvement in faithfulness to contexts. Code and data are released at https://github.com/wzhouad/context-faithful-llm.
more »
« less
Entity-Based Knowledge Conflicts in Question Answering
Knowledge-dependent tasks typically use two sources of knowledge: parametric, learned at training time, and contextual, given as a passage at inference time. To understand how models use these sources together, we formalize the problem of knowledge conflicts, where the contextual information contradicts the learned information. Analyzing the behaviour of popular models, we measure their over-reliance on memorized information (the cause of hallucinations), and uncover important factors that exacerbate this behaviour. Lastly, we propose a simple method to mitigate over-reliance on parametric knowledge, which minimizes hallucination, and improves out-of-distribution generalization by 4% - 7%. Our findings demonstrate the importance for practitioners to evaluate model tendency to hallucinate rather than read, and show that our mitigation strategy encourages generalization to evolving information (i.e. time-dependent queries). To encourage these practices, we have released our framework for generating knowledge conflicts.
more »
« less
- Award ID(s):
- 1817183
- PAR ID:
- 10462823
- Date Published:
- Journal Name:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Page Range / eLocation ID:
- 7052 to 7063
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Background People’s health-related knowledge influences health outcomes, as this knowledge may influence whether individuals follow advice from their doctors or public health agencies. Yet, little attention has been paid to where people obtain health information and how these information sources relate to the quality of knowledge. Objective We aim to discover what information sources people use to learn about health conditions, how these sources relate to the quality of their health knowledge, and how both the number of information sources and health knowledge change over time. Methods We surveyed 200 different individuals at 12 time points from March through September 2020. At each time point, we elicited participants’ knowledge about causes, risk factors, and preventative interventions for 8 viral (Ebola, common cold, COVID-19, Zika) and nonviral (food allergies, amyotrophic lateral sclerosis [ALS], strep throat, stroke) illnesses. Participants were further asked how they learned about each illness and to rate how much they trust various sources of health information. Results We found that participants used different information sources to obtain health information about common illnesses (food allergies, strep throat, stroke) compared to emerging illnesses (Ebola, common cold, COVID-19, Zika). Participants relied mainly on news media, government agencies, and social media for information about emerging illnesses, while learning about common illnesses from family, friends, and medical professionals. Participants relied on social media for information about COVID-19, with their knowledge accuracy of COVID-19 declining over the course of the pandemic. The number of information sources participants used was positively correlated with health knowledge quality, though there was no relationship with the specific source types consulted. Conclusions Building on prior work on health information seeking and factors affecting health knowledge, we now find that people systematically consult different types of information sources by illness type and that the number of information sources people use affects the quality of individuals’ health knowledge. Interventions to disseminate health information may need to be targeted to where individuals are likely to seek out information, and these information sources differ systematically by illness type.more » « less
-
Entity bias widely affects pretrained (large) language models, causing them to rely on (biased) parametric knowledge to make unfaithful predictions. Although causality-inspired methods have shown great potential to mitigate entity bias, it is hard to precisely estimate the parameters of underlying causal models in practice. The rise of black-box LLMs also makes the situation even worse, because of their inaccessible parameters and uncalibrated logits. To address these problems, we propose a specific structured causal model (SCM) whose parameters are comparatively easier to estimate. Building upon this SCM, we propose causal intervention techniques to mitigate entity bias for both white-box and black-box settings. The proposed causal intervention perturbs the original entity with neighboring entities. This intervention reduces specific biasing information pertaining to the original entity while still preserving sufficient semantic information from similar entities. Under the white-box setting, our training-time intervention improves OOD performance of PLMs on relation extraction (RE) and machine reading comprehension (MRC) by 5.7 points and by 9.1 points, respectively. Under the black-box setting, our in-context intervention effectively reduces the entity-based knowledge conflicts of GPT-3.5, achieving up to 20.5 points of improvement of exact match accuracy on MRC and up to 17.6 points of reduction in memorization ratio on RE.more » « less
-
We study a nonparametric contextual bandit problem in which the expected reward functions belong to a Hölder class with smoothness parameter β. We show how this interpolates between two extremes that were previously studied in isolation: nondifferentiable bandits (β at most 1), with which rate-optimal regret is achieved by running separate noncontextual bandits in different context regions, and parametric-response bandits (infinite [Formula: see text]), with which rate-optimal regret can be achieved with minimal or no exploration because of infinite extrapolatability. We develop a novel algorithm that carefully adjusts to all smoothness settings, and we prove its regret is rate-optimal by establishing matching upper and lower bounds, recovering the existing results at the two extremes. In this sense, our work bridges the gap between the existing literature on parametric and nondifferentiable contextual bandit problems and between bandit algorithms that exclusively use global or local information, shedding light on the crucial interplay of complexity and regret in contextual bandits.more » « less
-
By design, large language models (LLMs) are static general-purpose models, expensive to retrain or update frequently. As they are increasingly adopted for knowledge-intensive tasks, it becomes evident that these design choices lead to failures to generate factual, relevant, and up-to-date knowledge. To this end, we propose Knowledge Card, a modular framework to plug in new factual and relevant knowledge into general-purpose LLMs. We first introduce knowledge cards---specialized language models trained on corpora from specific domains and sources. Knowledge cards serve as parametric repositories that are selected at inference time to generate background knowledge for the base LLM. We then propose three content selectors to dynamically select and retain information in documents generated by knowledge cards, specifically controlling for relevance, brevity, and factuality of outputs. Finally, we propose two complementary integration approaches to augment the base LLM with the (relevant, factual) knowledge curated from the specialized LMs. Through extensive experiments, we demonstrate that Knowledge Card achieves state-of-the-art performance on six benchmark datasets. Ultimately, Knowledge Card framework enables dynamic synthesis and updates of knowledge from diverse domains. Its modularity will ensure that relevant knowledge can be continuously updated through the collective efforts of the research community.more » « less
An official website of the United States government

