Large Language Models (LLMs) are so powerful that they sometimes learn correlations between labels and features that are irrelevant to the task, leading to poor generalization on out-of-distribution data. We propose explanation-based finetuning as a general approach to mitigate LLMs’ reliance on spurious correlations. Unlike standard finetuning where the model only predicts the answer given the input, we finetune the model to additionally generate a free-text explanation supporting its answer. To evaluate our method, we finetune the model on artificially constructed training sets containing different types of spurious cues, and test it on a test set without these cues. Compared to standard finetuning, our method makes GPT-3 (davinci) remarkably more robust against spurious cues in terms of accuracy drop across four classification tasks: ComVE (+1.2), CREAK (+9.1), e-SNLI (+15.4), and SBIC (+6.5). The efficacy generalizes across multiple model families and scales, with greater gains for larger models. Finally, our method also works well with explanations generated by the model, implying its applicability to more datasets without human-written explanations.
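To make the contrast concrete, the sketch below shows how a single training example might be formatted for standard versus explanation-based finetuning. The field names and prompt template are illustrative assumptions, not the paper's exact format.

```python
# Minimal sketch of formatting one training example for standard vs.
# explanation-based finetuning. Field names and the prompt template are
# illustrative assumptions, not the paper's exact format.

def format_standard(example: dict) -> dict:
    """Target is the label alone."""
    return {
        "prompt": f"Input: {example['input']}\nAnswer:",
        "completion": f" {example['label']}",
    }

def format_with_explanation(example: dict) -> dict:
    """Target is a free-text explanation followed by the label."""
    return {
        "prompt": f"Input: {example['input']}\nExplain, then answer:",
        "completion": f" {example['explanation']} Therefore, the answer is {example['label']}.",
    }

if __name__ == "__main__":
    # Invented ComVE-style example, used only to show the two formats.
    ex = {
        "input": "He put an elephant into the fridge.",
        "label": "against common sense",
        "explanation": "An elephant is far too large to fit inside a fridge.",
    }
    print(format_standard(ex))
    print(format_with_explanation(ex))
```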
Are all models wrong? Falsifying binary formation models in gravitational-wave astronomy using exceptional events
ABSTRACT As the catalogue of gravitational-wave transients grows, several entries appear ‘exceptional’ within the population. Tipping the scales with a total mass of $\sim 150 \,{\rm M}_\odot$, GW190521 likely contained black holes in the pair-instability mass gap. The event GW190814, meanwhile, is unusual for its extreme mass ratio and the mass of its secondary component. A growing model-building industry has emerged to provide explanations for such exceptional events, and Bayesian model selection is frequently used to determine the most informative model. However, Bayesian methods can only take us so far. They provide no answer to the question: does our model provide an adequate explanation for exceptional events in the data? If none of the models we are testing provide an adequate explanation, then it is not enough to simply rank our existing models – we need new ones. In this paper, we introduce a method to answer this question with a frequentist p-value. We apply the method to different models that have been suggested to explain the unusually massive event GW190521: hierarchical mergers in active galactic nuclei and globular clusters. We show that some (but not all) of these models provide adequate explanations for exceptionally massive events like GW190521.
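A minimal sketch of the kind of frequentist adequacy check described above: simulate many mock catalogues from a candidate population model and ask how often the most massive simulated event is at least as massive as the observed one. The lognormal toy population, catalogue size, and 150 M⊙ threshold are placeholders, not the models or data analysed in the paper.

```python
# Toy sketch of a frequentist adequacy check for an "exceptional" event:
# under a candidate population model, how often does a simulated catalogue
# of N detections contain an event at least as extreme as the one observed?
# The lognormal toy population and the 150-solar-mass threshold are
# illustrative assumptions, not the models analysed in the paper.
import numpy as np

rng = np.random.default_rng(42)

def p_value_most_extreme(observed_max: float, n_events: int,
                         n_catalogues: int = 10_000) -> float:
    """Fraction of simulated catalogues whose maximum total mass
    reaches or exceeds the observed maximum."""
    exceed = 0
    for _ in range(n_catalogues):
        # Draw one mock catalogue of total masses (solar masses).
        masses = rng.lognormal(mean=np.log(60.0), sigma=0.4, size=n_events)
        if masses.max() >= observed_max:
            exceed += 1
    return exceed / n_catalogues

if __name__ == "__main__":
    p = p_value_most_extreme(observed_max=150.0, n_events=90)
    print(f"p-value for producing an event this massive: {p:.3f}")
    # A very small p suggests the model cannot adequately explain the event.
```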
- PAR ID: 10556402
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Monthly Notices of the Royal Astronomical Society
- Volume: 535
- Issue: 3
- ISSN: 0035-8711
- Format(s): Medium: X; Size: p. 2837-2843
- Sponsoring Org: National Science Foundation
More Like this
Abstract As catalogs of gravitational-wave transients grow, new records are set for the most extreme systems observed to date. The most massive observed black holes probe the physics of pair-instability supernovae while providing clues about the environments in which binary black hole systems are assembled. The least massive black holes, meanwhile, allow us to investigate the purported neutron star–black hole mass gap, and binaries with unusually asymmetric mass ratios or large spins inform our understanding of binary and stellar evolution. Existing outlier tests generally implement leave-one-out analyses, but these do not account for the fact that the event being left out was by definition an extreme member of the population. This results in a bias in the evaluation of outliers. We correct for this bias by introducing a coarse-graining framework to investigate whether these extremal events are true outliers or whether they are consistent with the rest of the observed population. Our method enables us to study extremal events while testing for population model misspecification. We show that this ameliorates biases present in the leave-one-out analyses commonly used within the gravitational-wave community. Applying our method to results from the second LIGO–Virgo transient catalog, we find qualitative agreement with the conclusions of Abbott et al.: GW190814 is an outlier because of its small secondary mass. We find that neither GW190412 nor GW190521 is an outlier.
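The toy simulation below illustrates the bias this abstract describes: because the event being tested is the most extreme of N detections, its naive single-event tail probability looks small even when the population model is exactly right, whereas comparing against the distribution of the maximum of N draws does not. This is only a schematic illustration, not the coarse-graining framework itself.

```python
# Toy demonstration of the bias in naive leave-one-out outlier tests:
# the most extreme of N draws is extreme by construction, so its naive
# tail probability under the population looks alarmingly small even when
# the population model is exactly correct. Comparing against the expected
# distribution of the maximum removes this bias. Purely illustrative.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
N = 50          # events per catalogue
trials = 5_000  # simulated catalogues

naive_p, corrected_p = [], []
for _ in range(trials):
    catalogue = rng.normal(loc=0.0, scale=1.0, size=N)
    x_max = catalogue.max()
    # Naive test: tail probability of a single draw exceeding the observed maximum.
    p_single = 1.0 - 0.5 * (1.0 + erf(x_max / sqrt(2.0)))
    # Corrected test: probability that the *maximum* of N draws exceeds it.
    p_max = 1.0 - (1.0 - p_single) ** N
    naive_p.append(p_single)
    corrected_p.append(p_max)

print(f"median naive p-value:     {np.median(naive_p):.4f}")      # misleadingly small
print(f"median corrected p-value: {np.median(corrected_p):.4f}")  # near 0.5, as expected
```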
Developing methods of automated inference that are able to provide users with compelling human-readable justifications for why the answer to a question is correct is critical for domains such as science and medicine, where user trust and detecting costly errors are limiting factors to adoption. One of the central barriers to training question answering models on explainable inference tasks is the lack of gold explanations to serve as training data. In this paper we present a corpus of explanations for standardized science exams, a recent challenge task for question answering. We manually construct a corpus of detailed explanations for nearly all publicly available standardized elementary science questions (approximately 1,680 3rd through 5th grade questions) and represent these as “explanation graphs”: sets of lexically overlapping sentences that describe how to arrive at the correct answer to a question through a combination of domain and world knowledge. We also provide an explanation-centered tablestore, a collection of semi-structured tables that contain the knowledge needed to construct these elementary science explanations. Together, these two knowledge resources map out a substantial portion of the knowledge required for answering and explaining elementary science exams, and provide both structured and free-text training data for the explainable inference task.
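As a rough illustration of the "explanation graph" representation, the sketch below links sentences that share content words. The example sentences, stopword list, and tokenization are invented for illustration and are not drawn from the corpus.

```python
# Minimal sketch of an "explanation graph": a set of sentences connected
# wherever they share content words. The example sentences are invented
# for illustration and are not drawn from the actual corpus.
from itertools import combinations

STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "or", "how"}

def content_words(sentence: str) -> set:
    """Lowercased tokens with simple punctuation stripping, minus stopwords."""
    return {w.lower().strip(".,") for w in sentence.split()} - STOPWORDS

def explanation_graph(sentences: list) -> list:
    """Return edges (i, j, shared_words) between lexically overlapping sentences."""
    edges = []
    for (i, s1), (j, s2) in combinations(enumerate(sentences), 2):
        shared = content_words(s1) & content_words(s2)
        if shared:
            edges.append((i, j, shared))
    return edges

if __name__ == "__main__":
    explanation = [
        "A thermometer is used to measure temperature.",
        "Temperature is a measure of how hot or cold something is.",
        "Measuring the temperature of water requires a thermometer.",
    ]
    for i, j, shared in explanation_graph(explanation):
        print(f"sentence {i} <-> sentence {j}: shared words {sorted(shared)}")
```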
Abstract GW190521 was the most massive black hole merger discovered by LIGO/Virgo so far, with masses in tension with stellar evolution models. A possible explanation for such heavy black holes is that they are themselves the remnants of previous mergers of lighter black holes. Here we estimate the masses of the ancestral black holes of GW190521, assuming it is the end product of previous mergers. We find that the heaviest parental black hole has a mass of M⊙ (90% credible level). We find a 70% probability that it lies in the 50 M⊙–120 M⊙ mass gap, indicating that it may also be the end product of a previous merger. We therefore also compute the expected mass distributions of the “grandparent” black holes of GW190521, assuming they existed. Ancestral black hole masses could represent an additional puzzle piece in identifying the origin of LIGO/Virgo/KAGRA’s heaviest black holes.
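A hedged back-of-the-envelope version of such a parental-mass estimate: if a remnant formed in a prior merger, its parents' total mass is roughly the remnant mass divided by (1 - f_rad), split according to some mass ratio. The remnant-mass samples, the 5 per cent radiated fraction, and the mass-ratio prior below are illustrative assumptions, not the paper's posterior or formation model.

```python
# Rough Monte Carlo sketch of "parental" mass estimation: if a remnant of
# mass M_r formed in a prior merger, its parents' total mass was roughly
# M_r / (1 - f_rad), split by some mass ratio q. The remnant-mass samples,
# the 5% radiated fraction, and the mass-ratio prior are all illustrative
# assumptions, not the paper's actual posterior or formation model.
import numpy as np

rng = np.random.default_rng(7)

def parental_masses(remnant_mass, f_rad=0.05):
    """Sample (heavier, lighter) parent masses for each remnant-mass sample."""
    q = rng.uniform(0.5, 1.0, size=np.shape(remnant_mass))  # assumed mass-ratio prior
    total = remnant_mass / (1.0 - f_rad)                     # undo radiated energy
    m1 = total / (1.0 + q)                                   # heavier parent
    m2 = q * m1                                              # lighter parent
    return m1, m2

if __name__ == "__main__":
    # Illustrative remnant-mass samples for the primary of a GW190521-like event.
    remnant = rng.normal(loc=85.0, scale=10.0, size=100_000)
    m1, _ = parental_masses(remnant)
    lo, med, hi = np.percentile(m1, [5, 50, 95])
    print(f"heavier parent mass: {med:.0f} (+{hi - med:.0f} / -{med - lo:.0f}) Msun")
    gap = np.mean((m1 > 50.0) & (m1 < 120.0))
    print(f"fraction of samples with the heavier parent in the 50-120 Msun gap: {gap:.2f}")
```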
Most of the work on interpretable machine learning has focused on designing either inherently interpretable models, which typically trade off accuracy for interpretability, or post-hoc explanation systems, whose explanation quality can be unpredictable. Our method, ExpO, is a hybridization of these approaches that regularizes a model for explanation quality at training time. Importantly, these regularizers are differentiable, model agnostic, and require no domain knowledge to define. We demonstrate that post-hoc explanations for ExpO-regularized models have better explanation quality, as measured by the common fidelity and stability metrics. We verify that improving these metrics leads to significantly more useful explanations with a user study on a realistic task.
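One way to read the differentiable regularizer idea is sketched below: sample a neighbourhood around each training input, fit a local linear surrogate in closed form, and penalize the surrogate's squared error on the model's own outputs. This is an interpretation in the spirit of a fidelity regularizer, not the authors' exact implementation, and all hyperparameters are placeholders.

```python
# Hedged sketch of a neighbourhood-fidelity regularizer: sample points
# around each input, fit a local linear surrogate in closed form, and
# penalize the surrogate's squared error on the model's own outputs.
# An interpretation of the idea, not ExpO's exact implementation.
import torch

def fidelity_regularizer(model, x, sigma=0.1, n_samples=32, ridge=1e-3):
    """Differentiable penalty: how poorly a local linear model explains
    `model` in a Gaussian neighbourhood of each point in the batch `x`."""
    batch, dim = x.shape
    # Perturbed neighbourhood around each input: (batch, n_samples, dim).
    noise = sigma * torch.randn(batch, n_samples, dim, device=x.device)
    neighbours = x.unsqueeze(1) + noise
    preds = model(neighbours.reshape(-1, dim)).reshape(batch, n_samples, -1)

    # Design matrix with a bias column: (batch, n_samples, dim + 1).
    ones = torch.ones(batch, n_samples, 1, device=x.device)
    design = torch.cat([neighbours, ones], dim=-1)

    # Ridge-regularized normal equations, solved per batch element.
    gram = design.transpose(1, 2) @ design
    gram = gram + ridge * torch.eye(dim + 1, device=x.device)
    coef = torch.linalg.solve(gram, design.transpose(1, 2) @ preds)

    # Penalty: mean squared error of the local linear surrogate.
    return ((design @ coef - preds) ** 2).mean()

if __name__ == "__main__":
    net = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh(),
                              torch.nn.Linear(16, 1))
    x = torch.randn(8, 4)
    task_loss = net(x).pow(2).mean()  # stand-in for the actual task loss
    loss = task_loss + 0.1 * fidelity_regularizer(net, x)
    loss.backward()
    print("total loss:", loss.item())
```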