Title: Are all models wrong? Falsifying binary formation models in gravitational-wave astronomy using exceptional events
ABSTRACT As the catalogue of gravitational-wave transients grows, several entries appear ‘exceptional’ within the population. Tipping the scales with a total mass of $$\sim 150 \,{\rm M}_\odot$$, GW190521 likely contained black holes in the pair-instability mass gap. The event GW190814, meanwhile, is unusual for its extreme mass ratio and the mass of its secondary component. A growing model-building industry has emerged to provide explanations for such exceptional events, and Bayesian model selection is frequently used to determine the most informative model. However, Bayesian methods can only take us so far. They provide no answer to the question: does our model provide an adequate explanation for exceptional events in the data? If none of the models we are testing provide an adequate explanation, then it is not enough to simply rank our existing models – we need new ones. In this paper, we introduce a method to answer this question with a frequentist p-value. We apply the method to different models that have been suggested to explain the unusually massive event GW190521: hierarchical mergers in active galactic nuclei and globular clusters. We show that some (but not all) of these models provide adequate explanations for exceptionally massive events like GW190521.
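The abstract does not spell out how the p-value is constructed, so the following is only a generic sketch of one way such an adequacy test could look, not the authors' method. It assumes the test statistic is the largest mass in a catalogue, and it uses a hypothetical truncated power-law mass model. The function names, the power-law form, and all parameter values are illustrative assumptions.

```python
import random


def sample_power_law(alpha, m_min, m_max, rng):
    """Draw one mass from p(m) proportional to m**-alpha on [m_min, m_max].

    Uses inverse-CDF sampling; valid for alpha != 1.
    """
    u = rng.random()
    a = 1.0 - alpha
    return (m_min**a + u * (m_max**a - m_min**a)) ** (1.0 / a)


def max_mass_p_value(observed_max, n_events, alpha, m_min, m_max,
                     n_sims=2000, seed=0):
    """Monte Carlo p-value for the most massive event in a catalogue.

    Simulates n_sims catalogues of n_events masses under the assumed model
    and returns the fraction whose maximum mass is at least observed_max.
    The +1 terms keep the estimate away from exactly zero.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sims):
        sim_max = max(sample_power_law(alpha, m_min, m_max, rng)
                      for _ in range(n_events))
        if sim_max >= observed_max:
            exceed += 1
    return (exceed + 1) / (n_sims + 1)


if __name__ == "__main__":
    # Illustrative numbers only; these are not measured values.
    p = max_mass_p_value(observed_max=85.0, n_events=50,
                         alpha=2.3, m_min=5.0, m_max=100.0)
    print(f"p-value for the most massive event: {p:.4f}")
```

A small p-value would indicate that the assumed model rarely produces an event as massive as the observed one, which is a statement about the adequacy of that model alone and not a ranking against alternatives.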
Award ID(s):
2146528
PAR ID:
10556402
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Monthly Notices of the Royal Astronomical Society
Volume:
535
Issue:
3
ISSN:
0035-8711
Format(s):
Medium: X Size: p. 2837-2843
Size(s):
p. 2837-2843
Sponsoring Org:
National Science Foundation
More Like this
  1. Large Language Models (LLMs) are so powerful that they sometimes learn correlations between labels and features that are irrelevant to the task, leading to poor generalization on out-of-distribution data. We propose explanation-based finetuning as a general approach to mitigate LLMs’ reliance on spurious correlations. Unlike standard finetuning where the model only predicts the answer given the input, we finetune the model to additionally generate a free-text explanation supporting its answer. To evaluate our method, we finetune the model on artificially constructed training sets containing different types of spurious cues, and test it on a test set without these cues. Compared to standard finetuning, our method makes GPT-3 (davinci) remarkably more robust against spurious cues in terms of accuracy drop across four classification tasks: ComVE (+1.2), CREAK (+9.1), e-SNLI (+15.4), and SBIC (+6.5). The efficacy generalizes across multiple model families and scales, with greater gains for larger models. Finally, our method also works well with explanations generated by the model, implying its applicability to more datasets without human-written explanations. 
  2. Abstract As catalogs of gravitational-wave transients grow, new records are set for the most extreme systems observed to date. The most massive observed black holes probe the physics of pair-instability supernovae while providing clues about the environments in which binary black hole systems are assembled. The least massive black holes, meanwhile, allow us to investigate the purported neutron star–black hole mass gap, and binaries with unusually asymmetric mass ratios or large spins inform our understanding of binary and stellar evolution. Existing outlier tests generally implement leave-one-out analyses, but these do not account for the fact that the event being left out was by definition an extreme member of the population. This results in a bias in the evaluation of outliers. We correct for this bias by introducing a coarse-graining framework to investigate whether these extremal events are true outliers or whether they are consistent with the rest of the observed population. Our method enables us to study extremal events while testing for population model misspecification. We show that this ameliorates biases present in the leave-one-out analyses commonly used within the gravitational-wave community. Applying our method to results from the second LIGO–Virgo transient catalog, we find qualitative agreement with the conclusions of Abbott et al. GW190814 is an outlier because of its small secondary mass. We find that neither GW190412 nor GW190521 is an outlier. 
  3. Developing methods of automated inference that are able to provide users with compelling human-readable justifications for why the answer to a question is correct is critical for domains such as science and medicine, where user trust and detecting costly errors are limiting factors to adoption. One of the central barriers to training question answering models on explainable inference tasks is the lack of gold explanations to serve as training data. In this paper we present a corpus of explanations for standardized science exams, a recent challenge task for question answering. We manually construct a corpus of detailed explanations for nearly all publicly available standardized elementary science questions (approximately 1,680 3rd through 5th grade questions) and represent these as “explanation graphs”: sets of lexically overlapping sentences that describe how to arrive at the correct answer to a question through a combination of domain and world knowledge. We also provide an explanation-centered tablestore, a collection of semi-structured tables that contain the knowledge to construct these elementary science explanations. Together, these two knowledge resources map out a substantial portion of the knowledge required for answering and explaining elementary science exams, and provide both structured and free-text training data for the explainable inference task.
  4. Abstract GW190521 was the most massive black hole merger discovered by LIGO/Virgo so far, with masses in tension with stellar evolution models. A possible explanation of such heavy black holes is that they themselves are the remnants of previous mergers of lighter black holes. Here we estimate the masses of the ancestral black holes of GW190521, assuming it is the end product of previous mergers. We find that the heaviest parental black hole has a mass of $$56^{+20}_{-18}\,{\rm M}_\odot$$ (90% credible level). We find 70% probability that it is in the $$50\,{\rm M}_\odot$$–$$120\,{\rm M}_\odot$$ mass gap, indicating that it may also be the end product of a previous merger. We therefore also compute the expected mass distributions of the “grandparent” black holes of GW190521, assuming they existed. Ancestral black hole masses could represent an additional puzzle piece in identifying the origin of LIGO/Virgo/KAGRA’s heaviest black holes.
  5. Abstract We propose a Bayesian inference framework to predict the merger history of LIGO-Virgo binary black holes (BHs), whose binary components may have undergone hierarchical mergers in the past. The framework relies on numerical relativity predictions for the mass, spin, and kick velocity of the remnant BHs. This proposed framework computes the masses, spins, and kicks imparted to the remnant of the parent binaries, given the initial masses and spin magnitudes of the binary constituents. We validate our approach by performing an “injection study” based on a constructed sequence of hierarchically formed binaries. Noise is added to the final binary in the sequence, and the parameters of the “parent” and “grandparent” binaries in the merger chain are then reconstructed. This method is then applied to three GWTC-3 events: GW190521, GW200220_061928, and GW190426_190642. These events were selected because at least one of the binary companions lies in the putative pair-instability supernova mass gap, in which stellar processes alone cannot produce BHs. Hierarchical mergers offer a natural explanation for the formation of BHs in the pair-instability mass gap. We use the backward evolution framework to predict the parameters of the parents of the primary companion of these three binaries. For instance, the parent binary of GW190521 has masses $$72^{+32}_{-22}\,{\rm M}_\odot$$ and $$31^{+24}_{-23}\,{\rm M}_\odot$$ within the 90% credible interval. Astrophysical environments with escape speeds $$\geq 100\,{\rm km\,s}^{-1}$$ are preferred sites to host these events. Our approach can be readily applied to future high-mass gravitational wave events to predict their formation history under the hierarchical merger assumption.