This content will become publicly available on January 1, 2023

Title: The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider
We describe the outcome of a data challenge conducted as part of the Dark Machines initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims to detect signals of new physics at the Large Hadron Collider (LHC) using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to 10 fb$^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies. Code to reproduce the analysis is also provided.
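As a rough illustration of how an anomaly score can define a model-independent signal region, the sketch below fits a density model to background-only events, scores every event by how unlikely it is under that model, and places the signal-region cut at a fixed background efficiency. It is not the procedure or any of the algorithms used in the challenge: the Gaussian density model, the two toy features, the injected "signal" and the 1% background efficiency are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import numpy as np


def fit_gaussian_density(train):
    """Fit a single multivariate Gaussian to background-only events."""
    mean = train.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(train, rowvar=False))
    return mean, inv_cov


def anomaly_score(events, mean, inv_cov):
    """Squared Mahalanobis distance: larger means less background-like."""
    diff = events - mean
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)


def signal_region_threshold(background_scores, background_efficiency):
    """Score cut that keeps the given fraction of background events."""
    return np.quantile(background_scores, 1.0 - background_efficiency)


rng = np.random.default_rng(0)

# Toy stand-ins for simulated events (assumption: two generic features).
background_train = rng.normal(0.0, 1.0, size=(20000, 2))
background_test = rng.normal(0.0, 1.0, size=(20000, 2))
signal_test = rng.normal(4.0, 0.5, size=(500, 2))  # hypothetical new-physics signal

mean, inv_cov = fit_gaussian_density(background_train)
threshold = signal_region_threshold(
    anomaly_score(background_train, mean, inv_cov), background_efficiency=0.01
)

# The signal region is defined without reference to any signal model.
background_pass = anomaly_score(background_test, mean, inv_cov) > threshold
signal_pass = anomaly_score(signal_test, mean, inv_cov) > threshold

print(f"background efficiency: {background_pass.mean():.4f}")
print(f"signal efficiency:     {signal_pass.mean():.4f}")
```

The threshold is derived from background events alone, so the same cut can be applied whatever signal is eventually tested; only the efficiency comparison at the end uses the toy signal.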
Journal Name:
SciPost Physics
Sponsoring Org:
National Science Foundation
More Like this
  1. Kasieczka, Gregor ; Nachman, Benjamin ; Shih, David (Ed.)
    A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.
  2. A search for new physics with non-resonant signals in dielectron and dimuon final states in the mass range above 2 TeV is presented. This is the first search for non-resonant signals in dilepton final states at the LHC to use a background estimate from the data. The data, corresponding to an integrated luminosity of 139 fb$^{-1}$, were recorded by the ATLAS experiment in proton-proton collisions at a center-of-mass energy of $\sqrt{s} = 13$ TeV during Run 2 of the Large Hadron Collider. The benchmark signal signature is a two-quark and two-lepton contact interaction, which would enhance the dilepton event rate at the TeV mass scale. To model the contribution from background processes, a functional form is fit to the dilepton invariant-mass spectra in data in a mass region below the region of interest. It is then extrapolated to a high-mass signal region to obtain the expected background there. No significant deviation from the expected background is observed in the data. Upper limits at 95% CL on the number of events and the visible cross-section times branching fraction for processes involving new physics are provided. Observed (expected) 95% CL lower limits on the contact interaction energy scale reach 35.8 (37.6) TeV.
  3. A search for a light pseudoscalar Higgs boson (a) decaying from the 125 GeV (or a heavier) scalar Higgs boson (H) is performed using the 2016 LHC proton-proton collision data at $\sqrt{s} = 13$ TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$, collected by the CMS experiment. The analysis considers gluon fusion and vector boson fusion production of the H, followed by the decay H → aa → μμττ, and considers pseudoscalar masses in the range $3.6 < m_a < 21$ GeV. Because of the large mass difference between the H and the a bosons and the small masses of the a boson decay products, both the μμ and the ττ pairs have high Lorentz boost and are collimated. The ττ reconstruction efficiency is increased by modifying the standard technique for hadronic τ lepton decay reconstruction to account for a nearby muon. No significant signal is observed. Model-independent limits are set at 95% confidence level, as a function of $m_a$, on the branching fraction (ℬ) for H → aa → μμττ, down to 1.5 (2.0) × 10$^{-4}$ for $m_H$ = 125 (300) GeV. Model-dependent limits on ℬ(H → aa) are set within the context of two Higgs doublets plus singlet models, with the most stringent results obtained for Type-III models. These results extend current LHC searches for heavier a bosons that decay to resolved lepton pairs and provide the first such bounds for an H boson with a mass above 125 GeV.
  4. The results of a search for new phenomena in final states with b-jets and missing transverse momentum using 139 fb$^{-1}$ of proton-proton data collected at a centre-of-mass energy $\sqrt{s} = 13$ TeV by the ATLAS detector at the LHC are reported. The analysis targets final states produced by the decay of a pair-produced supersymmetric bottom squark into a bottom quark and a stable neutralino. The analysis also seeks evidence for models of pair production of dark matter particles produced through the decay of a generic scalar or pseudoscalar mediator state in association with a pair of bottom quarks, and models of pair production of scalar third-generation down-type leptoquarks. No significant excess of events over the Standard Model background expectation is observed in any of the signal regions considered by the analysis. Bottom squark masses below 1270 GeV are excluded at 95% confidence level if the neutralino is massless. In the case of nearly mass-degenerate bottom squarks and neutralinos, the use of dedicated secondary-vertex identification techniques permits the exclusion of bottom squarks with masses up to 660 GeV for mass splittings between the squark and the neutralino of 10 GeV. These limits extend substantially beyond the regions of parameter space excluded by similar ATLAS searches performed previously.
  5. A search is presented for new particles produced at the LHC in proton-proton collisions at $\sqrt{s} = 13$ TeV, using events with energetic jets and large missing transverse momentum. The analysis is based on a data sample corresponding to an integrated luminosity of 101 fb$^{-1}$, collected in 2017–2018 with the CMS detector. Machine learning techniques are used to define separate categories for events with narrow jets from initial-state radiation and events with large-radius jets consistent with a hadronic decay of a W or Z boson. A statistical combination is made with an earlier search based on a data sample of 36 fb$^{-1}$, collected in 2016. No significant excess of events is observed with respect to the standard model background expectation determined from control samples in data. The results are interpreted in terms of limits on the branching fraction of an invisible decay of the Higgs boson, as well as constraints on simplified models of dark matter, on first-generation scalar leptoquarks decaying to quarks and neutrinos, and on models with large extra dimensions. Several of the new limits, specifically for spin-1 dark matter mediators, pseudoscalar mediators, colored mediators, and leptoquarks, are the most restrictive to date.