Title: The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider
We describe the outcome of a data challenge conducted as part of the Dark Machines (https://www.darkmachines.org) initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims to detect signals of new physics at the Large Hadron Collider (LHC) using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to $$10\,\text{fb}^{-1}$$ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
Award ID(s):
2019786
PAR ID:
10323045
Journal Name:
SciPost Physics
Volume:
12
Issue:
1
ISSN:
2542-4653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Kasieczka, Gregor; Nachman, Benjamin; Shih, David (Ed.)
    A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders. 
  2. This paper presents a model-agnostic search for narrow resonances in the dijet final state in the mass range 1.8–6 TeV. The signal is assumed to produce jets with substructure atypical of jets initiated by light quarks or gluons, with minimal additional assumptions. Search regions are obtained by utilizing multivariate machine-learning methods to select jets with anomalous substructure. A collection of complementary anomaly detection methods, based on unsupervised, weakly supervised, and semisupervised algorithms, is used in order to maximize the sensitivity to unknown new physics signatures. These algorithms are applied to data corresponding to an integrated luminosity of $$138\,\text{fb}^{-1}$$, recorded by the CMS experiment at the LHC, at a center-of-mass energy of 13 TeV. No significant excesses above background expectations are seen. Exclusion limits are derived on the production cross section of benchmark signal models varying in resonance mass, jet mass, and jet substructure. Many of these signatures have not been previously sought, making several of the limits reported on the corresponding benchmark models the first ever. When compared to benchmark inclusive and substructure-based search strategies, the anomaly detection methods are found to significantly enhance the sensitivity to a variety of models.
  3. The CERN LHC provided proton and heavy ion collisions during its Run 2 operation period from 2015 to 2018. Proton-proton collisions reached a peak instantaneous luminosity of $$2.1\times 10^{34}\,\text{cm}^{-2}\,\text{s}^{-1}$$, twice the initial design value, at $$\sqrt{s}=13$$ TeV. The CMS experiment records a subset of the collisions for further processing as part of its online selection of data for physics analyses, using a two-level trigger system: the Level-1 trigger, implemented in custom-designed electronics, and the high-level trigger, a streamlined version of the offline reconstruction software running on a large computer farm. This paper presents the performance of the CMS high-level trigger system during LHC Run 2 for physics objects, such as leptons, jets, and missing transverse momentum, which meet the broad needs of the CMS physics program and the challenge of the evolving LHC and detector conditions. Sophisticated algorithms that were originally used in offline reconstruction were deployed online. Highlights include a machine-learning b tagging algorithm and a reconstruction algorithm for tau leptons that decay hadronically.
  4. The production of heavy neutral mass resonances, $$\text{Z}^{\prime}$$, has been widely studied theoretically and experimentally. Although the nature, mass, couplings, and associated quantum numbers of this hypothetical particle are yet to be determined, current LHC experimental results have set strong constraints assuming the simplest beyond Standard Model (SM) hypotheses. We present a new feasibility study on the production of a $$\text{Z}^{\prime}$$ boson at the LHC, with family non-universal couplings, considering proton-proton collisions at $$\sqrt{s} = 13$$ and 14 TeV. Such a hypothesis is well motivated theoretically and it can explain observed differences between SM predictions and experimental results, as well as being a useful tool to further probe recent results in searches for new physics considering non-universal fermion couplings. We work under two simplified phenomenological frameworks where the $$\text{Z}^{\prime}$$ masses and couplings to the SM particles are free parameters, and consider final states of the $$\text{Z}^{\prime}$$ decaying to a pair of $$\text{b}$$ quarks. The analysis is performed using machine learning techniques to maximize the sensitivity. Despite being a well motivated physics case on its own merit, such scenarios have not been fully considered in ongoing searches at the LHC. We note the proposed search methodology can be a key mode for discovery over a large mass range, including low masses, traditionally considered difficult due to experimental constraints. In addition, the proposed search is complementary to existing strategies.
  5. This report presents a comprehensive collection of searches for new physics performed by the ATLAS Collaboration during the Run 2 period of data taking at the Large Hadron Collider, from 2015 to 2018, corresponding to about $$140\,\text{fb}^{-1}$$ of $$\sqrt{s}=13$$ TeV proton-proton collision data. These searches cover a variety of beyond-the-standard model topics such as dark matter candidates, new vector bosons, hidden-sector particles, leptoquarks, or vector-like quarks, among others. Searches for supersymmetric particles or extended Higgs sectors are explicitly excluded as these are the subject of separate reports by the Collaboration. For each topic, the most relevant searches are described, focusing on their importance and sensitivity and, when appropriate, highlighting the experimental techniques employed. In addition to the description of each analysis, complementary searches are compared, and the overall sensitivity of the ATLAS experiment to each type of new physics is discussed. Summary plots and statistical combinations of multiple searches are included whenever possible.