In high energy physics (HEP), analysis metadata comes in many forms—from theoretical cross-sections, to calibration corrections, to details about file processing. Correctly applying metadata is a crucial and often time-consuming step in an analysis, but designing analysis metadata systems has historically received little direct attention. Among other considerations, an ideal metadata tool should be easy to use by new analysers, should scale to large data volumes and diverse processing paradigms, and should enable future analysis reinterpretation. This document, which is the product of community discussions organised by the HEP Software Foundation, categorises types of metadata by scope and format and gives examples of current metadata solutions. Important design considerations for metadata systems, including sociological factors, analysis preservation efforts, and technical factors, are discussed. A list of best practices and technical requirements for future analysis metadata systems is presented. These best practices could guide the development of a future cross-experimental effort for analysis metadata tools.
-
Abstract The accurate simulation of additional interactions at the ATLAS experiment for the analysis of proton–proton collisions delivered by the Large Hadron Collider presents a significant challenge to the computing resources. During the LHC Run 2 (2015–2018), there were up to 70 inelastic interactions per bunch crossing, which need to be accounted for in Monte Carlo (MC) production. In this document, a new method to account for these additional interactions in the simulation chain is described. Instead of sampling the inelastic interactions and adding their energy deposits to a hard-scatter interaction one-by-one, the inelastic interactions are presampled, independent of the hard scatter, and stored as combined events. Consequently, for each hard-scatter interaction, only one such presampled event needs to be added as part of the simulation chain. For the Run 2 simulation chain, with an average of 35 interactions per bunch crossing, this new method provides a substantial reduction in MC production CPU needs of around 20%, while reproducing the properties of the reconstructed quantities relevant for physics analyses with good accuracy.
Free, publicly-accessible full text available December 1, 2023
-
Abstract The ATLAS experiment at the Large Hadron Collider has a broad physics programme ranging from precision measurements to direct searches for new particles and new interactions, requiring ever larger and ever more accurate datasets of simulated Monte Carlo events. Detector simulation with Geant4 is accurate but requires significant CPU resources. Over the past decade, ATLAS has developed and utilized tools that replace the most CPU-intensive component of the simulation, the calorimeter shower simulation, with faster simulation methods. Here, AtlFast3, the next generation of high-accuracy fast simulation in ATLAS, is introduced. AtlFast3 combines parameterized approaches with machine-learning techniques and is deployed to meet the current and future computing challenges and simulation needs of the ATLAS experiment. With highly accurate performance and significantly improved modelling of substructure within jets, AtlFast3 can simulate large numbers of events for a wide range of physics processes.
Free, publicly-accessible full text available December 1, 2023
-
Abstract A search is presented for a heavy W′ boson resonance decaying to a B or T vector-like quark and a t or a b quark, respectively. The analysis is performed using proton-proton collisions collected with the CMS detector at the LHC. The data correspond to an integrated luminosity of 138 fb⁻¹ at a center-of-mass energy of 13 TeV. Both decay channels result in a signature with a t quark, a Higgs or Z boson, and a b quark, each produced with a significant Lorentz boost. The all-hadronic decays of the Higgs or Z boson and of the t quark are selected using jet substructure techniques to reduce standard model backgrounds, resulting in a distinct three-jet W′ boson decay signature. No significant deviation in data with respect to the standard model background prediction is observed. Upper limits are set at 95% confidence level on the product of the W′ boson cross section and the final state branching fraction. A W′ boson with a mass below 3.1 TeV is excluded, given the benchmark model assumption of democratic branching fractions. In addition, limits are set based on generalizations of these assumptions. These are the most sensitive limits to date…
Free, publicly-accessible full text available September 1, 2023