Title: Modeling hadronization using machine learning
First-principle simulations are at the heart of the high-energy physics research program. They link the vast data output of multi-purpose detectors with fundamental theory predictions and interpretation. This review illustrates a wide range of applications of modern machine learning to event generation and simulation-based inference, including conceptual developments driven by the specific requirements of particle physics. New ideas and tools developed at the interface of particle physics and machine learning will improve the speed and precision of forward simulations, handle the complexity of collision data, and enhance inference as an inverse simulation problem.
Award ID(s):
2103889
PAR ID:
10440877
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
arxiv.org
Date Published:
Journal Name:
SciPost Physics
Volume:
14
Issue:
3
ISSN:
2542-4653
Page Range / eLocation ID:
027
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  2. Abstract In many scientific fields that rely on statistical inference, simulations are often used to map from theoretical models to experimental data, allowing scientists to test model predictions against experimental results. Experimental data is often reconstructed from indirect measurements, causing the aggregate transformation from theoretical models to experimental data to be poorly described analytically. Instead, numerical simulations are used at great computational cost. We introduce Optimal-Transport-based Unfolding and Simulation (OTUS), a fast simulator based on unsupervised machine learning that is capable of predicting experimental data from theoretical models. Without the aid of current simulation information, OTUS trains a probabilistic autoencoder to transform directly between theoretical models and experimental data. Identifying the probabilistic autoencoder's latent space with the space of theoretical models causes the decoder network to become a fast, predictive simulator with the potential to replace current, computationally costly simulators. Here, we provide proof-of-principle results on two particle physics examples, Z-boson and top-quark decays, but stress that OTUS can be widely applied to other fields.
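The optimal-transport idea behind OTUS can be illustrated in one dimension, where the optimal transport map between two distributions reduces to empirical quantile matching. The sketch below is a hypothetical toy (the actual OTUS method trains a neural probabilistic autoencoder; the Gaussian "theory" and "data" distributions here are invented stand-ins): each theory-level sample is mapped to the data-level sample of equal rank.

```python
# Toy 1D optimal-transport map between a "theory" and a "data" sample,
# a hypothetical stand-in for the OTUS idea (which uses a trained
# probabilistic autoencoder, not quantile matching).
import random
import statistics

random.seed(0)

# "Theory-level" samples: a standard normal stand-in.
theory = [random.gauss(0.0, 1.0) for _ in range(20000)]
# "Detector-level" samples: a shifted, rescaled normal stand-in.
data = [random.gauss(2.0, 0.5) for _ in range(20000)]

def ot_map_1d(source, target):
    """Map each source sample to the target sample of equal rank;
    in 1D this monotone rearrangement is the optimal transport map."""
    src_order = sorted(range(len(source)), key=lambda i: source[i])
    tgt_sorted = sorted(target)
    mapped = [0.0] * len(source)
    for rank, i in enumerate(src_order):
        mapped[i] = tgt_sorted[rank]
    return mapped

simulated = ot_map_1d(theory, data)
# The transported sample now follows the "data" distribution.
print(statistics.mean(simulated), statistics.stdev(simulated))
```

Because the mapped sample is a rearrangement of the target sample, its distribution matches the "data" distribution by construction; the learned, high-dimensional analogue of this map is what makes the decoder a fast surrogate simulator.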
  3. Abstract Data analyses in particle physics rely on an accurate simulation of particle collisions and a detailed simulation of detector effects to extract physics knowledge from the recorded data. Event generators together with a Geant-based simulation of the detectors are used to produce large samples of simulated events for analysis by the LHC experiments. These simulations come at a high computational cost, where the detector simulation and reconstruction algorithms have the largest CPU demands. This article describes how machine-learning (ML) techniques are used to reweight simulated samples obtained with a given set of parameters to samples with different parameters or samples obtained from entirely different simulation programs. The ML reweighting method avoids the need for simulating the detector response multiple times by incorporating the relevant information in a single sample through event weights. Results are presented for reweighting to model variations and higher-order calculations in simulated top quark pair production at the LHC. This ML-based reweighting is an important element of the future computing model of the CMS experiment and will facilitate precision measurements at the High-Luminosity LHC.
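The core of sample reweighting is the per-event weight given by the density ratio between the target and nominal distributions. The sketch below is a hedged toy, not the CMS implementation (which learns the ratio with ML): here the nominal and variation distributions are invented Gaussians with known densities, so the weight can be computed analytically, and reweighting recovers the variation without resimulating.

```python
# Toy sketch of event-weight reweighting between a "nominal" and a
# "variation" sample (hypothetical Gaussians; the CMS method learns the
# density ratio with ML rather than computing it analytically).
import math
import random

random.seed(1)

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Nominal sample, e.g. one generator parameter setting.
nominal = [random.gauss(0.0, 1.0) for _ in range(50000)]
# Per-event weight = variation density / nominal density, so the single
# (expensively detector-simulated) sample can stand in for the variation.
weights = [gauss_pdf(x, 0.5, 1.0) / gauss_pdf(x, 0.0, 1.0) for x in nominal]

wsum = sum(weights)
weighted_mean = sum(w * x for w, x in zip(weights, nominal)) / wsum
print(weighted_mean)  # close to the variation mean
```

The weighted nominal sample reproduces expectation values under the variation, which is exactly what lets the expensive detector simulation run only once.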
  4. Abstract We introduce the DaRk mattEr and Astrophysics with Machine learning and Simulations (DREAMS) project, an innovative approach to understanding the astrophysical implications of alternative dark matter (DM) models and their effects on galaxy formation and evolution. The DREAMS project will ultimately comprise thousands of cosmological hydrodynamic simulations that simultaneously vary over DM physics, astrophysics, and cosmology in modeling a range of systems—from galaxy clusters to ultra-faint satellites. Such extensive simulation suites can provide adequate training sets for machine-learning-based analyses. This paper introduces two new cosmological hydrodynamical suites of warm dark matter (WDM), each comprising 1024 simulations generated using the arepo code. One suite consists of uniform-box simulations covering a (25 h⁻¹ Mpc)³ volume, while the other consists of Milky Way zoom-ins with sufficient resolution to capture the properties of classical satellites. For each simulation, the WDM particle mass is varied along with the initial density field and several parameters controlling the strength of baryonic feedback within the IllustrisTNG model. We provide two examples, separately utilizing emulators and convolutional neural networks, to demonstrate how such simulation suites can be used to disentangle the effects of DM and baryonic physics on galactic properties. The DREAMS project can be extended further to include different DM models, galaxy formation physics, and astrophysical targets. In this way, it will provide an unparalleled opportunity to characterize uncertainties on predictions for small-scale observables, leading to robust predictions for testing the particle physics nature of DM on these scales.
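An emulator of the kind mentioned above replaces repeated expensive simulations with a cheap fit over the sampled parameter space. The sketch below is a hypothetical one-parameter toy (the DREAMS emulators are far richer; `run_simulation`, the linear response, and all numbers here are invented): a surrogate is fit to a batch of "simulations" varying the WDM particle mass, then queried at a new mass without rerunning anything.

```python
# Toy emulator sketch, a hypothetical stand-in for the DREAMS idea:
# fit a linear surrogate mapping one simulation parameter (say, the
# WDM particle mass) to a summary statistic, then predict at unseen
# parameter values without running another expensive simulation.
import random
import statistics

random.seed(2)

def run_simulation(m_wdm):
    # Pretend expensive simulation; the statistic responds linearly
    # to the mass in this invented example, plus small noise.
    return 3.0 * m_wdm + 0.5 + random.gauss(0.0, 0.05)

masses = [random.uniform(1.0, 10.0) for _ in range(200)]
summaries = [run_simulation(m) for m in masses]

# Closed-form least-squares fit for slope and intercept.
mx = statistics.mean(masses)
my = statistics.mean(summaries)
slope = sum((m - mx) * (s - my) for m, s in zip(masses, summaries)) / \
        sum((m - mx) ** 2 for m in masses)
intercept = my - slope * mx

def emulate(m_wdm):
    """Cheap surrogate evaluation replacing a full simulation."""
    return slope * m_wdm + intercept

print(emulate(5.0))
```

Disentangling DM from baryonic effects then amounts to fitting such surrogates jointly over DM and feedback parameters and inspecting which parameters drive a given observable.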
  5. Lin, Weiwei; Jia, Zhen; Hunold, Sascha; Kang, Guoxin (Eds.)
    The pursuit of understanding fundamental particle interactions has reached unparalleled precision levels. Particle physics detectors play a crucial role in generating low-level object signatures that encode collision physics. However, simulating these particle collisions is computationally and memory intensive, which will be exacerbated by larger data volumes, more complex detectors, and a higher pileup environment in the High-Luminosity Large Hadron Collider. The introduction of Fast Simulation has been pivotal in overcoming computational and memory bottlenecks. The use of deep generative models has sparked a surge of interest in surrogate modeling for detector simulations, generating particle showers that closely resemble the observed data. Nonetheless, there is a pressing need for a comprehensive evaluation of the performance of such generative models using a standardized set of metrics. In this study, we conducted a rigorous evaluation of three generative models using standard datasets and a diverse set of metrics derived from physics, computer vision, and statistics. Furthermore, we explored the impact of using full versus mixed precision modes during inference. Our evaluation revealed that the CaloDiffusion and CaloScore generative models demonstrate the most accurate simulation of particle showers, yet there remains substantial room for improvement. Our findings identified where the evaluated models fell short in accurately replicating Geant4 data.
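One common statistics-derived metric for comparing a generated shower observable against the Geant4 reference is the 1D Wasserstein-1 distance, which for equal-size samples reduces to the mean absolute difference of sorted values. The sketch below is a hedged toy (the reference and generated Gaussians are invented stand-ins for shower observables, not the paper's datasets or full metric suite): a well-matched generator scores a smaller distance than a biased one.

```python
# Toy sketch of a distribution-level evaluation metric: the 1D
# Wasserstein-1 distance between a reference sample (standing in for
# Geant4 showers) and two "generated" samples (invented stand-ins,
# one well matched and one biased).
import random

random.seed(3)

def wasserstein_1d(a, b):
    """W1 between equal-size empirical samples: mean |difference|
    of the sorted values (the 1D quantile-coupling formula)."""
    assert len(a) == len(b)
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

reference = [random.gauss(0.0, 1.0) for _ in range(10000)]
good_gen = [random.gauss(0.0, 1.0) for _ in range(10000)]   # well matched
bad_gen = [random.gauss(1.0, 1.0) for _ in range(10000)]    # biased by +1

print(wasserstein_1d(reference, good_gen),
      wasserstein_1d(reference, bad_gen))
```

Applying such a metric per physics observable (energy fractions, shower widths, layer profiles) is one way a standardized evaluation can rank generative models and localize where they deviate from Geant4.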