
Title: Constraining effective field theories with machine learning
An important part of the Large Hadron Collider (LHC) legacy will be precise limits on indirect effects of new physics, framed for instance in terms of an effective field theory. These measurements often involve many theory parameters and observables, which makes them challenging for traditional analysis methods. We discuss the underlying problem of “likelihood-free” inference and present powerful new analysis techniques that combine physics insights, statistical methods, and the power of machine learning. We have developed MadMiner, a new Python package that makes it straightforward to apply these techniques. In example LHC problems we show that the new approach lets us put stronger constraints on theory parameters than established methods, demonstrating its potential to improve the new physics reach of the LHC legacy measurements. While we present techniques optimized for particle physics, the likelihood-free inference formulation is much more general, and these ideas are part of a broader movement that is changing scientific inference in fields as diverse as cosmology, genetics, and epidemiology.
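The techniques in the paper (and in MadMiner) are machine-learning-based likelihood-ratio estimators, but the underlying "likelihood-free" setting can be illustrated with the classic rejection-ABC baseline: run the simulator at many parameter points and keep the draws whose summary statistic lands near the observed one. Everything below — the Gaussian toy simulator, the mean summary, the tolerance — is an illustrative assumption, not MadMiner's API or the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, n_events=200):
    # Hypothetical stand-in for the full simulation chain: events drawn from a
    # Gaussian whose mean is the theory parameter.
    return rng.normal(loc=theta, scale=1.0, size=n_events)

def summary(events):
    # Summary statistic used in place of the intractable likelihood.
    return events.mean()

# "Observed" data generated at a known true parameter value of 0.5.
observed = simulator(0.5)
obs_summary = summary(observed)

# Rejection ABC: keep prior draws whose simulated summary lands near the data.
prior_draws = rng.uniform(-2.0, 2.0, size=5000)
accepted = [t for t in prior_draws
            if abs(summary(simulator(t)) - obs_summary) < 0.05]

posterior_mean = float(np.mean(accepted))
```

The accepted draws approximate the posterior; the ML-based methods the paper develops replace this inefficient accept/reject step with learned likelihood-ratio surrogates.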
Doglioni, C.; Kim, D.; Stewart, G.A.; Silvestris, L.; Jackson, P.; Kamleh, W.
Journal Name: EPJ Web of Conferences
Sponsoring Org: National Science Foundation
More Like this
  1. Abernethy, Jacob; Agarwal, Shivani (Eds.)
    We study a variant of the sparse PCA (principal component analysis) problem in the “hard” regime, where the inference task is possible yet no polynomial-time algorithm is known to exist. Prior work, based on the low-degree likelihood ratio, has conjectured a precise expression for the best possible (sub-exponential) runtime throughout the hard regime. Following instead a statistical physics inspired point of view, we show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem. These free energy wells imply hitting time lower bounds that corroborate the low-degree conjecture: we show that a class of natural MCMC (Markov chain Monte Carlo) methods (with worst-case initialization) cannot solve sparse PCA with less than the conjectured runtime. These lower bounds apply to a wide range of values for two tuning parameters: temperature and sparsity misparametrization. Finally, we prove that the Overlap Gap Property (OGP), a structural property that implies failure of certain local search algorithms, holds in a significant part of the hard regime.
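A minimal sketch of the kind of local MCMC dynamics these hitting-time bounds concern: a Metropolis chain over k-sparse supports for a spiked Wigner matrix, sampling the Gibbs measure proportional to exp(β xᵀYx). All sizes, the signal-to-noise ratio, and the temperature are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, snr, beta = 60, 5, 8.0, 2.0

# Planted k-sparse spike plus symmetric Gaussian noise (a spiked Wigner toy model).
support = rng.choice(n, size=k, replace=False)
v = np.zeros(n)
v[support] = 1.0 / np.sqrt(k)
W = rng.normal(size=(n, n))
W = (W + W.T) / np.sqrt(2.0 * n)
Y = snr * np.outer(v, v) + W

def energy(x):
    # Quadratic form rewarded by the Gibbs measure exp(beta * x^T Y x).
    return x @ Y @ x

# Local Metropolis chain over k-sparse states, started from a random support
# (a stand-in for the worst-case initialization the lower bounds allow).
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = 1.0 / np.sqrt(k)
for _ in range(5000):
    i = rng.choice(np.flatnonzero(x))         # coordinate leaving the support
    j = rng.choice(np.flatnonzero(x == 0.0))  # coordinate entering the support
    y = x.copy()
    y[i], y[j] = 0.0, 1.0 / np.sqrt(k)
    if rng.random() < np.exp(min(0.0, beta * (energy(y) - energy(x)))):
        x = y

overlap = abs(x @ v)  # overlap 1.0 would mean the planted support was recovered
```

In the hard regime the paper studies, chains of exactly this local swap type get trapped in free energy wells and need the conjectured sub-exponential time to reach high overlap.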
  2. One of the key tasks of any particle collider is measurement. In practice, this is often done by fitting data to a simulation, which depends on many parameters. Sometimes, when the effects of varying different parameters are highly correlated, a large ensemble of data may be needed to resolve parameter-space degeneracies. An important example is measuring the top-quark mass, where other physical and unphysical parameters in the simulation must be profiled when fitting the top-quark mass parameter. We compare four different methodologies for top-quark mass measurement: a classical histogram fit similar to one commonly used in experiment, augmented by soft-drop jet grooming; a 2D profile likelihood fit with a nuisance parameter; a machine-learning method called DCTR; and a linear regression approach, either using a least-squares fit or with a dense linearly-activated neural network. Despite the fact that individual events are totally uncorrelated, we find that the linear regression methods work most effectively when we input an ensemble of events sorted by mass, rather than training them on individual events. Although all methods provide robust extraction of the top-quark mass parameter, the linear network does marginally best and is remarkably simple. For the top study, we conclude that the Monte-Carlo-based uncertainty on current extractions of the top-quark mass from LHC data can be reduced significantly (by perhaps a factor of 2) using networks trained on sorted event ensembles. More generally, machine learning from ensembles for parameter estimation has broad potential for collider physics measurements.
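The sorted-ensemble regression idea can be sketched in a few lines: each pseudo-experiment is an ensemble of toy mass-sensitive observables, sorted before being fed to an ordinary least-squares fit that maps the whole ensemble to the mass parameter. The Gaussian event model and all numbers below are placeholders, not the paper's simulation setup.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_ensemble(mass, n_events=100):
    # Toy stand-in for per-event mass-sensitive observables in one pseudo-experiment.
    events = rng.normal(loc=mass, scale=15.0, size=n_events)
    return np.sort(events)  # sorting gives the linear fit a stable feature ordering

# Training pseudo-experiments at known generator masses (numbers are illustrative).
masses = rng.uniform(165.0, 180.0, size=500)
X = np.array([make_ensemble(m) for m in masses])

# Ordinary least-squares linear map from the sorted ensemble to the mass parameter.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, masses, rcond=None)

def predict(ensemble):
    return float(np.append(ensemble, 1.0) @ coef)

pred = predict(make_ensemble(172.5))
```

Sorting turns an unordered set of events into order statistics with a fixed meaning per feature slot, which is why a plain linear map (or a linearly-activated network) can exploit the whole ensemble at once.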
  3. Inferring the input parameters of simulators from observations is a crucial challenge, with applications from epidemiology to molecular dynamics. Here we show a simple approach in the regime of sparse data and approximately correct models, which is common when trying to use an existing model to infer latent variables with observed data. This approach is based on the principle of maximum entropy (MaxEnt) and provably makes the smallest change in the latent joint distribution to fit new data. This method requires no likelihood or model derivatives, and its fit is insensitive to prior strength, removing the need to balance observed data fit with prior belief. The method requires the ansatz that data is fit in expectation, which is true in some settings and may be reasonable in all with few data points. The method is based on sample reweighting, so its asymptotic run time is independent of prior distribution dimension. We demonstrate this MaxEnt approach and compare with other likelihood-free inference methods across three systems: a point particle moving in a gravitational field, a compartmental model of epidemic spread, and finally a molecular dynamics simulation of a protein.
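The reweighting scheme can be sketched as follows: prior samples receive exponential weights exp(λ·g(x)), and the scalar Lagrange multiplier λ is tuned until the weighted expectation matches the observation. Bisection is used here as one possible root finder; the Gaussian prior samples and target value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Samples of an observable g(x) drawn from the prior (simulator) distribution.
g = rng.normal(loc=0.0, scale=1.0, size=10000)
target = 0.4  # observed expectation value the reweighted ensemble must match

def weighted_mean(lam):
    # Exponential tilt of the samples; MaxEnt makes the minimal change to the prior.
    w = np.exp(lam * g)
    w /= w.sum()
    return float((w * g).sum())

# The tilted mean is monotone increasing in lam, so scalar bisection finds the
# Lagrange multiplier regardless of the dimension of the underlying system.
lo, hi = -10.0, 10.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if weighted_mean(mid) < target:
        lo = mid
    else:
        hi = mid
lam = 0.5 * (lo + hi)
weights = np.exp(lam * g)
weights /= weights.sum()
```

Because only the samples are reweighted, the cost scales with the number of samples and observed constraints, not with the dimension of the prior distribution — the property the abstract highlights.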
  4. Abstract

    The goal of this work is to predict the effect of part geometry and process parameters on the instantaneous spatial distribution of heat, called the heat flux or thermal history, in metal parts as they are being built layer-by-layer using additive manufacturing (AM) processes. In pursuit of this goal, the objective of this work is to develop and verify a graph theory-based approach for predicting the heat flux in metal AM parts. This objective is consequential to overcome the current poor process consistency and part quality in AM. One of the main reasons for poor part quality in metal AM processes is ascribed to the heat flux in the part. For instance, constrained heat flux because of ill-considered part design leads to defects, such as warping and thermal stress-induced cracking. Existing non-proprietary approaches to predict the heat flux in AM at the part-level predominantly use mesh-based finite element analyses that are computationally tortuous — the simulation of a few layers typically requires several hours, if not days. Hence, to alleviate these challenges in metal AM processes, there is a need for efficient computational thermal models to predict the heat flux, and thereby guide part design and selection of process parameters instead of expensive empirical testing. Compared to finite element analysis techniques, the proposed mesh-free graph theory-based approach facilitates layer-by-layer simulation of the heat flux within a few minutes on a desktop computer.
To explore these assertions we conducted the following two studies: (1) comparing the heat diffusion trends predicted using the graph theory approach with finite element analysis and analytical heat transfer calculations based on Green’s functions, for an elementary cuboid geometry subjected to an impulse heat input in a certain part of its volume; and (2) simulating the layer-by-layer deposition of three part geometries in a laser powder bed fusion metal AM process with: (a) Goldak’s moving heat source finite element method, (b) the proposed graph theory approach, and (c) further comparing the heat flux predictions from the last two approaches with a commercial solution. From the first study we report that the heat flux trend approximated by the graph theory approach is accurate within 5% of the Green’s functions-based analytical solution (in terms of the symmetric mean absolute percentage error). Results from the second study show that the heat flux trends predicted for the AM parts using the graph theory approach agree with finite element analysis with errors less than 15%. More pertinently, the computational time for predicting the heat flux was significantly reduced with graph theory; for instance, in one of the AM case studies the time taken to predict the heat flux in a part was less than 3 minutes using the graph theory approach, compared to over 3 hours with finite element analysis. While this paper is restricted to theoretical development and verification of the graph theory approach for heat flux prediction, our forthcoming research will focus on experimental validation through in-process sensor-based heat flux measurements.

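A toy version of the mesh-free idea, assuming the simplest possible graph (a 1-D chain of nodes) and explicit Euler time stepping of the graph-Laplacian heat equation; the paper's actual graph construction and calibration for AM parts are more involved.

```python
import numpy as np

# Simplest "discretized part": a 1-D chain of nodes with graph Laplacian L = D - A.
n = 51
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Impulse heat input at the centre node, then explicit Euler steps of
# dT/dt = -alpha * L @ T; only the adjacency structure is used, no mesh.
T = np.zeros(n)
T[n // 2] = 1.0
alpha, dt = 1.0, 0.1  # dt * alpha must stay below 2 / lambda_max(L) for stability
for _ in range(500):
    T = T - alpha * dt * (L @ T)
```

Each step is one sparse matrix–vector product, which is why graph-based diffusion scales to layer-by-layer simulation in minutes where meshed finite element solves take hours.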

  5. We present cosmological parameter constraints based on a joint modelling of galaxy–lensing cross-correlations and galaxy clustering measurements in the SDSS, marginalizing over small-scale modelling uncertainties using mock galaxy catalogues, without explicit modelling of galaxy bias. We show that our modelling method is robust to the impact of different choices for how galaxies occupy dark matter haloes and to the impact of baryonic physics (at the $\sim 2{{\ \rm per\ cent}}$ level in cosmological parameters), and test for the impact of covariance on the likelihood analysis and of the survey window function on the theory computations. Applying our results to the measurements using galaxy samples from BOSS and lensing measurements using shear from SDSS galaxies and CMB lensing from Planck, with conservative scale cuts, we obtain $S_8\equiv \left(\frac{\sigma _8}{0.8228}\right)^{0.8}\left(\frac{\Omega _\mathrm{ m}}{0.307}\right)^{0.6}=0.85\pm 0.05$ (stat.) using LOWZ × SDSS galaxy lensing, and $S_8 = 0.91 \pm 0.10$ (stat.) using the combination of LOWZ and CMASS × Planck CMB lensing. We estimate the systematic uncertainty in the galaxy–galaxy lensing measurements to be $\sim 6{{\ \rm per\ cent}}$ (dominated by photometric redshift uncertainties) and in the galaxy–CMB lensing measurements to be $\sim 3{{\ \rm per\ cent}}$, from small-scale modelling uncertainties including baryonic physics.

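The $S_8$ combination quoted in that abstract can be evaluated directly; the pivot values 0.8228 and 0.307 come from the definition above, while the second pair of inputs is an illustrative nearby parameter point, not a measured value.

```python
def s8(sigma8, omega_m):
    # S8 = (sigma_8 / 0.8228)^0.8 * (Omega_m / 0.307)^0.6, as defined in the abstract.
    return (sigma8 / 0.8228) ** 0.8 * (omega_m / 0.307) ** 0.6

pivot = s8(0.8228, 0.307)  # exactly 1.0 at the pivot point, by construction
example = s8(0.78, 0.30)   # hypothetical nearby point, slightly below 1
```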