

Title: Parameter inference from event ensembles and the top-quark mass
Abstract: One of the key tasks of any particle collider is measurement. In practice, this is often done by fitting data to a simulation, which depends on many parameters. Sometimes, when the effects of varying different parameters are highly correlated, a large ensemble of data may be needed to resolve parameter-space degeneracies. An important example is measuring the top-quark mass, where other physical and unphysical parameters in the simulation must be profiled when fitting the top-quark mass parameter. We compare four different methodologies for top-quark mass measurement: a classical histogram fit similar to one commonly used in experiment, augmented by soft-drop jet grooming; a 2D profile likelihood fit with a nuisance parameter; a machine-learning method called DCTR; and a linear regression approach, either using a least-squares fit or with a dense linearly-activated neural network. Despite the fact that individual events are totally uncorrelated, we find that the linear regression methods work most effectively when we input an ensemble of events sorted by mass, rather than training them on individual events. Although all methods provide robust extraction of the top-quark mass parameter, the linear network does marginally best and is remarkably simple. For the top study, we conclude that the Monte-Carlo-based uncertainty on current extractions of the top-quark mass from LHC data can be reduced significantly (by perhaps a factor of 2) using networks trained on sorted event ensembles. More generally, machine learning from ensembles for parameter estimation has broad potential for collider physics measurements.
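The core trick described here, regressing the mass parameter from a sorted ensemble of events, is simple enough to sketch. The toy below is a minimal illustration, not the paper's pipeline: it assumes Gaussian-smeared reconstructed masses in place of a real simulation, and the helper make_ensemble and all numerical choices (ensemble size, smearing width, mass range) are hypothetical.

```python
# Minimal toy of linear regression on sorted event ensembles.
# Each "event" is one smeared reconstructed top mass; an ensemble of
# N_EVENTS masses, sorted, is a single regression input.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
N_EVENTS = 100      # events per ensemble (hypothetical choice)
N_ENSEMBLES = 5000  # training ensembles

def make_ensemble(m_top, n=N_EVENTS):
    """Toy stand-in for simulation: Gaussian smearing around m_top (GeV)."""
    return np.sort(rng.normal(loc=m_top, scale=15.0, size=n))  # sorting is the key step

# Train on ensembles generated at random mass points near 172.5 GeV.
m_true = rng.uniform(170.0, 175.0, size=N_ENSEMBLES)
X = np.stack([make_ensemble(m) for m in m_true])
reg = LinearRegression().fit(X, m_true)

# "Measure" the mass parameter of a pseudo-dataset generated at 172.9 GeV.
print(reg.predict(make_ensemble(172.9)[None, :]))  # close to 172.9
```

Sorting is what makes a linear map sensible: after sorting, the i-th entry approximates the i-th quantile of the mass distribution, and quantiles vary smoothly (nearly linearly over a narrow range) with the mass parameter, whereas unsorted events carry no usable per-slot structure.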
Journal Name: Journal of High Energy Physics
Sponsoring Org: National Science Foundation
More Like this
  1. The measurement of the charge asymmetry for highly boosted top quark pairs decaying to a single lepton and jets is presented. The analysis is performed using $138\,\text{fb}^{-1}$ of data collected in pp collisions at $\sqrt{s} = 13\,\text{TeV}$ with the CMS detector during Run 2 of the Large Hadron Collider. The selection is optimized for top quark-antiquark pairs produced with large Lorentz boosts, resulting in non-isolated leptons and overlapping jets. The top quark charge asymmetry is measured for events with $t\bar{t}$ invariant mass larger than 750 GeV and corrected for detector and acceptance effects using a binned maximum likelihood fit (a toy sketch of such a fit appears after this list). The measured top quark charge asymmetry is in good agreement with the standard model prediction at next-to-next-to-leading order in perturbation theory with next-to-leading order electroweak corrections. Differential distributions for two invariant mass ranges are also presented.
  2. Abstract: The rate for Higgs boson ($H$) production in association with either one ($tH$) or two ($t\bar{t}H$) top quarks is measured in final states containing multiple electrons, muons, or tau leptons decaying to hadrons and a neutrino, using proton–proton collisions recorded at a center-of-mass energy of $13\,\text{TeV}$ by the CMS experiment. The analyzed data correspond to an integrated luminosity of $137\,\text{fb}^{-1}$. The analysis is aimed at events that contain $H \to WW$, $H \to \tau\tau$, or $H \to ZZ$ decays and in which the top quark(s) decay to either the lepton+jets or the all-jets channel. Sensitivity to signal is maximized by including ten signatures in the analysis, depending on the lepton multiplicity. The separation among $tH$, $t\bar{t}H$, and the backgrounds is enhanced through machine-learning techniques and matrix-element methods. The measured production rates for the $t\bar{t}H$ and $tH$ signals correspond to $0.92 \pm 0.19\,\text{(stat)}^{+0.17}_{-0.13}\,\text{(syst)}$ and $5.7 \pm 2.7\,\text{(stat)} \pm 3.0\,\text{(syst)}$ of their respective standard model (SM) expectations. The corresponding observed (expected) significance amounts to 4.7 (5.2) standard deviations for $t\bar{t}H$, and to 1.4 (0.3) for $tH$ production. Assuming that the Higgs boson coupling to the tau lepton is equal in strength to its expectation in the SM, the coupling $y_t$ of the Higgs boson to the top quark divided by its SM expectation, $\kappa_t = y_t / y_t^{\mathrm{SM}}$, is constrained to be within $-0.9 < \kappa_t < -0.7$ or $0.7 < \kappa_t < 1.1$, at 95% confidence level. This result is the most sensitive measurement of the $t\bar{t}H$ production rate to date.
  3. Propensity score methods account for selection bias in observational studies. However, the consistency of the propensity score estimators strongly depends on a correct specification of the propensity score model. Logistic regression and, with increasing popularity, machine learning tools are used to estimate propensity scores. We introduce a stacked generalization ensemble learning approach to improve propensity score estimation by fitting a meta learner on the predictions of a suitable set of diverse base learners. We perform a comprehensive Monte Carlo simulation study, implementing a broad range of scenarios that mimic characteristics of typical data sets in educational studies. The population average treatment effect is estimated using the propensity score in Inverse Probability of Treatment Weighting (a minimal sketch of this stacking-plus-weighting workflow appears after this list). Our proposed stacked ensembles, especially using gradient boosting machines as a meta learner trained on a set of 12 base learner predictions, led to superior reduction of bias compared to the current state of the art in propensity score estimation. Further, our simulations imply that commonly used balance measures (averaged standardized absolute mean differences) might be misleading as propensity score model selection criteria. We apply our proposed model, which we call GBM-Stack, to assess the population average treatment effect of a Supplemental Instruction (SI) program in an introductory psychology (PSY 101) course at San Diego State University. Our analysis provides evidence that moving the whole population to SI attendance would on average lead to 1.69 times higher odds of passing the PSY 101 class compared to not offering SI, with a 95% bootstrap confidence interval of (1.31, 2.20).
  4. Abstract: The production cross-section of a top quark in association with a W boson is measured using proton–proton collisions at $\sqrt{s} = 8\,\text{TeV}$. The dataset corresponds to an integrated luminosity of $20.2\,\text{fb}^{-1}$, and was collected in 2012 by the ATLAS detector at the Large Hadron Collider at CERN. The analysis is performed in the single-lepton channel. Events are selected by requiring one isolated lepton (electron or muon) and at least three jets. A neural network is trained to separate the tW signal from the dominant $t\bar{t}$ background. The cross-section is extracted from a binned profile maximum-likelihood fit to a two-dimensional discriminant built from the neural-network output and the invariant mass of the hadronically decaying W boson (a simplified one-dimensional version of such a template fit is sketched after this list). The measured cross-section is $\sigma_{tW} = 26 \pm 7\,\text{pb}$, in good agreement with the Standard Model expectation.
  5. Recent self-propagating malware (SPM) campaigns compromised hundreds of thousands of victim machines on the Internet. It is challenging to detect these attacks in their early stages, as adversaries utilize common network services, use novel techniques, and can evade existing detection mechanisms. We propose PORTFILER (PORT-Level Network Traffic ProFILER), a new machine learning system applied to network traffic for detecting SPM attacks. PORTFILER extracts port-level features from the Zeek connection logs collected at the border of a monitored network, applies anomaly detection techniques to identify suspicious events, and ranks the alerts across ports for investigation by the Security Operations Center (SOC); a toy version of this port-level scoring appears after this list. We propose a novel ensemble methodology for aggregating individual models in PORTFILER that increases resilience against several evasion strategies compared to standard ML baselines. We extensively evaluate PORTFILER on traffic collected from two university networks, and show that it can detect SPM attacks with different patterns, such as WannaCry and Mirai, and performs well under evasion. Ranking across ports achieves precision over 0.94 and false positive rates below $8 \times 10^{-4}$ in the top 100 highly ranked alerts. When deployed on the university networks, PORTFILER detected anomalous SPM-like activity on one of the campus networks, confirmed by the university SOC as malicious. PORTFILER also detected a Mirai attack recreated on the two university networks with higher precision and recall than deep-learning-based autoencoder methods.
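As flagged in item 1, here is a deliberately tiny binned maximum-likelihood fit for a charge asymmetry. It is a two-bin toy with invented counts and no detector corrections or nuisance parameters, nothing like the CMS analysis in scale:

```python
# Two-bin Poisson maximum-likelihood fit for a charge asymmetry A_C.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

n_fwd, n_bwd = 5210, 4990  # invented counts: Delta|y| > 0 and Delta|y| < 0
n_tot = n_fwd + n_bwd

def nll(a_c):
    """Negative log-likelihood: expected yields split by the asymmetry."""
    mu_fwd = 0.5 * n_tot * (1.0 + a_c)
    mu_bwd = 0.5 * n_tot * (1.0 - a_c)
    return -(poisson.logpmf(n_fwd, mu_fwd) + poisson.logpmf(n_bwd, mu_bwd))

fit = minimize_scalar(nll, bounds=(-0.5, 0.5), method="bounded")
print(f"A_C = {fit.x:.4f}")  # matches (n_fwd - n_bwd) / n_tot analytically
```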
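For item 3, a compact sketch of the stacking-then-IPTW workflow on synthetic data. Only two base learners are used here and the data-generating process is invented; the paper's GBM-Stack stacks 12 base learners under a gradient boosting meta learner:

```python
# Stacked propensity-score estimation feeding an IPTW estimate of the ATE.
import numpy as np
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 4))                                   # covariates
t = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))               # treatment (e.g. SI attendance)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * t + X[:, 1]))))   # binary outcome (pass/fail)

# Meta learner (gradient boosting) fit on cross-validated base-learner probabilities.
stack = StackingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    final_estimator=GradientBoostingClassifier(random_state=0),
    stack_method="predict_proba",
)
e = stack.fit(X, t).predict_proba(X)[:, 1]                    # propensity scores

# Inverse Probability of Treatment Weighting estimate of the ATE.
ate = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
print(f"IPTW ATE estimate: {ate:.3f}")
```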
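For item 4, a stripped-down version of a binned template fit for a signal strength. The discriminant is one-dimensional, the templates and pseudo-data are invented, and no systematic uncertainties are profiled, unlike the actual ATLAS fit:

```python
# Binned Poisson template fit: extract signal strength mu from a discriminant.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Invented templates: signal (tW-like) peaks at high discriminant values,
# background (ttbar-like) falls off.
sig_t = np.array([1, 2, 4, 8, 16, 25, 30, 25, 15, 8], dtype=float)
bkg_t = np.array([40, 35, 30, 25, 20, 15, 10, 6, 3, 1], dtype=float)

rng = np.random.default_rng(3)
data = rng.poisson(1.1 * sig_t + bkg_t)  # pseudo-data thrown at mu_true = 1.1

def nll(mu):
    """Poisson negative log-likelihood with background normalization fixed."""
    return -poisson.logpmf(data, mu * sig_t + bkg_t).sum()

fit = minimize_scalar(nll, bounds=(0.0, 5.0), method="bounded")
print(f"fitted signal strength mu = {fit.x:.2f}")  # sigma = mu * sigma_predicted
```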
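For item 5, a toy rendition of port-level anomaly scoring. The real PORTFILER builds multiple features per port from Zeek conn logs and aggregates several models in an ensemble; here a single invented feature (connection counts per time window) is scored with z-scores and the ports are ranked:

```python
# Toy port-level anomaly scoring: z-score today's counts against a baseline.
import numpy as np

rng = np.random.default_rng(2)
N_PORTS, N_WINDOWS = 8, 200
# Invented feature: connection counts per (port, time window).
baseline = rng.poisson(lam=50, size=(N_PORTS, N_WINDOWS))  # historical traffic
today = rng.poisson(lam=50, size=N_PORTS)
today[3] = 400  # simulated SPM-like scanning burst on one port

mu, sigma = baseline.mean(axis=1), baseline.std(axis=1)
scores = (today - mu) / sigma  # per-port anomaly scores

# Rank alerts across ports, highest score first (top 3 shown).
for port in np.argsort(scores)[::-1][:3]:
    print(f"port index {int(port)}: anomaly score {scores[port]:.1f}")
```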