Title: The Machine Learning landscape of top taggers
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. Unlike most established methods they rely on low-level input, for instance calorimeter output. While their network architectures are vastly different, their performance is comparatively similar. In general, we find that these new approaches are extremely powerful and great fun.
Award ID(s):
1836650
PAR ID:
10167451
Journal Name:
SciPost Physics
Volume:
7
Issue:
1
ISSN:
2542-4653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Drawing from social capital theory, this study examines the extent to which stable versus new friendship patterns affect low-income students’ educational aspirations in urban and rural high schools. Using whole-school sociometric data (744 high school students over a two-year period), this study applies a social influence model to determine the effects of stable and newly established friendships on conformity regarding college-going aspirations. Findings indicate that urban students have more new friends and that their educational aspirations increased, conforming to those of their newly established friends. In contrast, rural students have more stable friendships than urban students, and their educational aspirations conformed to those of their stable friends. This work shows that rural students tend not to change their school network size or nominations. Urban students, however, are more willing to include new students in their school networks, which has a positive effect on their educational aspirations.
  2. Continuous integration (CI) is a well-established technique in commercial and open-source software projects, although not routinely used in scientific publishing. In the scientific software context, CI can serve two functions to increase reproducibility of scientific results: providing an established platform for testing the reproducibility of these results, and demonstrating to other scientists how the code and data generate the published results. We explore scientific software testing and CI strategies using two articles published in the areas of applied mathematics and computational physics. We discuss lessons learned from reproducing these articles as well as examine and discuss existing tests. We introduce the notion of a scientific test as one that produces computational results from a published article. We then consider full result reproduction within a CI environment. If authors find their work too time or resource intensive to easily adapt to a CI context, we recommend the inclusion of results from reduced versions of their work (e.g., run at lower resolution, with shorter time scales, with smaller data sets) alongside their primary results within their article. While these smaller versions may be less interesting scientifically, they can serve to verify that published code and data are working properly. We demonstrate such reduction tests on the two articles studied. 
  3. Allen, Genevra (Ed.)
    Throughout the last decade, random forests have established themselves as among the most accurate and popular supervised learning methods. While their black-box nature has made their mathematical analysis difficult, recent work has established important statistical properties like consistency and asymptotic normality by considering subsampling in lieu of bootstrapping. Though such results open the door to traditional inference procedures, all formal methods suggested thus far place severe restrictions on the testing framework and their computational overhead often precludes their practical scientific use. Here we propose a hypothesis test to formally assess feature significance, which uses permutation tests to circumvent computationally infeasible estimates of nuisance parameters. This test is intended to be analogous to the F-test for linear regression. We establish asymptotic validity of the test via exchangeability arguments and show that the test maintains high power with orders of magnitude fewer computations. Importantly, the procedure scales easily to big data settings where large training and testing sets may be employed, conducting statistically valid inference without the need to construct additional models. Simulations and applications to ecological data, where random forests have recently shown promise, are provided. 
  4. Electron holes (EH) are localized modes in plasma kinetic theory which appear as vortices in phase space. Earlier research on EH is based on the Schamel distribution function (df). A novel df is proposed here, generalizing the original Schamel df in a recursive manner. Nonlinear solutions obtained by kinetic simulations are presented, with velocities twice the electron thermal speed. Using 1D-1V kinetic simulations, their propagation characteristics are traced and their stability is established by studying their long-time evolution and their behavior through mutual collisions.
  5. HPC-ED is working to improve discovery and sharing of CyberTraining resources through the combination of the HPC-ED CyberTraining Catalog, an effective and flexible interface, thoughtful metadata design, and active community participation. HPC-ED encourages authors to share training resource information while retaining ownership and allows organizations to enrich their local portals with shared materials. By basing the architecture on an established, flexible framework, HPC-ED can provide a range of solutions people and organizations can employ for sharing and discovering materials. In this paper we describe the initial pilot phase of the project, where we prototyped the HPC-ED catalog, established an initial metadata set, provided documentation, and began using the system to share and discover materials. We gathered community feedback through a variety of means, and are now planning an implementation phase based on evolving our architecture and tools to meet community needs and feedback through improved interfaces and tools designed to address a range of preferences. 