-
Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Hal Daumé III (Ed.)
Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is …
-
Free, publicly-accessible full text available January 1, 2023
-
Free, publicly-accessible full text available August 1, 2022
-
Abstract: Jet production in lead-lead (PbPb) and proton-proton (pp) collisions at a nucleon-nucleon center-of-mass energy of 5.02 TeV is studied with the CMS detector at the LHC, using PbPb and pp data samples corresponding to integrated luminosities of 404 μb⁻¹ and 27.4 pb⁻¹, respectively. Jets with different areas are reconstructed using the anti-kT algorithm by varying the distance parameter R. The measurements are performed using jets with transverse momenta (pT) greater than 200 GeV and in a pseudorapidity range of |η| < 2. To reveal the medium …
-
Abstract: We present the first study of charged-hadron production associated with jets originating from b quarks in proton-proton collisions at a center-of-mass energy of 5.02 TeV. The data sample used in this study was collected with the CMS detector at the CERN LHC and corresponds to an integrated luminosity of 27.4 pb⁻¹. To characterize the jet substructure, the differential jet shapes, defined as the normalized transverse momentum distribution of charged hadrons as a function of angular distance from the jet axis, are measured for b jets. In addition to the jet shapes, the per-jet yields of …
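
The jet-shape definition quoted in this abstract can be illustrated with a minimal sketch. It is not the analysis code: it handles a single jet rather than an average over many jets, and the function names (`delta_r`, `jet_shape`), the track tuple layout, and the choice to normalize to the pT inside the outermost bin edge are assumptions made for illustration only.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def jet_shape(jet_axis, tracks, bin_edges):
    """Normalized pT profile of charged tracks around one jet axis.

    jet_axis  -- (eta, phi) of the jet (hypothetical input layout)
    tracks    -- iterable of (pt, eta, phi) for charged hadrons
    bin_edges -- increasing radii defining annuli around the axis

    Returns one density per annulus, scaled so that the sum of
    density * bin width equals 1 (assumed normalization convention).
    """
    n_bins = len(bin_edges) - 1
    pt_sums = [0.0] * n_bins
    for pt, eta, phi in tracks:
        dr = delta_r(jet_axis[0], jet_axis[1], eta, phi)
        for i in range(n_bins):
            if bin_edges[i] <= dr < bin_edges[i + 1]:
                pt_sums[i] += pt
                break  # tracks beyond the last edge are ignored
    total = sum(pt_sums)
    if total == 0.0:
        return [0.0] * n_bins
    return [
        pt_sums[i] / total / (bin_edges[i + 1] - bin_edges[i])
        for i in range(n_bins)
    ]


if __name__ == "__main__":
    # Toy tracks, invented purely to exercise the function.
    toy_tracks = [(30.0, 0.02, 0.0), (10.0, 0.0, 0.15), (5.0, 2.0, 0.0)]
    print(jet_shape((0.0, 0.0), toy_tracks, [0.0, 0.1, 0.2]))
```

A measurement would sum the numerator and denominator over all selected jets before dividing; the single-jet form here only shows how the annular binning and normalization fit together.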