Search for: All records

Creators/Authors contains: "Wang, Xing"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available October 19, 2023
  2. Free, publicly-accessible full text available August 24, 2023
  3. Free, publicly-accessible full text available August 1, 2023
  4. Free, publicly-accessible full text available February 1, 2023
  5. Free, publicly-accessible full text available February 2, 2023
  6. In this paper, we consider hybrid parallelism, a paradigm that employs both Data Parallelism (DP) and Model Parallelism (MP), to scale distributed training of large recommendation models. We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training. DCT filters the entities to be communicated across the network through a simple hard-thresholding function, allowing only the most relevant information to pass through. For communication-efficient DP, DCT compresses the parameter gradients sent to the parameter server during model synchronization. The threshold is updated only once every few thousand iterations to reduce the computational overhead of compression. For communication-efficient MP, DCT incorporates a novel technique to compress the activations and gradients sent across the network during the forward and backward propagation, respectively. This is done by identifying and updating only the most relevant neurons of the neural network for each training sample in the data. We evaluate DCT on publicly available natural language processing and recommender models and datasets, as well as recommendation systems used in production at Facebook. DCT reduces communication by at least 100× and 20× during DP and MP, respectively. The algorithm has been deployed in production, and it improves end-to-end training time for a state-of-the-art industrial recommender model by 37%, without any loss in performance.
  7. In this review, we consider a general theoretical framework for fermionic color-singlet states (including a singlet, a doublet, and a triplet under the Standard Model SU(2)_L gauge symmetry, corresponding to the bino, higgsino, and wino in supersymmetric theories), generically dubbed electroweakinos for their mass eigenstates. Depending on the relations among these states’ three mass parameters and their mixing after electroweak symmetry breaking, this sector leads to a rich phenomenology that may be accessible in current and near-future experiments. We discuss the decay patterns of electroweakinos and their observable signatures at colliders, review the existing bounds on the model parameters, and summarize the current status of the comprehensive searches by the ATLAS and CMS Collaborations at the Large Hadron Collider. We also comment on the prospects for future colliders. An important feature of the theory is that the lightest neutral electroweakino can be identified as a weakly interacting massive particle cold dark matter candidate. We take into account the existing bounds on the parameters from the dark matter direct detection experiments and discuss the complementarity of the electroweakino searches at colliders.
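
The hard-thresholding idea in the DCT abstract (item 6) can be illustrated with a small sketch. The abstract does not say how the threshold is chosen, so the magnitude-quantile rule below is an assumption, and the names `ThresholdCompressor`, `keep_fraction`, and `refresh_every` are hypothetical, not taken from the paper. The sketch shows only the two stated properties: values below a threshold are dropped before communication, and the threshold is recomputed only once every fixed number of iterations.

```python
def top_fraction_threshold(values, keep_fraction):
    """Return the magnitude that roughly the top `keep_fraction` of values reach.

    Assumption: a quantile of |value| is used as the threshold; the abstract
    does not specify the selection rule.
    """
    magnitudes = sorted((abs(v) for v in values), reverse=True)
    k = max(1, int(len(magnitudes) * keep_fraction))
    return magnitudes[k - 1]


def hard_threshold(values, threshold):
    """Keep only entries with |value| >= threshold, as sparse (index, value) pairs."""
    return [(i, v) for i, v in enumerate(values) if abs(v) >= threshold]


class ThresholdCompressor:
    """Hypothetical helper: reuses a cached threshold between periodic refreshes."""

    def __init__(self, keep_fraction=0.01, refresh_every=1000):
        self.keep_fraction = keep_fraction
        self.refresh_every = refresh_every
        self.threshold = None
        self.step = 0

    def compress(self, values):
        # Recompute the (comparatively expensive) threshold only on refresh steps.
        if self.threshold is None or self.step % self.refresh_every == 0:
            self.threshold = top_fraction_threshold(values, self.keep_fraction)
        self.step += 1
        return hard_threshold(values, self.threshold)


compressor = ThresholdCompressor(keep_fraction=0.5, refresh_every=2)
first = compressor.compress([0.1, -0.9, 0.05, 0.4])   # threshold computed here
second = compressor.compress([0.5, 0.3, -0.45, 0.0])  # cached threshold reused
third = compressor.compress([1.0, 2.0, 3.0, 4.0])     # threshold refreshed
```

Sending sparse (index, value) pairs instead of the dense vector is what reduces traffic; caching the threshold trades some selection accuracy between refreshes for lower per-iteration overhead.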