SciTokens SSH is a pluggable authentication module (PAM) that uses JSON Web Tokens (JWTs) for authentication to the Secure Shell (SSH) remote login service. SciTokens SSH supports multiple token issuers with local token verification, so scientific computing providers are not forced to rely on a single OAuth server for token issuance and verification. The decentralized design for SciTokens SSH was motivated by the distributed nature of scientific computing environments, where scientists use computational resources from multiple providers, with a variety of security policies, distributed across the globe.
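As a rough sketch of the local-verification idea, the Python fragment below checks an incoming JWT against a locally configured set of trusted issuers using the PyJWT library. It is an illustrative sketch, not the SciTokens SSH implementation: the issuer list, JWKS URLs, function name, and accepted algorithms are assumptions made for the example.

```python
import jwt  # PyJWT
from jwt import PyJWKClient

# Hypothetical local configuration: each trusted issuer maps to its JWKS endpoint.
# A real deployment would read this from the PAM module's configuration.
TRUSTED_ISSUERS = {
    "https://issuer-a.example.org": "https://issuer-a.example.org/.well-known/jwks.json",
    "https://issuer-b.example.org": "https://issuer-b.example.org/.well-known/jwks.json",
}

def verify_locally(token: str, expected_audience: str) -> dict:
    """Verify a JWT against whichever configured issuer minted it,
    without contacting a central verification service."""
    # Peek at the unverified 'iss' claim only to select the issuer's keys.
    unverified = jwt.decode(token, options={"verify_signature": False})
    issuer = unverified.get("iss")
    if issuer not in TRUSTED_ISSUERS:
        raise PermissionError(f"issuer not trusted: {issuer}")

    # Fetch the issuer's published public keys, then do full verification locally:
    # signature, expiry, audience, and issuer are all checked on this host.
    signing_key = PyJWKClient(TRUSTED_ISSUERS[issuer]).get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256", "ES256"],
        audience=expected_audience,
        issuer=issuer,
    )
```

Because each issuer is verified with its own published keys, a site can accept tokens from several providers while keeping the trust decision local.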
To token or not to token: A Comparative Study of Text Representations for Cross-Lingual Transfer
- Award ID(s): 2125466
- PAR ID: 10561634
- Publisher / Repository: Association for Computational Linguistics
- Date Published:
- Page Range / eLocation ID: 67 to 84
- Format(s): Medium: X
- Location: Singapore
- Sponsoring Org: National Science Foundation
More Like this
Vlachos, Andreas; Augenstein, Isabelle (Eds.): Parameter-efficient tuning aims at updating only a small subset of parameters when adapting a pretrained model to downstream tasks. In this work, we introduce PASTA, in which we only modify the special token representations (e.g., [SEP] and [CLS] in BERT) before the self-attention module at each layer in Transformer-based models. PASTA achieves comparable performance to fine-tuning in natural language understanding tasks including text classification and NER with no more than 0.029% of total parameters trained. Our work not only provides a simple yet effective way of parameter-efficient tuning, which has a wide range of practical applications when deploying fine-tuned models for multiple tasks, but also demonstrates the pivotal role of special tokens in pretrained language models.
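As a rough PyTorch sketch of the mechanism described above (not the authors' implementation; module and variable names are invented), a per-layer trainable vector can be added to the hidden states at special-token positions before self-attention, with all pretrained weights left frozen:

```python
import torch
import torch.nn as nn

class SpecialTokenShift(nn.Module):
    """Per-layer trainable shift applied only at special-token positions
    (e.g., [CLS] and [SEP]) before the layer's self-attention module.
    Illustrative sketch of the general idea, not the PASTA codebase."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # The only trainable parameters for this layer: one shift vector.
        self.shift = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, hidden_states: torch.Tensor, special_mask: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden); special_mask: (batch, seq_len), bool
        return hidden_states + special_mask.unsqueeze(-1).to(hidden_states.dtype) * self.shift


# Toy usage: pretrained weights stay frozen; only the shift vectors are updated.
batch, seq_len, hidden = 2, 8, 768
layer_shift = SpecialTokenShift(hidden)
hidden_states = torch.randn(batch, seq_len, hidden)
special_mask = torch.zeros(batch, seq_len, dtype=torch.bool)
special_mask[:, 0] = True  # pretend position 0 holds [CLS]
shifted = layer_shift(hidden_states, special_mask)
```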
Token-free language models learn directly from raw bytes and remove the inductive bias of subword tokenization. Operating on bytes, however, results in significantly longer sequences. In this setting, standard autoregressive Transformers scale poorly, as the effective memory required grows with sequence length. The recent development of the Mamba state space model (SSM) offers an appealing alternative approach with a fixed-sized memory state and efficient decoding. We propose MambaByte, a token-free adaptation of the Mamba SSM trained autoregressively on byte sequences. In terms of modeling, we show MambaByte to be competitive with, and even to outperform, state-of-the-art subword Transformers on language modeling tasks while maintaining the benefits of token-free language models, such as robustness to noise. In terms of efficiency, we develop an adaptation of speculative decoding with tokenized drafting and byte-level verification. This results in a 2.6× inference speedup over the standard MambaByte implementation, with decoding efficiency similar to that of the subword Mamba. These findings establish the viability of SSMs in enabling token-free language modeling.
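For a concrete sense of the byte-level setting described above, the short snippet below shows the input representation a token-free model consumes: a fixed alphabet of 256 byte values with no subword tokenizer, at the cost of noticeably longer sequences. This is a generic illustration, not MambaByte's preprocessing pipeline.

```python
# Byte-level "tokenization": the vocabulary is just the 256 possible byte values.
text = "Token-free models read raw UTF-8 bytes, même les caractères accentués."
byte_ids = list(text.encode("utf-8"))

# No subword vocabulary and no out-of-vocabulary handling,
# but the sequence is much longer than a subword encoding would be.
print(f"{len(text.split())} whitespace-separated words -> {len(byte_ids)} byte tokens")
print(byte_ids[:16])  # [84, 111, 107, 101, 110, 45, 102, 114, 101, 101, 32, ...]
```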