This content will become publicly available on June 5, 2026
KV cache is 1 bit per channel: efficient large language model inference with coupled quantization
- Award ID(s): 2434166
- PAR ID: 10614467
- Publisher / Repository: Curran Associates Inc.
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
