$S^3$: Increasing GPU Utilization during Generative Inference for Higher Throughput
- Award ID(s): 2118985
- PAR ID: 10484983
- Publisher / Repository: Curran Associates
- Date Published:
- Journal Name: Advances in Neural Information Processing Systems
- ISSN: 1049-5258
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation