A Queueing Theoretic Perspective on Low-Latency LLM Inference with Variable Token Length
- PAR ID: 10609627
- Publisher / Repository: 22nd International Symposium on Modeling and Optimization in Mobile, Ad hoc, and Wireless Networks (WiOpt)
- Date Published:
- Format(s): Medium: X
- Location: Seoul, South Korea
- Sponsoring Org: National Science Foundation

