An optimization technique for LLM serving that allows multiple requests to be processed efficiently.
Latent Space