Constrained Decoding

Concept

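A technique supported by SGLang that constrains the model's output with a finite state machine derived from a schema (for example a JSON schema, via tools like XGrammar). At each step, only tokens the state machine allows can be generated, so the output always conforms to the schema. It can also speed up decoding: wherever the schema permits exactly one continuation, those tokens can be emitted directly rather than generated by the model one at a time.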
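
The idea can be illustrated with a minimal toy sketch. It is not SGLang's or XGrammar's API: the names `build_fsm`, `constrained_decode`, and `toy_score` are hypothetical, tokens are single characters, and the "schema" is just a list of accepted strings. It shows the two effects described above: disallowed tokens are never chosen, and states with a single allowed token are emitted without calling the model.

```python
from typing import Callable, Dict, List, Set, Tuple

# state -> {allowed token: next state}
Fsm = Dict[int, Dict[str, int]]


def build_fsm(accepted: List[str]) -> Tuple[Fsm, Set[int]]:
    """Build a trie-shaped FSM over characters that accepts the given strings.

    Hypothetical stand-in for compiling a schema; assumes no accepted
    string is a prefix of another.
    """
    fsm: Fsm = {0: {}}
    finals: Set[int] = set()
    for text in accepted:
        state = 0
        for ch in text:
            if ch not in fsm[state]:
                new_state = len(fsm)
                fsm[new_state] = {}
                fsm[state][ch] = new_state
            state = fsm[state][ch]
        finals.add(state)
    return fsm, finals


def constrained_decode(
    fsm: Fsm, finals: Set[int], score: Callable[[str, str], float]
) -> Tuple[str, int]:
    """Greedy decoding restricted to FSM-allowed tokens.

    Returns the generated text and the number of times the model
    (the `score` function) was actually consulted.
    """
    out, state, model_calls = "", 0, 0
    while state not in finals:
        allowed = fsm[state]
        if len(allowed) == 1:
            # Only one legal continuation: emit it without asking the model.
            token = next(iter(allowed))
        else:
            # Several legal continuations: ask the model, but only
            # compare tokens the FSM allows (everything else is masked).
            model_calls += 1
            token = max(allowed, key=lambda t: score(out, t))
        out += token
        state = allowed[token]
    return out, model_calls


def toy_score(prefix: str, token: str) -> float:
    """Hypothetical model: it prefers 'z', which the FSM never allows."""
    return {"z": 0.95, "f": 0.9, "t": 0.1}.get(token, 0.0)


fsm, finals = build_fsm(['{"ok":true}', '{"ok":false}'])
text, calls = constrained_decode(fsm, finals, toy_score)
print(text, calls)  # {"ok":false} 1
```

In this sketch the model is consulted once, at the only point where the output can branch (`true` vs. `false`); the other eleven characters are forced by the state machine. Real systems operate on tokenizer vocabularies rather than characters, so treat this only as an illustration of the principle.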
