
KV Cache

Concept · Mentioned in 1 video

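A caching mechanism for Transformer decoders that stores the attention keys and values already computed for earlier tokens. During autoregressive generation, each new step computes keys and values only for the newest token and reuses the cached ones, which avoids recomputing them over the whole prefix at the cost of memory that grows with sequence length.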
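To make the idea concrete, here is a minimal single-head sketch in NumPy. It is an illustration under simplifying assumptions (one attention head, no batching, random weights), not any particular library's implementation. The names `KVCache`, `attend_step`, and `causal_attention_full` are hypothetical and chosen only for this example.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def causal_attention_full(X, Wq, Wk, Wv):
    """Baseline: recompute Q, K, V for every position (no cache)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    T = X.shape[0]
    scores = Q @ K.T / np.sqrt(K.shape[1])
    # Causal mask: position i may not attend to positions j > i.
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ V


class KVCache:
    """Hypothetical cache holding keys/values of all tokens seen so far."""

    def __init__(self, d_head):
        self.K = np.zeros((0, d_head))
        self.V = np.zeros((0, d_head))

    def append(self, k, v):
        # Each call adds one row: the newest token's key and value.
        self.K = np.vstack([self.K, k])
        self.V = np.vstack([self.V, v])


def attend_step(x_t, Wq, Wk, Wv, cache):
    """One decoding step: project only the newest token, reuse cached K/V."""
    q = x_t @ Wq
    cache.append(x_t @ Wk, x_t @ Wv)
    scores = cache.K @ q / np.sqrt(q.shape[0])
    return softmax(scores) @ cache.V


# Small demo with random data (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
T, d_model, d_head = 5, 8, 4
X = rng.normal(size=(T, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))

full = causal_attention_full(X, Wq, Wk, Wv)

cache = KVCache(d_head)
stepwise = np.stack([attend_step(X[t], Wq, Wk, Wv, cache) for t in range(T)])
```

In this sketch the cached path produces the same outputs as the full recomputation, but each step projects only one token and performs one row of attention rather than rebuilding the whole score matrix. Real implementations typically preallocate the cache and keep one per layer and head instead of growing arrays with `np.vstack`.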