Multi-head Latent Attention
Concept
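Multi-head Latent Attention (MLA) is a DeepSeek innovation that compresses the key and value projections into a single low-rank latent vector per token. Only that latent is stored in the KV cache, instead of full per-head keys and values, which reduces cache memory during inference.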
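The compression idea can be illustrated with a minimal pure-Python sketch. All sizes, weight names (`W_down`, `W_up_k`, `W_up_v`), and helper functions here are hypothetical choices for illustration, not DeepSeek's actual configuration or code, and details of the real architecture, such as positional encoding, are left out. The sketch only shows the core trade: cache a small latent per token, then expand it to keys and values when attention is computed.

```python
import random


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random weights (stand-in for learned weights)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Illustrative sizes only (assumptions, far smaller than any real model).
D_MODEL = 16    # hidden size of the token representation
N_HEADS = 4     # number of attention heads
HEAD_DIM = 4    # dimension of each head's key / value
D_LATENT = 4    # size of the compressed latent that gets cached

rng = random.Random(0)
W_down = random_matrix(D_LATENT, D_MODEL, rng)             # hidden -> latent
W_up_k = random_matrix(N_HEADS * HEAD_DIM, D_LATENT, rng)  # latent -> all keys
W_up_v = random_matrix(N_HEADS * HEAD_DIM, D_LATENT, rng)  # latent -> all values


def compress(hidden):
    """Down-project a token's hidden state; this latent is what the cache stores."""
    return matvec(W_down, hidden)


def expand(latent):
    """Up-project a cached latent back into per-head keys and values."""
    return matvec(W_up_k, latent), matvec(W_up_v, latent)


hidden = [rng.uniform(-1.0, 1.0) for _ in range(D_MODEL)]
latent = compress(hidden)
keys, values = expand(latent)

# Floats cached per token, per layer, under each scheme.
standard_cache_floats = 2 * N_HEADS * HEAD_DIM  # full keys + full values
latent_cache_floats = D_LATENT                  # one shared latent
reduction = standard_cache_floats / latent_cache_floats

print(f"standard cache: {standard_cache_floats} floats per token")
print(f"latent cache:   {latent_cache_floats} floats per token")
print(f"reduction:      {reduction:.1f}x")
```

With these toy sizes the cache holds 4 floats per token instead of 32. The saving depends entirely on how small the latent is relative to the combined key and value width, and a smaller latent trades memory for reconstruction capacity.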
Mentioned in 1 video