Sigmoid Attention
Study / Research
A recent paper that studies the distribution of logits and attention weights, relevant to improving long-context handling in language models.