flash attention
An optimized implementation of transformer attention that computes exact attention in tiles, avoiding materialization of the full attention matrix in GPU memory and thereby reducing memory traffic.
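The core idea can be illustrated with a minimal NumPy sketch (an illustrative simplification, not the actual CUDA kernel): process keys and values in blocks while maintaining a running row-max and running softmax denominator (the "online softmax" trick), so the full score matrix is never held in memory at once.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Materializes the full (n x n) score matrix -- the memory cost
    # that flash attention avoids.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def tiled_attention(Q, K, V, block=4):
    # Processes K/V in blocks with an online softmax: only an
    # (n x block) tile of scores exists at any time.
    n, d = Q.shape
    out = np.zeros_like(Q)
    m = np.full(n, -np.inf)   # running row max
    l = np.zeros(n)           # running softmax denominator
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = Q @ Kb.T / np.sqrt(d)              # score tile only
        m_new = np.maximum(m, S.max(axis=-1))
        scale = np.exp(m - m_new)              # rescale earlier partials
        P = np.exp(S - m_new[:, None])
        l = l * scale + P.sum(axis=-1)
        out = out * scale[:, None] + P @ Vb
        m = m_new
    return out / l[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((8, 16)) for _ in range(3))
assert np.allclose(naive_attention(Q, K, V), tiled_attention(Q, K, V))
```

Both functions return the same exact result; the difference is peak memory, which is why flash attention is an optimization rather than an approximation.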
Videos Mentioning flash attention

Answer.ai & AI Magic with Jeremy Howard
Latent Space
An optimized attention mechanism for Transformers; the discussion covered its compatibility issues with newer versions of the Transformers library.

Building an open AI company - with Ce and Vipul of Together AI
Latent Space
An open-sourced optimization technique for the attention mechanism in transformers, credited with contributing to better AI models.

Ep 18: Petaflops to the People — with George Hotz of tinycorp
Latent Space
Flash Attention is highlighted as an algorithmic trick that improves efficiency without adding compute, similar in spirit to Hotz's approach with tinygrad.

⚡️ Beyond Transformers with Power Retention
Latent Space
An optimized implementation of transformer attention, which Manifest AI's Vidril framework can match or outperform, especially on non-standard problem shapes.