FlashAttention

Software / App

An optimized implementation of the transformer attention operation. Manifest AI's Vidril framework can match or outperform it, especially on non-standard problem shapes.

Mentioned in 4 videos