GGML
Software / App
Mentioned in 2 videos
A library that implements ideas similar to Flash Attention, runnable on CPU and Mac.
Videos Mentioning GGML

FlashAttention-2: Making Transformers 800% faster AND exact
Latent Space
A library that implements ideas similar to Flash Attention, runnable on CPU and Mac.

Ep 18: Petaflops to the People — with George Hotz of tinycorp
Latent Space
GGML is a framework focused on Apple Silicon (M1). George Hotz initially considered it but then decided to pursue a more general approach.