K

Kimmy 2

Book

A transformer model design cited in the context of hardware-aware architecture choices, such as the number of attention heads and experts.

Mentioned in 1 video