Kimmy 2

Book

Mentioned in 1 video

A transformer model design cited in the context of hardware-aware architecture choices, such as the number of attention heads and experts.