Kimmy 2
Book
Mentioned in 1 video
A transformer model design cited in the context of hardware-aware architecture choices (number of attention heads, number of experts, etc.).