Sparsely Gated Mixture of Experts
Concept
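A neural network architecture composed of multiple 'expert' sub-networks, with a gating mechanism that routes each input to only a small subset of them. Because just the selected experts run for a given input, model capacity can grow without a proportional increase in per-input computation, and the experts can be spread across distributed compute resources.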
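The routing idea can be illustrated with a minimal sketch in plain Python. It assumes a linear gate that scores every expert, keeps the top k scores, and normalizes them with a softmax. The names `top_k_gate`, `moe_forward`, `make_expert`, and the toy weights are hypothetical and chosen only for illustration. The sketch leaves out the noise terms and load-balancing losses that practical systems typically add.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a short list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(logits, k):
    # Keep the k highest-scoring experts; their weights sum to 1.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    return dict(zip(top, weights))

def moe_forward(x, gate_weights, experts, k=2):
    # One linear gate score per expert (assumed gate form).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    gates = top_k_gate(logits, k)
    out = [0.0] * len(x)
    for idx, g in gates.items():
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, gates

def make_expert(scale):
    # Toy "expert": scales its input. A real expert would be a sub-network.
    return lambda x: [scale * xi for xi in x]

experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, gates = moe_forward([2.0, 1.0], gate_weights, experts, k=2)
print(sorted(gates))  # indices of the experts that were actually run
```

With four experts and k=2, only two expert functions are called per input. That is the sparsity the definition refers to: adding more experts increases total parameters but not the number evaluated for any single input.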
Mentioned in 1 video
