MobileLLM
Study / Research
A paper by Meta that studies language models with fewer than 1 billion parameters. It finds that, at this scale, depth matters more than width, and that grouped-query attention (GQA) helps.
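To make the GQA point concrete, here is a minimal sketch that counts the weights in one attention layer when several query heads share each key/value head. The function name and the sizes used are illustrative assumptions, not figures taken from the paper, and biases are ignored.

```python
def attention_params(d_model: int, n_heads: int, n_kv_heads: int) -> int:
    """Count projection weights in one attention layer (no biases).

    n_kv_heads == n_heads is standard multi-head attention;
    n_kv_heads < n_heads is grouped-query attention (GQA).
    """
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    if n_heads % n_kv_heads != 0:
        raise ValueError("n_heads must be divisible by n_kv_heads")

    head_dim = d_model // n_heads
    q_proj = d_model * n_heads * head_dim      # one query projection per head
    k_proj = d_model * n_kv_heads * head_dim   # keys shared across a group
    v_proj = d_model * n_kv_heads * head_dim   # values shared across a group
    o_proj = n_heads * head_dim * d_model      # output projection
    return q_proj + k_proj + v_proj + o_proj


# Hypothetical sizes, chosen only for illustration.
mha = attention_params(d_model=576, n_heads=9, n_kv_heads=9)
gqa = attention_params(d_model=576, n_heads=9, n_kv_heads=3)
print(mha, gqa)  # 1327104 884736
```

Under these assumed sizes, sharing key/value heads removes about a third of the attention weights, a budget that a small model could spend elsewhere, for example on additional layers.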
Mentioned in 1 video