SmolLM2
Software / App
Mentioned in 1 video
A series of small language models, including a 1.7B-parameter model that outperforms Llama 1B and Qwen 2.5; trained on 11 trillion tokens.