Gemma 9B

Software / App
Mentioned in 1 video

A smaller model used in experiments to show that generating more data from it and distilling that data into a larger model can be more effective than training the larger model on its own generated data.