benchmark scaling laws

Concept

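A key innovation in the Llama 3 paper: a method for predicting downstream benchmark performance from the training compute budget (FLOPs). The paper describes two stages: first relate the model's negative log-likelihood of the correct answer on a benchmark to training FLOPs, then relate that negative log-likelihood to task accuracy.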
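
A minimal sketch of how such a prediction could be set up, not the paper's actual code. It assumes a straight-line fit of negative log-likelihood (NLL) against log10 of training FLOPs and a logistic link from NLL to accuracy with asymptotes at 0 and 1 (a real benchmark would need a chance-level floor). The function names and all data points are hypothetical, made up for illustration.

```python
import numpy as np


def fit_nll_vs_compute(flops, nll):
    """Stage 1 (assumed form): least-squares line nll = a + b * log10(flops)."""
    b, a = np.polyfit(np.log10(flops), nll, deg=1)
    return a, b


def fit_accuracy_vs_nll(nll, accuracy):
    """Stage 2 (assumed form): logit(accuracy) = c + d * nll."""
    logit = np.log(accuracy / (1.0 - accuracy))
    d, c = np.polyfit(nll, logit, deg=1)
    return c, d


def predict_accuracy(flops, stage1, stage2):
    """Chain both fits: compute -> predicted NLL -> predicted accuracy."""
    a, b = stage1
    c, d = stage2
    nll = a + b * np.log10(flops)
    return 1.0 / (1.0 + np.exp(-(c + d * nll)))


# Synthetic "small run" measurements, generated from known parameters
# so the fit can be checked. These are NOT results from any real model.
small_flops = np.array([1e19, 1e20, 1e21, 1e22])
small_nll = 3.0 - 0.1 * np.log10(small_flops)
small_acc = 1.0 / (1.0 + np.exp(-(4.0 - 5.0 * small_nll)))

stage1 = fit_nll_vs_compute(small_flops, small_nll)
stage2 = fit_accuracy_vs_nll(small_nll, small_acc)

# Extrapolate three orders of magnitude beyond the largest small run.
predicted = predict_accuracy(1e25, stage1, stage2)
print(f"predicted accuracy at 1e25 FLOPs: {predicted:.3f}")
```

Splitting the prediction in two keeps the first fit on a smooth quantity (NLL), which tends to extrapolate more reliably than accuracy itself; accuracy can sit near chance for small models and then rise sharply.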

Mentioned in 1 video