Chinchilla paper
Paper
A paper on compute-optimal training. The discussion notes that its notion of optimality applies specifically to pre-training compute, and contrasts this with a shift toward inference-compute optimality.
Mentioned in 2 videos
Videos Mentioning Chinchilla paper

State of the Art: Training 70B LLMs on 10,000 H100 clusters
Latent Space
A research paper on compute-optimal scaling laws for language models, referenced in the discussion of CARBS's ability to learn similar scaling laws for various hyperparameters.

2024 Year in Review: The Big Scaling Debate, the Four Wars of AI, Top Themes and the Rise of Agents
Latent Space
A paper on compute-optimal training. The discussion notes that its notion of optimality applies specifically to pre-training compute, and contrasts this with a shift toward inference-compute optimality.