Chinchilla paper

Book

A paper on compute-optimal training. The discussion notes that it addresses compute optimality for pre-training specifically, and highlights a shift toward optimizing for inference compute.

Mentioned in 2 videos

Videos Mentioning Chinchilla paper

State of the Art: Training 70B LLMs on 10,000 H100 clusters

Latent Space

A research paper that established compute-optimal scaling laws for language models, referenced in the discussion of CARBS's ability to learn similar scaling laws for various hyperparameters.

2024 Year in Review: The Big Scaling Debate, the Four Wars of AI, Top Themes and the Rise of Agents

Latent Space

A paper on compute-optimal training. The discussion notes that it addresses compute optimality for pre-training specifically, and highlights a shift toward optimizing for inference compute.