Chinchilla paper

Book

A paper on compute-optimal training. As noted, its notion of compute optimality refers specifically to pre-training compute, which highlights a shift toward optimizing for inference compute.

Mentioned in 2 videos