MMLU (Massive Multitask Language Understanding) is a benchmark dataset referenced in discussions of model evaluation.
Mentioned in 1 video
Lex Fridman