MMLU
Study / Research
MMLU (Massive Multitask Language Understanding) is a benchmark dataset referenced in discussions of model evaluation.
Mentioned in 3 videos
Videos Mentioning MMLU

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
Lex Fridman
MMLU is mentioned as a benchmark in a discussion of model evaluation.

A Comprehensive Overview of Large Language Models - Latent Space Paper Club
Latent Space
Massive Multitask Language Understanding, a broad benchmark covering diverse subjects to evaluate LLM knowledge.

SmartGPT: Major Benchmark Broken - 89.0% on MMLU + Exam's Many Errors
AI Explained