MLU
Software / App
A benchmark, considered flawed but fascinating, that covers aptitude and knowledge across various domains.
Mentioned in 2 videos
Videos Mentioning MLU

The Ultimate Guide to Prompting - with Sander Schulhoff from LearnPrompting.org
Latent Space
A benchmark used to test the efficacy of role prompting; in Sander's experiments, the 'idiot prompt' outperformed the 'genius prompt'.

Gemini 2.5 Pro - It’s a Darn Smart Chatbot … (New Simple High Score)
AI Explained
A benchmark, considered flawed but fascinating, that covers aptitude and knowledge across various domains.