Humanity's Last Exam
Study / Research
A benchmark designed to test obscure topics, on which Claude Mythos, with tools, answered nearly two-thirds of the questions correctly, surpassing other frontier models.
Mentioned in 2 videos
Videos Mentioning Humanity's Last Exam

Claude Mythos: Highlights from 244-page Release
AI Explained
Claude Mythos, with tools, answered nearly two-thirds of the benchmark's questions correctly, surpassing other frontier models.

New Claude Opus 4.8: 15 Things You May’ve Missed
AI Explained
A reasoning benchmark where Opus 4.8 reportedly excels.