Humanity's Last Exam

Study / Research

A benchmark designed to test obscure topics, where Claude Mythos, with tools, achieved nearly two-thirds of the questions right, surpassing other frontier models.

Mentioned in 1 video