Humanity's Last Exam
Software / App
A benchmark designed to test AI models on a broad set of highly demanding tasks, with the implication that a model passing it would have reached artificial general intelligence (AGI).
Mentioned in 1 video