Graduate-level reasoning

Software / App

A benchmark assessing AI models' capabilities in complex reasoning typically associated with graduate-level studies.
