MMLU
Software / App
A benchmark cited as an example of the 'PhD++ problems' that current AI models are now surpassing, in contrast to the ARC benchmarks, which ordinary people can solve.
Mentioned in 1 video