Epoch AI

Organization

Creator of the FrontierMath benchmark, on which OpenAI's o3 model scored around 25%; also noted for exposing potential LLM scheming.

Mentioned in 1 video