Variant of the GPQA benchmark; discussed in the context of GPT-5.2's ranking.
Mentioned in 1 video
AI Explained