GDP‑Val benchmark

Study / Research
Mentioned in 1 video

A broad benchmark that compares LLM performance with that of domain experts across many white‑collar tasks. It is cited as a reference point in discussions of AGI progress.