GDP Val Eval
Software / App
Mentioned in 1 video
An evaluation created by OpenAI to measure how well agents perform real-world white-collar work; it could serve as a model for future coding evaluations.