GDP Eval
Tool / Product
Mentioned in 1 video
An evaluation that measures performance on real-world white-collar work, cited as a potential model for future coding benchmarks.