GDP Eval
Software / App
Mentioned in 1 video
An evaluation that measures performance on real-world white-collar work, cited as a potential model for future coding benchmarks.