Terminal Bench

Study / Research

A benchmark for measuring progress in raw intelligence on coding tasks. It assesses capabilities relevant to software engineers.
