Open-ended design decisions
Concept
Mentioned in 1 video
A capability that ideal coding benchmarks should measure: a model's ability to make reasonable choices when a problem is underspecified.