S-bench

Study / Research

A benchmark for AI models. Co-op Labs' Genie model, a fine-tune of GPT-4o, achieved high scores on it, but the lab withheld the model's reasoning traces, a competitive choice that OpenAI later mirrored.
