Sweet Bench

Concept

A benchmark for evaluating the reasoning capabilities of language models, on which fine-tuning with reasoning data led to results that outperformed OpenAI o1.

Mentioned in 2 videos
