SWE-bench
Study / Research
A benchmark score published by Anthropic showing that agent performance improves from 70% to 80% when sampling techniques are used.
Mentioned in 1 video