Infinity Bench QA
Tool / Product
Mentioned in 1 video
A benchmark for assessing long-context capabilities, on which Llama 3.1 reportedly outperforms GPT-4, GPT-4o, and Claude 3.5 Sonnet.