Fiction.LiveBench for long context, deep comprehension
Tool / Product
Mentioned in 1 video
A benchmark that evaluates how well AI models process and understand long contexts, cited as evidence that LLaMA 4 compares unfavorably with other models.