OmniDocBench
Study / Research
A popular document-understanding benchmark on which both frontier and open-source models are evaluated; it is considered somewhat saturated and too rigid for evaluating agents.
Mentioned in 1 video