OmniDocBench

Study / Research

A popular document-understanding benchmark used to evaluate both frontier and open-source models. It is considered somewhat saturated and rigid for agent evaluation.

Mentioned in 1 video