AgentBench

Study / Research

A paper on evaluating LLM agents, noted because some of the examples in its appendix were found to contain suboptimal solutions.