Butterbench
Software / App
A benchmark that evaluates how well AI agents carry out real-world domestic tasks when controlling a robot, including aspects such as social intelligence and common sense.
Mentioned in 1 video