Butterbench

Software / App

A benchmark that evaluates how well AI agents controlling a robot perform real-world household tasks, including aspects such as social intelligence and common sense.
