HarmBench

Software / App

A safety benchmark that tests whether models refuse to generate harmful content when prompted.

Mentioned in 1 video