HellaSwag
Software / App
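A benchmark dataset designed to test commonsense reasoning in LLMs: for each context, the model must pick the most plausible sentence completion from a set of candidates whose wrong endings were adversarially generated.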
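As a rough illustration of how a multiple-choice completion benchmark of this kind is typically scored, the sketch below picks the highest-scoring ending for each item and reports accuracy. The field names (`ctx`, `endings`, `label`), the `Scorer` type, and the sample item are assumptions made up for this example, and `toy_score` is only a placeholder for a real model's plausibility score (for example, a log-likelihood).

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical scorer type: (context, ending) -> higher means more plausible.
Scorer = Callable[[str, str], float]


def pick_ending(context: str, endings: Sequence[str], score: Scorer) -> int:
    """Return the index of the highest-scoring candidate ending."""
    scores = [score(context, ending) for ending in endings]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(items: List[Dict], score: Scorer) -> float:
    """Fraction of items where the top-scoring ending is the labeled one."""
    if not items:
        return 0.0
    hits = sum(
        pick_ending(item["ctx"], item["endings"], score) == item["label"]
        for item in items
    )
    return hits / len(items)


# Made-up item for illustration only; not taken from the dataset.
toy_items = [
    {
        "ctx": "A man pours water into a kettle and switches it on.",
        "endings": [
            "He waits for the water to boil.",
            "He folds the kettle into his pocket.",
            "The kettle begins to read a newspaper.",
            "He paints the water blue with a brush.",
        ],
        "label": 0,
    }
]


def toy_score(context: str, ending: str) -> float:
    # Placeholder, NOT a real plausibility measure: prefers shorter endings.
    # A real evaluation would query a language model here.
    return -float(len(ending))


if __name__ == "__main__":
    print(accuracy(toy_items, toy_score))
```

Keeping the scorer as a plain function argument means the same loop works whether the score comes from a language model or from a simple baseline.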
Mentioned in 1 video