HellaSwag
Study / Research
Mentioned in 1 video
A widely used NLP benchmark for commonsense sentence completion, mentioned as one of the public evaluations that IMB has reviewed and cleaned for ambiguity and data contamination.