Big Bench
Study / Research · Mentioned in 1 video
A benchmark with many diverse tasks, some of which are very abstract and unrealistic; IBM generally avoids these for core model evaluation.