LiveCodeBench
Software / App
A type of fresh evaluation that scrapes newly published web pages or GitHub repositories to build test sets dated after a language model's training cutoff, so the model cannot have seen the material during training.
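The date-based selection in the description can be illustrated with a minimal sketch. It is not the tool's actual implementation: the `Problem` record, its fields, the `select_fresh` helper, the sample URLs, and the cutoff date are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    """A scraped item; both fields are hypothetical, for illustration only."""
    source_url: str
    published: date


def select_fresh(problems, training_cutoff):
    """Keep only items published strictly after the model's training cutoff."""
    return [p for p in problems if p.published > training_cutoff]


# Made-up sample data and cutoff date.
scraped = [
    Problem("https://example.com/old-task", date(2023, 3, 1)),
    Problem("https://example.com/new-task", date(2024, 6, 15)),
]
fresh = select_fresh(scraped, training_cutoff=date(2023, 12, 31))
print([p.source_url for p in fresh])
```

Because the filter depends on the cutoff date, the same scraped pool would yield a different evaluation set for each model being tested.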