LiveCodeBench

Software / App

A type of fresh ("live") evaluation that scrapes newly published web pages or GitHub repositories and builds evaluations from material dated after a language model's training cutoff, so the model cannot have seen the test content during training.
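
The core idea, keeping only material published after a model's training cutoff, can be sketched as a simple date filter. This is a minimal illustration under stated assumptions, not the benchmark's actual pipeline: the `Item` type, its fields, the `fresh_items` helper, the sample entries, and the cutoff date are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Item:
    """A scraped candidate; the fields are hypothetical, for illustration only."""
    source: str
    published: date


def fresh_items(items, training_cutoff):
    """Keep only items published strictly after the model's training cutoff."""
    return [item for item in items if item.published > training_cutoff]


# Made-up candidates and an assumed cutoff (not any real model's).
candidates = [
    Item("repo-a", date(2023, 5, 1)),
    Item("page-b", date(2024, 2, 10)),
    Item("repo-c", date(2024, 6, 3)),
]
cutoff = date(2023, 12, 31)
fresh = fresh_items(candidates, cutoff)
```

Because the filter depends on the cutoff, each model would be evaluated on its own slice of the collected material; a later cutoff leaves fewer eligible items.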

Mentioned in 1 video