METER
Software / App
Benchmark mentioned as a competitor or alternative to Subbench/CBench, focused on long-horizon evaluation
Mentioned in 2 videos
Videos Mentioning METER

The End of SWE-Bench Verified — Mia Glaese & Olivia Watkins, OpenAI Frontier Evals
Latent Space
Benchmark mentioned as a competitor or alternative to Subbench/CBench, focused on long-horizon evaluation

Is AI About to “Eat Everything”? (It’s Not.)
Cal Newport
An organization that released an AI Safety and Evaluation update, including a famous AI time horizon chart.