Machine Learning Evaluation
3 video summaries
Videos About Machine Learning Evaluation

The End of SWE-Bench Verified — Mia Glaese & Olivia Watkins, OpenAI Frontier Evals
Latent Space

Best of 2024 in Agents (from #1 on SWE-Bench Full, Prof. Graham Neubig of OpenHands/AllHands)
Latent Space

Stanford CS547 HCI Seminar | Spring 2026 | HCI and Human-Centered AI for Digital Health
Stanford Online