Model Evaluation
7 video summaries
Videos About Model Evaluation

Measuring Exponential Trends Rising (in AI) — Joel Becker, METR
Latent Space

Training Llama 2, 3 & 4: The Path to Open Source AGI — with Thomas Scialom of Meta AI
Latent Space

State of the Art: Training 70B LLMs on 10,000 H100 clusters
Latent Space

The Agent Reasoning Interface: Claude, ChatGPT Canvas, Tasks, Operator — with Karina Nguyen, OpenAI
Latent Space

⚡️Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect
Latent Space
[Paper Club] SWE-Bench [OpenAI Verified/Multimodal] + MLE-Bench with Jesse Hu
Latent Space

Nuts and Bolts of Applying Deep Learning (Andrew Ng)
Lex Fridman