Model Evaluation
7 video summaries
Videos About Model Evaluation

Measuring Exponential Trends Rising (in AI) — Joel Becker, METR
Latent Space

Training Llama 2, 3 & 4: The Path to Open Source AGI — with Thomas Scialom of Meta AI
Latent Space

State of the Art: Training 70B LLMs on 10,000 H100 clusters
Latent Space

The Agent Reasoning Interface: Claude, ChatGPT Canvas, Tasks, Operator — with Karina Nguyen, OpenAI
Latent Space

⚡️Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect
Latent Space
[Paper Club] SWE-Bench [OpenAI Verified/Multimodal] + MLE-Bench with Jesse Hu
Latent Space

Nuts and Bolts of Applying Deep Learning (Andrew Ng)
Lex Fridman