LMSYS Chatbot Arena
Software / App
A platform for evaluating chatbot performance through crowdsourced human preferences, used to assess LLMs like Llama 3.
Mentioned in 2 videos
Videos Mentioning LMSYS Chatbot Arena

Training Llama 2, 3 & 4: The Path to Open Source AGI — with Thomas Scialom of Meta AI
Latent Space
A platform for evaluating chatbot performance through crowdsourced human preferences, used to assess LLMs like Llama 3.

Personal benchmarks vs HumanEval - with Nicholas Carlini of DeepMind
Latent Space
A platform for evaluating LLMs whose prompts are often single-turn, in contrast to real-world usage, which is frequently multi-turn.