evals
Concept, mentioned in 1 video
Short for evaluations: tests used to assess agent performance and to identify failure modes such as hallucination and output-formatting issues. The speaker aims to make evals a source of joy rather than a pain by using MCP servers.
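
As a rough illustration of the two failure modes named above, here is a minimal sketch of an eval in Python. It is not the speaker's setup and does not involve MCP servers. The function names, the JSON output shape with an `answer` field, and the sample cases are all assumptions made for this example. The "grounded" check is only a crude proxy for hallucination: it passes when the answer appears verbatim in the supplied context.

```python
import json
from typing import Optional


def parse_answer(output: str) -> Optional[str]:
    """Return the 'answer' string if output is a JSON object holding one, else None."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    answer = data.get("answer")
    return answer if isinstance(answer, str) and answer else None


def check_format(output: str) -> bool:
    """Formatting eval: output must be a JSON object with a non-empty 'answer' string."""
    return parse_answer(output) is not None


def check_grounded(output: str, context: str) -> bool:
    """Crude hallucination proxy: the answer must appear verbatim in the context."""
    answer = parse_answer(output)
    return answer is not None and answer.lower() in context.lower()


def run_evals(cases: list) -> list:
    """Run both checks on every case and collect one result row per case."""
    return [
        {
            "name": case["name"],
            "format_ok": check_format(case["output"]),
            "grounded": check_grounded(case["output"], case["context"]),
        }
        for case in cases
    ]


# Hypothetical agent outputs, hard-coded so the example runs offline.
CONTEXT = "The capital of France is Paris."
CASES = [
    {"name": "good", "context": CONTEXT, "output": '{"answer": "Paris"}'},
    {"name": "hallucinated", "context": CONTEXT, "output": '{"answer": "Lyon"}'},
    {"name": "bad_format", "context": CONTEXT, "output": "Paris"},
]

if __name__ == "__main__":
    for row in run_evals(CASES):
        print(row)
```

In practice the hard-coded outputs would be replaced by real agent responses, and the verbatim-match check by something less brittle. The structure stays the same: a fixed set of cases, one check per failure mode, and a pass/fail row per case.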
