LLM as a Judge

Concept

A technique in which a large language model is used to evaluate the outputs of another AI model. It is discussed extensively in the context of AI evals.
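A minimal sketch of the idea in Python, offered as an illustration and not as a reference implementation. The prompt wording, the 1-5 scale, the `judge` function, and the `fake_judge` stand-in are all assumptions made here. In practice `judge_model` would wrap a call to whatever LLM serves as the judge.

```python
import re
from typing import Callable, Optional

# Hypothetical grading prompt; the wording and the 1-5 scale are assumptions.
JUDGE_PROMPT = (
    "You are grading an answer produced by another model.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Rate the answer from 1 (poor) to 5 (excellent). Reply as 'Score: <n>'."
)


def judge(question: str, answer: str,
          judge_model: Callable[[str], str]) -> Optional[int]:
    """Ask judge_model to grade an answer.

    Returns the 1-5 score, or None if the reply cannot be parsed.
    """
    reply = judge_model(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"Score:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def fake_judge(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return "Score: 4"


if __name__ == "__main__":
    print(judge("What is 2 + 2?", "4", fake_judge))
```

Passing the judge in as a plain callable keeps the sketch independent of any particular model provider, and returning `None` on an unparseable reply makes it explicit that a judge's output may not follow the requested format.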

Mentioned in 1 video