MBPP

Study / Research

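MBPP (Mostly Basic Python Problems) is a benchmark of roughly 1,000 short, entry-level Python programming tasks used to evaluate LLMs on code generation. It measures functional correctness: a generated solution counts as correct only if it passes the test cases attached to the task.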
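
Functional correctness means a solution is judged by running it against tests, not by comparing its text to a reference answer. A minimal sketch of that kind of check follows; the helper `passes_tests` and the `add_two` task are hypothetical illustrations, not part of the benchmark or its official harness.

```python
def passes_tests(candidate_src: str, tests: list[str]) -> bool:
    """Return True if the candidate code passes every assert-style test."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)   # define the candidate function
        for test in tests:
            exec(test, namespace)        # each test is a single assert statement
    except Exception:
        # Any failed assertion, syntax error, or runtime error counts as a failure.
        return False
    return True


# Hypothetical task: "Write a function add_two that returns the sum of two numbers."
tests = [
    "assert add_two(1, 2) == 3",
    "assert add_two(-1, 1) == 0",
    "assert add_two(0, 0) == 0",
]
good = "def add_two(a, b):\n    return a + b"
bad = "def add_two(a, b):\n    return a - b"

print(passes_tests(good, tests))
print(passes_tests(bad, tests))
```

A real evaluation would run model-generated code in a sandbox with a timeout, since `exec` on untrusted code is unsafe.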

Mentioned in 2 videos