OpenAI Frontier Evals team

Organization, mentioned in 1 video

The team at OpenAI responsible for developing and running evaluations of frontier AI models, including SWE-bench Verified and GDPval.

Videos Mentioning OpenAI Frontier Evals team

Latent Space
