MLE-bench
Software / App
Machine Learning Engineering benchmark (MLE-bench), cited in the Deep Research or GPT-4o system card as a measure of progress towards model self-improvement.
Mentioned in 1 video