HealthBench

Software / App

A benchmark relevant to clinical diagnosis. GPT-5.5 outperforms GPT-5.4 on it, and a version of GPT-5.4 specialized for clinicians also performs strongly.

Mentioned in 1 video