
HealthBench Professional

Health

HealthBench Professional is the physician-graded tier of OpenAI’s HealthBench, scoring model responses to realistic clinical scenarios against rubrics written by practicing doctors. It probes medical reasoning, safety and communication quality.

Model scores

  • Fable 5: 66.0%
  • Opus 4.8: 56.9%
  • GPT-5.5: 51.8%
  • Opus 4.7
  • Gemini 3.1 Pro
  • Mythos Preview: 64.7%

Official source: HealthBench (OpenAI)
