Ajla Karajko

GPT-5 blows past doctors on medical exams

New results from Emory University show that GPT-5 achieved remarkable performance on medical exams, surpassing both the previous GPT-4o and even physicians in tasks requiring diagnostic and multimodal reasoning.

The model scored 95.84% accuracy on the MedQA test, nearly five percentage points above GPT-4o’s best result. Even more impressive, on complex multimodal tasks combining patient history and medical imaging, GPT-5 achieved a 70% success rate, 30 points higher than GPT-4o.

GPT-5 also outperformed pre-licensed healthcare workers by 24% in reasoning and 29% in comprehension on professional tests. The model demonstrated sophisticated diagnostic capabilities as well, such as correctly identifying rare conditions like Boerhaave syndrome from lab results and CT scans.

These results mark a significant shift: while GPT-4o was close to human-level performance, GPT-5 already significantly surpasses physicians on these tests. Experts note that we are approaching a point where not using AI in clinical practice could be considered a professional oversight. With the performance gap continuing to widen, AI is becoming an indispensable ally in the medicine of the future.


In brief: Tech World Highlights

  • Google announced a new Gemini-powered health assistant for Fitbit, capable of providing personalized fitness, sleep, and health advice based on user data.
  • Anthropic expanded availability of its agent-based programming tool Claude Code to Enterprise and Team packages, introducing new administrative controls for managing costs, policy settings, and more.
  • MIT’s NANDA initiative revealed that only 5% of enterprise AI implementations generate revenue, with knowledge gaps and poor integration hindering broader adoption of the technology.
  • OpenAI’s Sebastien Bubeck stated that GPT-5-pro can “prove new interesting mathematical theorems,” citing his use of the model to solve complex open problems.
  • Google product lead Logan Kilpatrick posted a banana emoji on X, hinting that the “nano-banana” photo-editing model being tested on LM Arena likely comes from Google.


Trending AI Tools:

  • Eleven Music API – Integration of high-quality music into products and workflows.
  • M3 Agent – ByteDance Seed’s multimodal agent with long-term memory.
  • Nemotron Nano 2 – Nvidia’s family of small, efficient reasoning models.
