Ajla Karajko

OpenAI tests AI against human workers across 44 jobs

OpenAI has introduced a new benchmark called GDPval, designed to evaluate how well AI models match human professionals on real-world tasks. The test covered 44 professions, ranging from finance to healthcare, and included top-performing models such as GPT-5, Claude Opus 4.1, Gemini 2.5, and Grok 4. Experts graded the completed assignments on quality and accuracy.

In total, 1,320 tasks were analyzed, all prepared by professionals with an average of 14 years of experience across nine economic sectors. According to the results, Claude Opus 4.1 achieved the highest overall score, with a win rate of 47.6%, and excelled in visual and presentation-based tasks, while GPT-5 led in technical precision and analytical reasoning.

OpenAI reported that model performance tripled in just 15 months, from GPT-4o to GPT-5, which highlights the rapid pace of AI development and the models' growing ability to perform complex professional tasks.

GDPval shows that, although AI models are not yet ready to fully replace humans, they are already achieving professional-level competence in several fields. If this pace continues, the next major leap could arrive within months, opening a new chapter in how we work and collaborate with intelligent systems.


In brief: Tech World Highlights

  • Telo, a California-based startup and maker of the compact MT1 electric truck priced at $41,000, has raised $20 million in Series A funding, co-led by Tesla co-founder Marc Tarpenning and other investors.
  • NASA is considering the use of a nuclear strike as a last-resort defense against asteroid 2024 YR4, which currently has a 4% chance of colliding with the Moon in 2032.
  • Valon is hiring a Forward Deployed Engineer with a salary range of $130,000 to $230,000 plus equity. The role works onsite with corporate clients to translate business needs into code. Locations: New York, San Francisco, and Seattle.
  • Spotify has announced new AI safety measures, including spam filters, anti-impersonation policies, and AI content disclosure requirements, revealing that over 75 million AI spam tracks have already been removed.
  • Meta has launched Vibes, a new AI video feed within the Meta AI app, allowing users to discover, create, and remix short AI videos with visual effects, music, and styles.


AI Trending Tools:

  • Kaggle Game Arena – A benchmark platform for testing LLMs in strategic, continuously evolving games.
  • ChatGPT – OpenAI’s AI assistant, now equipped with tools for detecting signs of mental distress.
  • Gemini Storybooks – Google’s AI now creates narrative picture books with built-in storytelling features.
