
Ajla Karajko

DeepMind’s robots learn to think aloud

Google DeepMind has introduced Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, two new AI models that let robots reason before they act, turning visual and language input into coordinated movements.

With these models, robots can reason through multi-step tasks and explain their actions while performing them. For example, a robot can look up recycling instructions online, analyze them, and then physically sort waste according to the rules it found.

Unlike traditional robots that merely execute commands, these models generate an internal reasoning process in natural language, breaking complex tasks into a series of smaller, logical steps. Gemini Robotics-ER 1.5 manages strategy and calls digital tools, while Gemini Robotics 1.5 converts these plans into precise motor commands.
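
The division of labor described above can be illustrated with a toy sketch. Everything in it is hypothetical: the function names, the hard-coded "search" tool, and the string-based motor commands are stand-ins invented for illustration and are not DeepMind's API or actual model behavior. It only shows the shape of the idea, with one component that plans in natural language and calls a tool, and another that turns each step into an action.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One natural-language step produced by the planner."""
    description: str


def plan(task: str, tools: Dict[str, Callable[[str], str]]) -> List[Step]:
    """Hypothetical stand-in for the reasoning model:
    look up rules with a tool, then emit one step per rule."""
    rules = tools["search"](task)
    return [Step(f"{task}: {rule.strip()}") for rule in rules.split(";")]


def execute(step: Step) -> str:
    """Hypothetical stand-in for the action model:
    turn one planned step into a (fake) motor command."""
    return f"MOVE[{step.description}]"


def fake_search(query: str) -> str:
    # Hard-coded illustrative data, not real recycling rules from any source.
    return "paper -> blue bin; glass -> green bin"


def run(task: str) -> List[str]:
    """Plan the task, then log each reasoning step followed by its action."""
    tools = {"search": fake_search}
    log: List[str] = []
    for step in plan(task, tools):
        log.append(f"thinking: {step.description}")
        log.append(execute(step))
    return log


if __name__ == "__main__":
    for line in run("sort waste"):
        print(line)
```

Keeping the planner and the executor as separate functions mirrors the split the article describes: the plan stays readable as plain text, and the executor only ever sees one step at a time.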

This approach has shown top-tier results across 15 benchmark tests and works on a wide range of platforms, from dual-arm lab robots to humanoids.

DeepMind now offers the industry a software layer that can operate almost any type of robot and is ready to move from the lab into the real world. This marks a new phase in robot development, one in which machines not only react but also think, plan, and explain their decisions in real time.


In brief: Tech World Highlights

  • Elon Musk’s The Boring Company has reportedly started testing Tesla’s Full Self-Driving system in the Las Vegas Convention Center tunnels, which connect nearby hotels.
  • Researchers at the University of Queensland achieved a world first by growing fully functional human skin in the laboratory.
  • Apple filed a lawsuit against a former Apple Watch team member, accusing him of sharing trade secrets with Chinese tech giant Oppo.
  • Morgan Stanley analysts predict that AI will affect 90% of jobs in the U.S. and generate nearly $1 trillion in annual savings for companies.
  • Instagram is finally launching a dedicated iPad app that opens to the Reels feed by default, ending years of complaints about the lack of a tablet-optimized experience.


AI Trending Tools:

  • Co-STORM – Writing Wikipedia-style articles from scratch with AI-assisted research.
  • Hunyuan-A13B – Tencent’s new open-source model for hybrid reasoning.
  • Qwen VLo – Alibaba’s GPT-4o-like model for image generation and editing.
