OpenAI has officially taken its Realtime API out of beta, introducing the new gpt-realtime model for speech-to-speech interaction along with new developer capabilities, including image input support and integration with Model Context Protocol (MCP) servers.
The new gpt-realtime model adds capabilities such as recognizing non-verbal cues and switching languages automatically while keeping the conversation flowing naturally. In audio reasoning tests, it reached 82.8% accuracy, up 17.2 percentage points from its predecessor’s 65.6%.
In addition to the speech upgrade, OpenAI added support for MCP, enabling voice agents to connect with external data sources and tools without the need for complex custom integrations. gpt-realtime can now also process visual inputs such as photos and screenshots, giving voice agents the ability to reason about images alongside live conversation.
With these updates, OpenAI is signaling that it sees mainstream adoption of voice agents as only a matter of time. By combining more natural conversation, visual understanding, and flexible integrations, the company is positioning gpt-realtime as a key platform for businesses and developers that want to make voice assistants an integral part of customer support and personalized applications.
In brief: Tech World Highlights
- Toyoake, Japan, has proposed a non-binding ordinance recommending that residents limit personal smartphone use to two hours a day and that students stop using their phones in the evening.
- Google agreed to pay $30 million to settle a class-action lawsuit alleging it illegally collected personal data of children under 13 on YouTube without parental consent.
- Sony announced PlayStation 5 prices in the U.S. will rise by $50 per unit, mainly due to new U.S. tariffs on imported electronics.
- Great Western Railway reported that its battery-powered passenger train traveled 200 miles on a single charge, setting a new world record for the longest distance covered by a battery-electric train without recharging.
- Scientists created the most detailed genetic map of frailty in older adults, opening new pathways for anti-aging therapies through precise identification of genes linked to health risks.
Trending AI Tools:
- ChatGPT – OpenAI’s AI assistant, now featuring tools to detect signs of mental distress.
- Gemini Storybooks – Google’s AI feature that now generates illustrated, narrative picture books.
- AgentHub – A realistic sandbox for simulating and evaluating AI agents.