December 9, 2025 - Ajla Karajko

Poetry prompts can bypass AI safety guardrails

New research from the Italian Icaro Lab reveals that dangerous prompts can be disguised as poetry, tricking even the most advanced AI models into generating harmful content, with some systems falling for the trick every time. Icaro Lab tested 25 top models from leading companies such as OpenAI, Google, and Anthropic. Poetic prompts achieved an […]

OpenAI trains models to ‘confess’ when they cheat

OpenAI published new research on a technique called “Confessions”, which trains models to generate a second, fully honest output — in which they report rule violations, shortcuts, or deceptive solutions they used. After the main response, the model writes a separate “confession report” detailing all instructions received and whether they were actually followed. These confessions […]

Anthropic puts Claude to work as a research interviewer

Anthropic launched Anthropic Interviewer, a Claude-powered tool that conducts and analyzes qualitative interviews at scale — debuting with a study of 1,250 professionals on how AI affects their work. The tool covers the entire research process: preparing questions, conducting 10–15 minute interviews, and grouping themes for human analysts. In the first study, 86% of respondents […]

Anthropic surveys its own engineers on AI’s impact

Anthropic released an internal study based on responses from 132 engineers, revealing how AI tools have significantly changed daily work within the company: productivity has increased, but so have concerns about skill loss, reduced mentorship, and career uncertainty. Engineers report that they now use Claude for 60% of their tasks, with an estimated productivity […]

Leaked doc provides window into Claude’s ‘soul’

An internal document titled “Soul,” which describes the personality, ethical principles, and self-conception of the Claude model, surfaced publicly after a researcher extracted it from Claude Opus 4.5. Anthropic confirmed its authenticity and that it was used in training. The document establishes a hierarchy of Claude’s priorities: safety, ethics, internal company […]

Anthropic preps for IPO race with OpenAI

According to the Financial Times, Anthropic is beginning preparations for an IPO as early as 2026 — hiring the same law firm that led the IPOs of Google and LinkedIn, while investors are pushing for Claude’s creator to hit the market before OpenAI. Anthropic has reportedly engaged Wilson Sonsini, known for taking some of the […]

Google to build data centers in space in 2027

Google CEO Sundar Pichai unveiled an ambitious plan: under the “Suncatcher” project, the company will launch the first solar satellite data centers into space by 2027, potentially revolutionizing how AI infrastructure is built. Google plans to launch two prototypes in 2027 in partnership with Planet to test AI hardware under space conditions. The satellites will […]

China overtakes the U.S. in open AI economy

A new MIT and Hugging Face study, analyzing 2.2 billion downloads on the Hugging Face platform, reveals a major shift in the global AI landscape: Chinese companies have ended the dominance that American tech firms held for years. The study shows that Chinese AI developers have surpassed their American counterparts in download share, 17.1% versus 15.8%. […]

‘Aristotle’ AI cracks 30-year math problem

Aristotle, Harmonic’s AI system, independently solved a 30-year-old version of Erdős Problem #124 — what researchers are calling the first true step into the era of “vibe proving” mathematics. The system reached the result in six hours and then formally verified the proof in Lean in just one minute. This was made possible by the […]
