loader image

Ajla Karajko

Anthropic gives Claude the power to ‘hang up’

Anthropic has just equipped its Claude Opus 4 and 4.1 models with a feature that allows them to terminate a conversation if they determine the request is harmful or violent. This represents one of the first applications of the “model wellbeing” concept in commercial chatbots.

The new functionality only activates after Claude attempts to redirect the topic and provide helpful responses but fails to do so. The model can then independently end interactions involving sensitive content such as minors, terrorism, or violence. During testing, Opus 4 exhibited patterns similar to “distress” when handling such queries, autonomously terminating simulated harmful conversations.

It is important to note that users still have full access to their accounts — they can immediately start a new conversation or edit previous messages. Anthropic has also implemented protective mechanisms that prevent the model from ending conversations when the user shows signs of self-harm or imminent danger to others — in these cases, the model remains active and provides support.

Anthropic is one of the few companies investing seriously in researching AI wellbeing. Although science still lacks a clear answer to what “consciousness” in artificial intelligence would actually mean, these steps may one day be seen as important first attempts to establish boundaries and protective mechanisms for a technology unprecedented in human history.


In brief: Tech World Highlights

  • Eight Sleep announced a new USD 100 million funding round, with plans to develop the world’s first “Sleep Agent” for proactive recovery and sleep optimization.
  • Runway released a series of updates to its platform, including the addition of third-party models and visual enhancements in Chat Mode.
  • LM Arena introduced BiomedArena, a new evaluation track for testing and ranking LLM performance on real biomedical research.
  • OpenAI launched ChatGPT Go in India, a new subscription option at ₹399 per month (USD 4.60), significantly cheaper than the existing Plus plan, which costs around USD 20 per month.
  • Hon Hai Precision (Foxconn) will operate SoftBank’s factory in the U.S., launching the first production center for the Stargate data center worth USD 500 billion in collaboration with OpenAI and Oracle.


AI Trending Tools:

  • Wonda – Wondercraft’s AI agent for creating video and audio content.
  • April – AI tool for interacting with your inbox and calendar.
  • Qwen-Image-Edit – Qwen’s new model for image editing.

Podijeli objavu:

Preporučeni blogovi