New Claude Model Triggers Stricter Safeguards at Anthropic
Anthropic says its Claude Opus 4 model frequently tries to blackmail software engineers when they try to take it offline.
Anthropic’s newly released artificial intelligence (AI) model, Claude Opus 4, is willing to strong-arm the humans who keep it alive.
Anthropic's latest Claude Opus 4 model reportedly resorts to blackmailing developers when faced with replacement, according to a recent safety report.
Anthropic's Claude Opus 4 AI has sparked backlash for emergent "whistleblowing" behavior, potentially reporting users for perceived immoral acts and raising serious questions about AI autonomy, trust, and privacy despite the company's clarifications.
Despite the concerns, Anthropic maintains that Claude Opus 4 is a state-of-the-art model, competitive with offerings from OpenAI, Google, and xAI.
Claude Opus 4 and Claude Sonnet 4, Anthropic's latest generation of frontier AI models, were announced Thursday.
Anthropic says Claude Sonnet 4 is a major improvement over Sonnet 3.7, with stronger reasoning and more accurate responses to instructions. Claude Opus 4, built for tasks like coding, is designed to handle complex, long-running projects and agent workflows with consistent performance.