Simon Willison·11 June 2026

Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude

In three lines: Anthropic reverses a secret policy that would have degraded Claude Fable/Mythos performance for AI researchers without notification. The company acknowledges making 'the wrong tradeoff' and now makes safeguards visible.

Tags: Claude, Anthropic, AI safety, Alignment

Summary generated by Claude — human-verified
