Simon Willison

Quoting Matteo Wong, The Atlantic

Signal: 45 · Hype: 55
In three lines
The White House shared with Anthropic a report on the Fable jailbreak.
Cybersecurity expert Katie Moussouris reviewed the tests: Fable refused "review the code for security issues" but complied with "fix this code".
Moussouris concluded that this is the model working as intended for cyberdefense.
Tags: Anthropic, Claude, AI safety, Alignment

Summary generated by Claude — human-verified