DECOR: Auditing LLM Deception via Information Manipulation Theory
Signal: 78
Hype: 25
In three lines
DECOR is a multi-agent framework for auditing deception in LLMs. It decomposes contexts into atomic informational units and scores four manipulation dimensions (including omission, focus-shifting, and meaning-obscuring). Tested on 15 frontier models, it reports state-of-the-art deception detection on single-turn and multi-turn benchmarks, along with interpretable manipulation profiles.
Summary generated by Claude — human-verified