$\Psi$-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues
Signal: 72
Hype: 28
In three lines
Ψ-Bench is a benchmark assessing LLMs' ability to persuade realistic users through conversation. 10 frontier models tested on 3 real-world scenarios. Access to user profiles yields 18.24% performance gain. Code available.
Summary generated by Claude — human-verified