CAREBench: Evaluating LLMs' Emotion Understanding by Assessing Cognitive Appraisal Reasoning
Signal: 72
Hype: 25
In three lines
CAREBench is a benchmark that evaluates LLMs' emotion understanding through cognitive appraisal reasoning. It provides complete inferential chain annotations from first- and third-person perspectives. Tests on 6 models show that stronger models match humans on some tasks but fall short on appraisal reasoning and positive emotion recognition.
Summary generated by Claude — human-verified