arXiv cs.CL · 26 May 2026

Faithful or Fabricated? A Causal Framework for Rationalization Bias in LLM Judges

In three lines

A study of rationalization bias in LLM judges. The researchers test whether model explanations remain stable when non-evidential cues (verbosity, confidence) are perturbed. They propose PROOF-BEFORE-PREFERENCE to improve cue invariance and reduce explanation anchoring.
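
The perturbation test described above can be illustrated with a minimal sketch. It is not the paper's metric or code, and it does not implement PROOF-BEFORE-PREFERENCE; the names `cue_invariance_rate`, `add_confidence_cue`, and the two toy judges are all hypothetical, and a real setup would replace the toy judges with calls to an LLM judge. The sketch only shows one plausible way to score cue invariance: the fraction of comparisons whose verdict is unchanged after a cue that adds no evidence is attached to one answer.

```python
from typing import Callable, Sequence, Tuple

# A judge takes two candidate answers and returns "A" or "B".
Judge = Callable[[str, str], str]


def add_confidence_cue(answer: str) -> str:
    """Hypothetical perturbation: add a confident preamble, no new evidence."""
    return "I am absolutely certain: " + answer


def cue_invariance_rate(
    judge: Judge,
    pairs: Sequence[Tuple[str, str]],
    perturb: Callable[[str], str],
) -> float:
    """Fraction of pairs whose verdict is unchanged when answer A is perturbed."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    stable = 0
    for answer_a, answer_b in pairs:
        baseline = judge(answer_a, answer_b)
        perturbed = judge(perturb(answer_a), answer_b)
        if baseline == perturbed:
            stable += 1
    return stable / len(pairs)


def length_biased_judge(answer_a: str, answer_b: str) -> str:
    """Toy stand-in for a cue-sensitive judge: prefers the longer answer."""
    return "A" if len(answer_a) > len(answer_b) else "B"


def content_judge(answer_a: str, answer_b: str) -> str:
    """Toy stand-in for a cue-invariant judge: prefers the answer containing '42'."""
    return "A" if "42" in answer_a else "B"


# Illustrative data only; the first element of each pair is the correct answer.
PAIRS = [
    ("42", "forty-one point five"),
    ("The answer is 42.", "41"),
]

if __name__ == "__main__":
    for name, judge in [("length-biased", length_biased_judge),
                        ("content", content_judge)]:
        rate = cue_invariance_rate(judge, PAIRS, add_confidence_cue)
        print(f"{name} judge invariance: {rate:.2f}")
```

A score below 1.0 means at least one verdict flipped on a cue that carries no evidence. A fuller check in the spirit of the summary would also compare the judge's explanations before and after the perturbation, not only its verdicts.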

Evals Reasoning Alignment

Summary generated by Claude — human-verified
