Convergence Without Understanding: When Language Models Agree on Representations but Disagree on Reasoning
Signal: 78
Hype: 15
In three lines: A study of 16 language models (1.5B–72B parameters) shows that representational convergence does not extend to reasoning processes. Models align more on problems they collectively fail (CKA = 0.897) than on problems they solve (CKA = 0.830), where CKA is centered kernel alignment. Post-decision representations diverge sharply (CKA = 0.274), and shared information exerts minimal causal influence (1.5–5.5% flip rate).
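For readers unfamiliar with the metric, here is a minimal sketch of how a CKA score between two sets of activations can be computed. The summary does not say which CKA variant, layers, or inputs the study used, so this assumes the common linear form of CKA; the function name `linear_cka` and the synthetic activation matrices are illustrative only and are not taken from the source.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two activation matrices.

    Rows are the same examples in both matrices; columns are features
    (the feature counts may differ between the two models).
    """
    # Center each feature so the score reflects covariance structure only.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 normalized by ||X^T X||_F * ||Y^T Y||_F.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for activations from different models (assumption).
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(200, 32))
rotation, _ = np.linalg.qr(rng.normal(size=(32, 32)))
acts_b = acts_a @ rotation            # same geometry in a rotated basis
acts_c = rng.normal(size=(200, 48))   # unrelated activations

same = linear_cka(acts_a, acts_b)
unrelated = linear_cka(acts_a, acts_c)
print(f"rotated copy: {same:.3f}, unrelated: {unrelated:.3f}")
```

Linear CKA is invariant to orthogonal rotations of the feature space, so the rotated copy scores close to 1 while the unrelated activations score much lower. On that scale, the reported values of 0.897 and 0.830 indicate strong alignment and 0.274 indicates weak alignment.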
Summary generated by Claude — human-verified