No Free Swap: Protocol-Dependent Layer Redundancy in Transformers
Signal: 72
Hype: 15
In three lines
A study showing that two protocols for evaluating layer redundancy in transformers, replacement and interchange, yield divergent results when identifying layers to prune. On Pythia, Qwen3-8B, and Llama-3.1-8B, the gap between the two protocols markedly changes which layers appear safe to remove, even when both are scored with the same KL evaluator.
Summary generated by Claude — human-verified