arXiv cs.AI

No Free Swap: Protocol-Dependent Layer Redundancy in Transformers

Signal: 72
Hype: 15
In three lines: A study showing that two protocols for evaluating layer redundancy in transformers, replacement and interchange, give divergent answers about which layers to prune. On Pythia, Qwen3-8B, and Llama-3.1-8B, the choice of protocol changes which layers appear safe to remove, even when both protocols are scored with the same KL evaluator.
Papers · Benchmarks · Reasoning

Summary generated by Claude — human-verified