A Negative Result on Cross-Model Activation Transfer in a Pythia Multi-Hop Setting
Signal: 72
Hype: 15
In three lines
A study of activation transfer between language models (Pythia-160M to Pythia-410M). A linear translation layer aligns the two models' hidden states closely (cosine similarity 0.97), but injecting the translated activations at inference time does not improve downstream performance. The result is negative: offline representational alignment is not sufficient for useful causal communication between models.
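To make the alignment step concrete, here is a minimal sketch of fitting a linear translation layer between two sets of hidden states and scoring it by held-out cosine similarity. It is not the study's code: the data is synthetic, the dimensions are arbitrary, and the names `fit_linear_translator` and `mean_cosine` are hypothetical helpers introduced only for illustration. It assumes an ordinary least-squares fit with a bias term, which may differ from how the translation layer in the source was trained.

```python
import numpy as np


def fit_linear_translator(src, tgt):
    """Least-squares fit of W, b such that src @ W + b approximates tgt."""
    # Append a column of ones so the bias is learned jointly with W.
    src_aug = np.hstack([src, np.ones((src.shape[0], 1))])
    sol, *_ = np.linalg.lstsq(src_aug, tgt, rcond=None)
    return sol[:-1], sol[-1]


def mean_cosine(a, b):
    """Mean row-wise cosine similarity between two equally shaped arrays."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(num / den))


# Synthetic stand-ins for hidden states from a smaller and a larger model.
# Sizes are illustrative assumptions, not the real model widths.
rng = np.random.default_rng(0)
d_src, d_tgt, n = 32, 48, 2000
src = rng.normal(size=(n, d_src))
true_map = rng.normal(size=(d_src, d_tgt))
tgt = src @ true_map + 0.1 * rng.normal(size=(n, d_tgt))

# Fit on the first 1500 rows, evaluate on the remaining 500.
W, b = fit_linear_translator(src[:1500], tgt[:1500])
held_out_cos = mean_cosine(src[1500:] @ W + b, tgt[1500:])
print(f"held-out mean cosine: {held_out_cos:.3f}")
```

A high score here measures offline alignment only. Whether the larger model can actually use the translated activations when they are injected during a forward pass is a separate, causal question, and that is the step the summarized study reports as failing.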
Summary generated by Claude — human-verified