Probing for Representation Manifolds in Superposition
Signal
72
Hype
25
In three lines
A supervised method called Manifold Probe discovers representation manifolds in superposition within neural networks. Tested on Llama 2-7b, it identifies linear manifolds for time and space and demonstrates causal control by steering model completions about the release years of movies and songs.
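To make "supervised probe plus steering" concrete, here is a minimal sketch of a generic linear probe fitted by ridge regression, followed by a shift along the probe direction. It is not the Manifold Probe method from the source, whose details are not given in this summary. Everything in it is an assumption for illustration: the synthetic activations stand in for real Llama 2 hidden states, and the names `probe`, `steer`, `d_model`, and `lam` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for hidden activations (NOT real Llama 2 states):
# a scalar feature such as a release year is written along one hidden
# direction, with isotropic noise added in every dimension.
d_model, n = 32, 2000
true_dir = rng.normal(size=d_model)
true_dir /= np.linalg.norm(true_dir)
years = rng.uniform(1950, 2020, size=n)
z = (years - years.mean()) / years.std()  # standardized target
acts = np.outer(z, true_dir) + 0.1 * rng.normal(size=(n, d_model))

# Supervised linear probe: ridge regression from activations to the target.
lam = 1e-2  # hypothetical regularization strength
w = np.linalg.solve(acts.T @ acts + lam * np.eye(d_model), acts.T @ z)


def probe(h):
    """Read the standardized feature value off an activation vector."""
    return h @ w


def steer(h, delta):
    """Shift h along the probe direction so the readout moves by delta."""
    return h + delta * w / (w @ w)


h = acts[0]
h_steered = steer(h, 1.0)
cosine = (w @ true_dir) / np.linalg.norm(w)
```

In a real experiment the steered activation would be written back into the model's residual stream and the change in completions measured; this sketch only checks that the probe readout moves by the requested amount and that the learned direction roughly matches the planted one.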
Summary generated by Claude — human-verified