Back to feed
arXiv cs.CL·

Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography

Signal
82
Hype
15
In three linesSparse autoencoders decompose GPT-2 XL and Llama-3.1-8B into 16K-32K interpretable features per layer. Semantic features alone recover 94% of peak encoding performance (r=0.285) and align with known cortical semantic organization (ρ=0.72, p<0.001). Results generalize across English, Chinese, and French.
Read source
Your take?
PapersGPTLlamaEvals

Summary generated by Claude — human-verified