arXiv cs.LG

Feature Geometry of LoRA Adapters: A Sparse Autoencoder Analysis of Representational Divergence in Fine-Tuned Language Models

Signal 72 · Hype 15
In three lines: A study of LoRA-induced representation geometry using sparse autoencoders (SAEs) on Gemma-2-9B. The researchers observe weak geometric alignment between LoRA feature dictionaries and pretrained SAEs, suggesting that LoRA creates distinct representational structures in the residual stream.
Fine-tuning · AI safety · Papers

Summary generated by Claude — human-verified