arXiv cs.LG

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

Signal: 78 · Hype: 18
In three lines: A comparative study of RL and SFT on Qwen2.5-3B-Instruct. Reinforcement learning (RL) preserves the base model's internal circuits better than supervised fine-tuning (SFT), which adapts faster but destroys more prior capabilities. Proposed metric: differential circuit vulnerability, measured at the attention-head level.
Reinforcement learning · Fine-tuning · Papers

Summary generated by Claude — human-verified