arXiv cs.AI

Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs

Signal
78
Hype
15
In three lines
A study shows that unlearning in LLMs merely suppresses information: minimal fine-tuning restores the original behavior. A representation-level analysis framework (PCA, CKA, Fisher information) reveals four forgetting regimes, distinguished by reversibility and catastrophicity. The study also identifies cases of seemingly irreversible targeted forgetting.
AI safety, Alignment, Evals, Papers

Summary generated by Claude — human-verified