arXiv cs.LG

MAAT: Multi-phase Adapter-Aware Targeted Unlearning

Signal: 78 · Hype: 15
In three lines: 5WBENCH, a balanced 5,000-sample benchmark across the 5W categories, reveals that unlearning methods fail on causal (Why) questions. MAAT, a three-phase framework operating on LoRA weights, combines gradient-projected ascent, SVD rank pruning, and KL hidden-state repair to achieve both high forgetting and high retention on causal knowledge.
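The summary does not say how the gradient projection in the first phase is defined. As an illustration only, the sketch below assumes one common reading: the forget-set gradient has its component along the retain-set gradient removed, and the parameters then take an ascent step along what is left. The helper names (`project_out`, `ascent_step`), the toy vectors, and the learning rate are hypothetical and are not taken from the paper.

```python
# Illustrative sketch, NOT the paper's implementation.
# Assumption: "gradient-projected ascent" means ascending the forget loss
# along the part of its gradient that is orthogonal to the retain gradient.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def project_out(g_forget, g_retain, eps=1e-12):
    """Remove from g_forget its component along g_retain."""
    denom = dot(g_retain, g_retain)
    if denom < eps:
        # Retain gradient is (numerically) zero: nothing to project out.
        return list(g_forget)
    coeff = dot(g_forget, g_retain) / denom
    return [gf - coeff * gr for gf, gr in zip(g_forget, g_retain)]

def ascent_step(params, g_forget, g_retain, lr=0.1):
    """One gradient-ascent step on the forget loss along the projected direction."""
    direction = project_out(g_forget, g_retain)
    return [p + lr * d for p, d in zip(params, direction)]

# Toy values standing in for flattened adapter parameters and gradients.
params = [0.5, -0.2, 0.1]
g_forget = [1.0, 2.0, 0.0]
g_retain = [1.0, 0.0, 0.0]

projected = project_out(g_forget, g_retain)
new_params = ascent_step(params, g_forget, g_retain)
```

Under this assumption the update is orthogonal to the retain gradient, so to first order it leaves the retain loss unchanged while increasing the forget loss.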
Fine-tuning · AI safety · Alignment · Benchmarks · Papers

Summary generated by Claude — human-verified