Back to feed
arXiv cs.LG·

One Mask to Rule Them All: On Hidden Facts after Editing and How to Find Them

Signal
75
Hype
15
In three linesKnowledge editing methods ROME and MEMIT modify transformer MLP weights. Authors identify a common subset of weights targeted across diverse edits using a binary mask that reverses 80% of edits on training set and 70% on test set. The mechanism suppresses rather than overwrites knowledge, explaining why changes fail to propagate to related facts.
Read source
Your take?
PapersReasoningAI safetyAlignment

Summary generated by Claude — human-verified