Memorization Dynamics of Fill-in-the-Middle Pretraining
Signal: 75
Hype: 15
In three lines
A study of verbatim memorization during Fill-in-the-Middle (FIM) pretraining on Llama 3.2. FIM recovers more short or partial spans than standard left-to-right (LTR) training, with extraction growing linearly with the number of repetitions. Suffix context is insufficient: memorization remains anchored in the prefix context.
Summary generated by Claude — human-verified