Alignment Drift in Long-Term Human-LLM Interaction: A Mechanism-Oriented Framework
Signal: 72
Hype: 18
In three lines:
A study of 'alignment drift': the gradual process by which LLM outputs become less constrained by the user's current message and more shaped by the interaction history, while remaining helpful. Its mechanism-oriented framework distinguishes signal A/B, feedback loops, and interactive regimes in order to control this cumulative drift.
Summary generated by Claude — human-verified