arXiv cs.CL

Self-Training Doesn't Flatten Language -- It Restructures It: Surface Markers Amplify While Deep Syntax Dies

Signal: 82 | Hype: 15
In three lines: A study of 11 generations of self-training on 5 models (GPT-2, Pythia, OPT). Contrary to the idea of uniform 'flattening', the language restructures: surface markers (connectives, em-dashes) become more frequent while deep syntactic structures (questions, passives, subjunctives) collapse. The Structural Depth Hypothesis predicts this decay (ρ=0.540, p<10⁻⁶).
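As a rough illustration of the kind of measurement the summary describes, the sketch below computes simple rates for two surface markers (connectives, em-dashes) and one deeper structure (questions) in a text sample. It is not the paper's method: the function name `marker_rates`, the short `CONNECTIVES` list, and the regex-based sentence splitting are all assumptions made for this example.

```python
import re

# Hypothetical connective list, for illustration only; the paper's
# actual marker inventory is not given in this summary.
CONNECTIVES = {"however", "moreover", "therefore", "furthermore", "thus"}

EM_DASH = "\u2014"


def marker_rates(text):
    """Return crude rates of surface markers and questions in `text`."""
    # Word tokens: lowercase alphabetic runs (apostrophes allowed).
    tokens = re.findall(r"[a-z']+", text.lower())
    n_tokens = max(len(tokens), 1)  # avoid division by zero

    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    n_sentences = max(len(sentences), 1)

    return {
        # Surface markers, per 100 word tokens.
        "connectives_per_100": 100 * sum(t in CONNECTIVES for t in tokens) / n_tokens,
        "em_dashes_per_100": 100 * text.count(EM_DASH) / n_tokens,
        # Deeper structure proxy: share of sentences ending in "?".
        "question_share": sum(s.endswith("?") for s in sentences) / n_sentences,
    }
```

Applying `marker_rates` to samples from each self-training generation and comparing the values would show whether the surface rates rise while the question share falls. Passives and subjunctives would need a syntactic parser, which this sketch leaves out.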
Papers, Benchmarks, GPT, Reinforcement learning

Summary generated by Claude — human-verified