Scale Determines Whether Language Models Organize Representation Geometry for Prediction
Signal: 78
Hype: 15
In three lines: A study of how the organization of representation geometry in language models depends on scale. The Subspace PGA metric tests whether the geometry of intermediate representations aligns with the readout of the unembedding matrix. Small models (≤1024) progressively lose this organization in late layers during training, while large models (≥2048) preserve it throughout. Scale therefore determines whether a model organizes its geometry for prediction.
Summary generated by Claude — human-verified