Geometric Asymmetry in MoE Specialization: Functional Decorrelation and Representational Overlap
Signal: 78
Hype: 15
In three lines
A study of the geometric structure of Mixture-of-Experts (MoE) architectures using a Jacobian-PCA-Grassmann framework. Analysis of Mistral and Qwen reveals an asymmetry: experts are strongly decorrelated functionally, yet their representations partially overlap. Sparse top-k routing strengthens the functional separation.
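The summary does not spell out how the Jacobian-PCA-Grassmann framework is computed, so the following is only a minimal sketch of one plausible reading: take a Jacobian per expert, keep its top principal directions, and compare the resulting subspaces through principal angles on the Grassmann manifold. The function names, the rank `r`, and the random matrices standing in for expert Jacobians are all assumptions made for illustration; the source's actual procedure may differ.

```python
import numpy as np


def top_subspace(jacobian, r):
    """Orthonormal basis of the top-r left singular subspace of a Jacobian.

    This is the PCA step under the assumption that output-space
    directions (left singular vectors) are the ones being compared.
    """
    u, _, _ = np.linalg.svd(jacobian, full_matrices=False)
    return u[:, :r]


def principal_angles(a, b):
    """Principal angles (radians) between the spans of orthonormal a and b."""
    s = np.linalg.svd(a.T @ b, compute_uv=False)
    # Clip guards against values marginally outside [-1, 1] from round-off.
    return np.arccos(np.clip(s, -1.0, 1.0))


def grassmann_distance(a, b):
    """Geodesic distance on the Grassmannian: the 2-norm of the principal angles."""
    return float(np.linalg.norm(principal_angles(a, b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for two experts' Jacobians (output_dim x input_dim).
    j_expert_0 = rng.standard_normal((16, 32))
    j_expert_1 = rng.standard_normal((16, 32))

    basis_0 = top_subspace(j_expert_0, r=4)
    basis_1 = top_subspace(j_expert_1, r=4)
    print("Grassmann distance:", grassmann_distance(basis_0, basis_1))
```

Under this reading, a distance near zero means two experts act along nearly the same directions, while a distance near `sqrt(r) * pi / 2` means their dominant directions are close to orthogonal.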
Summary generated by Claude — human-verified