arXiv cs.LG

Geometric Asymmetry in MoE Specialization: Functional Decorrelation and Representational Overlap

Signal: 78 · Hype: 15
In three lines: A study of the geometric structure of Mixture-of-Experts (MoE) architectures using a Jacobian-PCA-Grassmann framework. Analysis of Mistral and Qwen models reveals an asymmetry: experts are strongly decorrelated functionally, yet their representations partially overlap. Sparse top-k routing strengthens the functional separation.
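
The summary names the framework without defining it, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each expert is represented by a matrix of samples (for example, rows of Jacobians or expert outputs; how those are obtained is not specified here), that PCA reduces each matrix to a low-dimensional subspace, and that two subspaces are compared on the Grassmann manifold through their principal angles. The function names `pca_basis`, `principal_angles`, and `grassmann_distance` are hypothetical.

```python
import numpy as np


def pca_basis(samples, k):
    """Return a (d x k) orthonormal basis for the top-k principal
    directions of `samples`, an (n x d) matrix with one sample per row."""
    centered = samples - samples.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def principal_angles(basis_a, basis_b):
    """Principal angles (radians) between the column spans of two
    orthonormal bases of the same ambient dimension."""
    # Singular values of A^T B are the cosines of the principal angles.
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def grassmann_distance(basis_a, basis_b):
    """Geodesic distance on the Grassmannian: the 2-norm of the
    principal angles. 0 means identical subspaces."""
    return float(np.linalg.norm(principal_angles(basis_a, basis_b)))


if __name__ == "__main__":
    # Synthetic stand-ins for two experts' sample matrices (assumption:
    # real inputs would come from the model, not from random noise).
    rng = np.random.default_rng(0)
    expert_a = rng.normal(size=(200, 16))
    expert_b = rng.normal(size=(200, 16))
    basis_a = pca_basis(expert_a, k=4)
    basis_b = pca_basis(expert_b, k=4)
    print("Grassmann distance:", grassmann_distance(basis_a, basis_b))
```

Under these assumptions, a small distance between two experts' subspaces would correspond to overlapping representations, and a distance near its maximum of sqrt(k)·π/2 to well-separated ones.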
Mistral · Qwen · Papers · Reasoning

Summary generated by Claude — human-verified