A Data-Efficient Path to Multilingual LLMs: Language Expansion via Post-training PARAM$\Delta$ Integration into Upcycled MoE
Signal: 75
Hype: 25
In three lines
A method for extending an LLM to new languages without repeating the costly alignment phase. It converts the dense model into a Mixture-of-Experts architecture with language-specific experts, then transfers alignment capabilities through post-training delta fusion. This improves performance on the new languages while preserving the model's original abilities.
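As a rough illustration of the two steps named above (upcycling a dense model into experts, then fusing a post-training delta), here is a minimal sketch in plain Python. It is not the paper's implementation. The parameter names (`attn.w`, `ffn.w`), the `expertN.` key prefix, the helper names, and the choice to add the same delta to every expert copy are all assumptions made for illustration; router weights and real tensor shapes are not modeled.

```python
# Hypothetical sketch: weights are dicts mapping a parameter name to a list of floats.

def param_delta(base, post):
    """Per-parameter difference between post-trained (aligned) and base weights."""
    return {name: [p - b for p, b in zip(post[name], base[name])] for name in base}

def upcycle_ffn(dense, num_experts):
    """Copy each dense FFN tensor into num_experts expert slots; keep other tensors."""
    moe = {}
    for name, w in dense.items():
        if name.startswith("ffn."):  # assumed naming convention
            for e in range(num_experts):
                moe[f"expert{e}.{name}"] = list(w)
        else:
            moe[name] = list(w)
    return moe

def fuse_delta(moe, delta):
    """Add the delta to every MoE tensor derived from the same dense tensor."""
    fused = {}
    for name, w in moe.items():
        # Strip the assumed "expertN." prefix to find the source dense tensor.
        src = name.split(".", 1)[1] if name.startswith("expert") else name
        fused[name] = [x + d for x, d in zip(w, delta[src])]
    return fused

base = {"attn.w": [1.0, 2.0], "ffn.w": [3.0, 4.0]}
post = {"attn.w": [1.5, 2.0], "ffn.w": [3.0, 4.5]}  # stand-in for the aligned model

moe = upcycle_ffn(base, num_experts=2)
moe["expert1.ffn.w"] = [3.25, 4.0]  # stand-in for training on a new language

fused = fuse_delta(moe, param_delta(base, post))
```

In this toy setup the second expert keeps its language-specific change while both experts and the shared attention weights pick up the alignment delta, which is the intuition behind transferring alignment without a separate alignment run.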
Summary generated by Claude — human-verified