arXiv cs.CL

A Data-Efficient Path to Multilingual LLMs: Language Expansion via Post-training PARAM$\Delta$ Integration into Upcycled MoE

Signal: 75
Hype: 25
In three lines: A method for expanding LLMs to new languages without a costly alignment phase. It converts a dense model into a Mixture-of-Experts architecture with language-specific experts, then transfers alignment capabilities via post-training delta fusion. It improves performance on the new languages while preserving the model's original abilities.
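
To make the "upcycle, then fuse a post-training delta" idea concrete, here is a minimal sketch. It assumes the delta is the elementwise difference between post-trained and base weights, and that each expert starts as a copy of the dense feed-forward weights; the summary does not spell out either detail. All function names, parameter names, and numbers are hypothetical, and weights are plain Python lists rather than real tensors.

```python
# Hypothetical sketch: names, shapes, and values are illustrative, not from the paper.

def param_delta(post, base):
    """Elementwise difference between post-trained and base weights."""
    return {name: [p - b for p, b in zip(post[name], base[name])] for name in base}

def upcycle(dense, num_experts):
    """Copy dense FFN weights into each expert; other weights stay shared."""
    moe = {}
    for name, w in dense.items():
        if name.startswith("ffn."):
            for e in range(num_experts):
                moe[f"expert{e}.{name}"] = list(w)
        else:
            moe[name] = list(w)
    return moe

def fuse(moe, delta):
    """Add the delta to every MoE weight derived from the matching dense weight."""
    fused = {}
    for name, w in moe.items():
        # Strip the assumed "expertN." prefix to recover the dense source name.
        source = name.split(".", 1)[1] if name.startswith("expert") else name
        d = delta.get(source)
        fused[name] = [x + dx for x, dx in zip(w, d)] if d is not None else list(w)
    return fused

base = {"attn.w": [1.0, 2.0], "ffn.w": [0.5, -0.5]}   # dense base model
post = {"attn.w": [1.5, 2.0], "ffn.w": [0.5, 0.0]}    # its post-trained version

delta = param_delta(post, base)
moe = upcycle(base, num_experts=2)

# Stand-in for continued training of one expert on new-language data.
moe["expert1.ffn.w"] = [0.75, -0.25]

fused = fuse(moe, delta)
print(fused)
```

In this toy setup the second expert keeps its new-language update and still receives the same post-training shift as the first, which is the intuition behind skipping a separate alignment phase.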
Fine-tuning

Summary generated by Claude — human-verified