arXiv cs.LG·25 May 2026

Anytime Training with Schedule-Free Spectral Optimization

In three lines: SF-NorMuon, a schedule-free spectral optimizer, matches or exceeds tuned AdamW on 125M- and 772M-parameter language models without requiring a predefined learning-rate schedule. The authors also prove a stationarity guarantee and identify weight decay as essential for long-horizon stability.

Reinforcement learning Benchmarks Papers

Summary generated by Claude — human-verified
