From One-Pass SGD to Data Reuse: Mini-Batch Scaling Laws in Sketched Linear Regression
Signal: 72
Hype: 08
In three lines
Theoretical study of scaling laws for sketched linear regression trained with mini-batches.
Compares one-pass SGD against multi-pass SGD, both with and without replacement.
Key results: variance of order O(min(M, (T_eff*γ)^(1/a)) / (B*T_eff)); a 1/B variance reduction in the multi-pass without-replacement regime; zero fluctuation at B=N.
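To make the 1/B dependence in the variance expression concrete, here is a minimal sketch that evaluates min(M, (T_eff*γ)^(1/a)) / (B*T_eff) directly. It treats every symbol as a plain positive number and drops the constants hidden by the O(·). The function name and all numerical values are hypothetical, chosen only for illustration and not taken from the source.

```python
def variance_scale(M, T_eff, gamma, a, B):
    """Evaluate min(M, (T_eff*gamma)**(1/a)) / (B*T_eff).

    Constants hidden by the O(.) notation are ignored, so only ratios
    between two calls are meaningful, not the absolute values.
    """
    effective_dim = min(M, (T_eff * gamma) ** (1.0 / a))
    return effective_dim / (B * T_eff)


# Hypothetical values, for illustration only.
base = variance_scale(M=1000, T_eff=10_000, gamma=0.1, a=2.0, B=1)
batched = variance_scale(M=1000, T_eff=10_000, gamma=0.1, a=2.0, B=8)

# With everything else held fixed, the ratio equals the batch-size factor.
print(base / batched)
```

Holding T_eff fixed while varying B isolates the 1/B factor; whether T_eff itself changes with B depends on definitions in the source that this summary does not give.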
Summary generated by Claude — human-verified