Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos
Signal: 72
Hype: 15
In three lines: A mean-field theory that treats dropout as a perturbation of critical signal propagation at the edge of chaos. The authors derive scaling laws and show that smooth activations and ReLU-like activations form distinct universality classes. Front-loaded dropout schedules reduce test loss at no extra computational cost.
Summary generated by Claude — human-verified