arXiv cs.LG

Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos

Signal: 72
Hype: 15
In three lines
A mean-field theory that treats dropout as a perturbation of critical signal propagation at the edge of chaos.
The authors derive scaling laws and show that smooth activations and ReLU-like activations form distinct universality classes.
Front-loaded dropout schedules reduce test loss at no extra computational cost.
Papers, Reasoning, Evals

Summary generated by Claude — human-verified