arXiv cs.AI·19 May 2026

Minor First, Major Last: A Depth-Induced Implicit Bias of Sharpness-Aware Minimization

In three lines

A study of the implicit bias of Sharpness-Aware Minimization (SAM) on linear diagonal networks for binary classification. At depth L=1, both ℓ∞-SAM and ℓ2-SAM recover the ℓ2 max-margin classifier, just as gradient descent (GD) does. At depth L=2, ℓ2-SAM exhibits "sequential feature amplification": the predictor initially relies on minor coordinates and only later shifts to major ones, in contrast to GD.
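To make the setup concrete, here is a minimal sketch of one SAM step on a diagonal linear network. It is not the paper's code. It assumes the effective predictor is the elementwise product of the layer weights (beta = w_1 ⊙ ... ⊙ w_L), a mean logistic loss, and the standard two-pass SAM update with either an ℓ2 or an ℓ∞ perturbation. The function names, learning rate, and rho values are all hypothetical.

```python
import numpy as np

def loss_and_grad(W, X, y):
    """Mean logistic loss and gradient for a diagonal linear network.

    W has shape (depth, d). The effective linear predictor is assumed to be
    beta = W[0] * W[1] * ... * W[depth-1] (elementwise product of layers).
    """
    beta = W.prod(axis=0)
    margins = y * (X @ beta)
    loss = np.mean(np.logaddexp(0.0, -margins))
    # d/dm log(1 + exp(-m)) = -sigmoid(-m), written in a numerically stable form
    coef = -0.5 * (1.0 - np.tanh(0.5 * margins))
    grad_beta = (X * (coef * y)[:, None]).mean(axis=0)
    # Chain rule: gradient for layer l is grad_beta times the product of the other layers
    grad_W = np.stack([
        grad_beta * np.delete(W, l, axis=0).prod(axis=0)
        for l in range(W.shape[0])
    ])
    return loss, grad_W

def sam_step(W, X, y, lr=0.1, rho=0.05, norm="l2"):
    """One SAM update: ascend to a perturbed point, then descend with its gradient."""
    _, g = loss_and_grad(W, X, y)
    if norm == "l2":
        eps = rho * g / (np.linalg.norm(g) + 1e-12)  # l2-SAM perturbation
    elif norm == "linf":
        eps = rho * np.sign(g)                       # l-infinity-SAM perturbation
    else:
        raise ValueError("norm must be 'l2' or 'linf'")
    _, g_adv = loss_and_grad(W + eps, X, y)
    return W - lr * g_adv

if __name__ == "__main__":
    # Toy data (hypothetical): coordinate 0 is "major", coordinates 1-2 are "minor".
    X = np.array([[2.0, 0.5, 0.1], [-2.0, -0.4, -0.1]])
    y = np.array([1.0, -1.0])
    W = np.full((2, 3), 0.5)  # depth L=2
    for _ in range(100):
        W = sam_step(W, X, y, norm="l2")
    print("effective predictor:", W.prod(axis=0))
```

Setting `rho=0.0` reduces the step to plain gradient descent, which gives a baseline for comparing trajectories. Tracking `W.prod(axis=0)` over iterations is one way to see which coordinates the predictor relies on at a given time. This toy run is not claimed to reproduce the paper's results.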

Summary generated by Claude — human-verified
