The Discrete-Log Clock: How a Transformer Learns Modular Multiplication
Signal: 82
Hype: 15
In three lines
Researchers show that a transformer trained on modular multiplication uses a multiplicative character transform rather than the standard DFT. On a·b mod 113, the spectrum becomes sparse (Gini 0.58 vs 0.07), with 96.9% of MLP neurons tuned to a single frequency. The learned algorithm is a "Discrete-Log Clock" that reduces multiplication to addition in discrete-log space.
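To make the discrete-log idea concrete, here is a minimal Python sketch of the underlying number theory, not of the model itself. It assumes the modulus 113 from the summary; the helper names (`find_primitive_root`, `mul_via_dlog`, `character`) are hypothetical, and the generator is found by brute force rather than taken from the source. It covers only nonzero residues, since 0 has no discrete logarithm.

```python
import cmath

P = 113  # modulus quoted in the summary


def find_primitive_root(p):
    """Brute-force the smallest generator of the multiplicative group mod prime p."""
    for g in range(2, p):
        seen = set()
        x = 1
        for _ in range(p - 1):
            x = (x * g) % p
            seen.add(x)
        if len(seen) == p - 1:  # g hits every nonzero residue
            return g
    raise ValueError("no primitive root found")


G = find_primitive_root(P)

# EXP[k] = g^k mod p, and LOG is its inverse: LOG[g^k mod p] = k.
EXP = [pow(G, k, P) for k in range(P - 1)]
LOG = {v: k for k, v in enumerate(EXP)}


def mul_via_dlog(a, b):
    """Multiply nonzero residues by adding their discrete logs mod p-1."""
    return EXP[(LOG[a] + LOG[b]) % (P - 1)]


def character(k, a):
    """Multiplicative character chi_k(a) = exp(2*pi*i * k * log_g(a) / (p-1))."""
    return cmath.exp(2j * cmath.pi * k * LOG[a] / (P - 1))
```

Because `LOG[a*b mod P]` equals `LOG[a] + LOG[b]` modulo `P - 1`, each character satisfies `character(k, a*b % P) == character(k, a) * character(k, b)` up to floating-point error. This is the same structure that additive Fourier modes have for modular addition, carried over to multiplication.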
Summary generated by Claude — human-verified