arXiv cs.LG

The Discrete-Log Clock: How a Transformer Learns Modular Multiplication

Signal: 82 · Hype: 15
In three lines
Researchers show that a transformer trained on modular multiplication computes with a multiplicative character transform rather than the standard DFT. On a·b mod 113, the spectrum becomes sparse (Gini 0.58 vs 0.07), with 96.9% of MLP neurons tuned to a single frequency. The learned algorithm is a "Discrete-Log Clock" that reduces multiplication to addition in discrete-log space.
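The reduction from multiplication to addition can be illustrated with a small sketch. It assumes only that 113 is prime, so its nonzero residues form a cyclic group of order 112 with some generator g, and a·b ≡ g^(log a + log b mod 112). The names `find_primitive_root`, `EXP`, `LOG`, and `mul_via_dlog` are hypothetical, chosen for this example; this is not the paper's code, and the transformer presumably realizes the same identity with learned periodic features rather than lookup tables.

```python
P = 113  # prime modulus from the summary


def find_primitive_root(p):
    """Return the smallest generator of the multiplicative group mod prime p."""
    order = p - 1
    # Collect the distinct prime factors of p - 1 by trial division.
    factors, n, d = set(), order, 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    # g generates the group iff g^(order/q) != 1 for every prime factor q.
    for g in range(2, p):
        if all(pow(g, order // q, p) != 1 for q in factors):
            return g
    raise ValueError("no primitive root found; is p prime?")


G = find_primitive_root(P)
EXP = [pow(G, k, P) for k in range(P - 1)]   # exponent k -> G^k mod P
LOG = {x: k for k, x in enumerate(EXP)}      # nonzero residue x -> log_G(x)


def mul_via_dlog(a, b, p=P):
    """Multiply a and b mod p by adding their discrete logs mod p - 1."""
    # Zero has no discrete log, so it is handled as a special case here
    # (an assumption of this sketch, not a claim about the model).
    if a % p == 0 or b % p == 0:
        return 0
    return EXP[(LOG[a % p] + LOG[b % p]) % (p - 1)]
```

For example, `mul_via_dlog(5, 7)` returns the same value as `(5 * 7) % 113`. In the log domain the task becomes modular addition mod 112, which is the setting where the original "clock" construction applies.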
Reasoning · Papers · Evals

Summary generated by Claude — human-verified