arXiv cs.AI

Transformers Can Implement Preconditioned Richardson Iteration for In-Context Gaussian Kernel Regression

Signal: 78 · Hype: 15
In three lines
Softmax-attention transformers can implement preconditioned Richardson iteration for in-context Gaussian kernel regression. The authors construct a single-head transformer with O(log(1/ε)) blocks that achieves ε-accurate prediction on prompts of length N. In the construction, softmax attention produces a Gaussian-kernel operator, and ReLU MLP layers perform local scalar arithmetic.
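For readers unfamiliar with the numerical method named in the summary, here is a minimal sketch of preconditioned Richardson iteration applied to Gaussian kernel regression, in plain Python rather than as a transformer. Several pieces are assumptions for illustration and are not taken from the paper: the ridge term `ridge`, the 1-D inputs, the helper names, and the choice of preconditioner (the inverse row sums of the regularized kernel matrix, which keeps the step size at 1). Each loop iteration is loosely analogous to one block of the construction, with the number of iterations needed growing like log(1/ε) for a fixed problem.

```python
import math


def gaussian_kernel(x, z, bandwidth):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def build_system(xs, ridge, bandwidth):
    """Return A = K + ridge * I, where K is the Gaussian kernel matrix on xs."""
    n = len(xs)
    return [
        [gaussian_kernel(xs[i], xs[j], bandwidth) + (ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def matvec(A, v):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def richardson(A, y, num_steps):
    """Preconditioned Richardson iteration for A @ alpha = y.

    Update rule: alpha <- alpha + P @ (y - A @ alpha), with the assumed
    diagonal preconditioner P = diag(1 / row_sum(A)).
    """
    inv_row_sums = [1.0 / sum(row) for row in A]
    alpha = [0.0] * len(y)
    for _ in range(num_steps):
        residual = [yi - ai for yi, ai in zip(y, matvec(A, alpha))]
        alpha = [a + p * r for a, p, r in zip(alpha, inv_row_sums, residual)]
    return alpha


def predict(x_query, xs, alpha, bandwidth):
    """Kernel regression prediction at x_query from fitted coefficients."""
    return sum(gaussian_kernel(x_query, xi, bandwidth) * a
               for xi, a in zip(xs, alpha))


# Toy in-context "prompt": 21 (x, y) pairs sampled from sin(3x).
xs = [i / 10.0 for i in range(-10, 11)]
ys = [math.sin(3.0 * x) for x in xs]
A = build_system(xs, ridge=0.5, bandwidth=0.3)
alpha = richardson(A, ys, num_steps=2000)
print(predict(0.25, xs, alpha, bandwidth=0.3))
```

The row-sum preconditioner is used here because the Gaussian kernel matrix has positive entries, so the preconditioned matrix has eigenvalues in (0, 1] and the iteration contracts without tuning a step size. The error shrinks geometrically, which is the behavior behind an O(log(1/ε)) iteration count.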
Reasoning · Papers · Benchmarks

Summary generated by Claude — human-verified