arXiv cs.AI·19 May 2026

Transformers Can Implement Preconditioned Richardson Iteration for In-Context Gaussian Kernel Regression

In three lines: Softmax-attention transformers can implement preconditioned Richardson iteration for in-context Gaussian kernel regression. The authors construct a single-head transformer with O(log(1/ε)) blocks that achieves ε-accurate prediction on prompts of length N. In the construction, softmax attention produces a Gaussian-kernel operator, and ReLU MLP layers perform local scalar arithmetic.
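For readers unfamiliar with the numerical method named above, the following is a minimal NumPy sketch of preconditioned Richardson iteration applied to Gaussian kernel regression. It is not the paper's transformer construction. The ridge term `ridge`, the diagonal row-sum preconditioner, the bandwidth, and all function names are assumptions chosen for illustration, since the summary does not specify them.

```python
import numpy as np


def gaussian_kernel(X, Z, bandwidth=1.0):
    """Gaussian kernel matrix with entries exp(-||x_i - z_j||^2 / (2 h^2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def richardson_kernel_regression(X, y, x_query, ridge=1.0, bandwidth=1.0,
                                 num_steps=500):
    """Solve (K + ridge * I) alpha = y by preconditioned Richardson iteration.

    Assumption (not from the source): the preconditioner is the inverse of
    the row sums of A = K + ridge * I. Because A is symmetric positive
    definite with nonnegative entries, this choice makes the iteration
    converge linearly.
    """
    K = gaussian_kernel(X, X, bandwidth)
    A = K + ridge * np.eye(len(X))
    precond = 1.0 / A.sum(axis=1)          # diagonal preconditioner
    alpha = np.zeros(len(X), dtype=float)
    for _ in range(num_steps):
        residual = y - A @ alpha           # current residual r_t = y - A alpha_t
        alpha = alpha + precond * residual  # alpha_{t+1} = alpha_t + P r_t
    # Predict at the query points with the fitted coefficients.
    predictions = gaussian_kernel(x_query, X, bandwidth) @ alpha
    return predictions, alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))
    y = np.sin(X[:, 0])
    x_query = rng.normal(size=(5, 3))
    preds, alpha = richardson_kernel_regression(X, y, x_query)
    direct = np.linalg.solve(gaussian_kernel(X, X) + np.eye(20), y)
    print("max coefficient gap vs. direct solve:", np.max(np.abs(alpha - direct)))
```

Each step reduces the error by a constant factor, so reaching accuracy ε takes on the order of log(1/ε) steps, which is consistent with the block count quoted in the summary. The constant depends on how well the preconditioner conditions the system.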

Reasoning Papers Benchmarks

Summary generated by Claude — human-verified
