Tensor Cache: Eviction-conditioned Associative Memory for Transformers
Signal: 78
Hype: 15
In three lines: Tensor Cache introduces a two-level cache for transformers. Sliding-window local attention (L1) covers recent tokens, while a fixed-size outer-product fast-weight memory (L2) stores evicted KV pairs as a matrix. A learned gate fuses the two outputs, improving the memory-quality tradeoff on long-context models.
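To make the two-level idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the class name `TwoLevelCacheSketch`, the plain additive outer-product write, and the fixed scalar `gate` (standing in for the learned gate) are all hypothetical choices, and the summary does not say how the actual method normalizes or decays the memory.

```python
import math
from collections import deque


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class TwoLevelCacheSketch:
    """Hypothetical two-level KV cache: sliding window (L1) + matrix memory (L2)."""

    def __init__(self, dim, window, gate=0.5):
        self.dim = dim
        self.window = window
        self.local = deque()  # L1: the most recent (key, value) pairs
        # L2: fixed-size dim x dim matrix, independent of sequence length
        self.memory = [[0.0] * dim for _ in range(dim)]
        self.gate = gate  # assumption: fixed scalar in place of a learned gate

    def write(self, key, value):
        """Append a KV pair; fold the oldest pair into L2 when L1 overflows."""
        self.local.append((key, value))
        if len(self.local) > self.window:
            old_key, old_value = self.local.popleft()
            # Assumed write rule: memory += outer(old_value, old_key)
            for i in range(self.dim):
                for j in range(self.dim):
                    self.memory[i][j] += old_value[i] * old_key[j]

    def read_local(self, query):
        """Softmax attention over the pairs still held in the window."""
        if not self.local:
            return [0.0] * self.dim
        scores = [dot(query, k) / math.sqrt(self.dim) for k, _ in self.local]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return [
            sum(w * v[i] for w, (_, v) in zip(weights, self.local)) / total
            for i in range(self.dim)
        ]

    def read_memory(self, query):
        """Associative read from L2: memory @ query."""
        return [dot(row, query) for row in self.memory]

    def read(self, query):
        """Blend the L1 and L2 reads with the gate."""
        g = self.gate
        local_out = self.read_local(query)
        memory_out = self.read_memory(query)
        return [g * a + (1.0 - g) * b for a, b in zip(local_out, memory_out)]


# Usage: with window=1, the second write evicts the first pair into L2.
cache = TwoLevelCacheSketch(dim=2, window=1)
cache.write([1.0, 0.0], [0.0, 2.0])
cache.write([0.0, 1.0], [3.0, 0.0])
recalled = cache.read_memory([1.0, 0.0])  # queries L2 with the evicted key
fused = cache.read([1.0, 0.0])
```

In this toy setup the two keys are orthonormal, so querying L2 with an evicted key returns its value exactly. With many evicted pairs and non-orthogonal keys, reads from a fixed-size matrix become approximate, which is the memory-quality tradeoff the summary refers to.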
Summary generated by Claude — human-verified