arXiv cs.LG · 25 May 2026

Steered Generation via Gradient-Based Optimization on Sparse Query Features

In three lines: Prototype-Based Sparse Steering applies Sparse Autoencoders to LLM attention query activations, decomposing them into interpretable features. At inference time, gradient-based optimization aligns the sparse representations with prototypes of the target behavior. The method is validated on Textualized Gridworld (planning constraints) and on an educational domain (cognitive complexity via Bloom's Taxonomy).
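The core idea, nudging an activation until its sparse code matches a prototype, can be illustrated with a toy sketch. This is not the paper's implementation: the ReLU encoder, the squared-error objective, the hand-picked weights, and the names `encode`, `prototype_loss`, and `steer` are all assumptions made for illustration, and a single small vector stands in for a real attention query activation.

```python
# Toy sketch (assumed setup, not the paper's method): steer a "query" vector q
# so that its sparse code z = ReLU(W q + b) moves toward a prototype code.

def encode(W, b, q):
    """Hypothetical SAE encoder. Returns (sparse code, pre-activations)."""
    pre = [sum(w * x for w, x in zip(row, q)) + bias for row, bias in zip(W, b)]
    return [max(0.0, p) for p in pre], pre

def prototype_loss(z, proto):
    """Squared distance between a sparse code and the prototype (assumed objective)."""
    return sum((zi - pi) ** 2 for zi, pi in zip(z, proto))

def steer(W, b, q, proto, lr=0.1, steps=200):
    """Plain gradient descent on q; the encoder weights stay frozen."""
    q = list(q)
    for _ in range(steps):
        z, pre = encode(W, b, q)
        # dL/dz, masked by the ReLU derivative (zero where the feature is inactive)
        grad_z = [2.0 * (zi - pi) if p > 0 else 0.0
                  for zi, pi, p in zip(z, proto, pre)]
        # Chain rule through the linear map: dL/dq = W^T grad_z
        grad_q = [sum(W[i][j] * grad_z[i] for i in range(len(W)))
                  for j in range(len(q))]
        q = [qj - lr * gj for qj, gj in zip(q, grad_q)]
    return q

# Hand-picked toy encoder: 3 sparse features over a 2-dimensional query.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -0.5]

# Prototype: the sparse code of a query that exhibits the "target behavior".
prototype, _ = encode(W, b, [1.0, 0.0])

q0 = [0.2, 0.8]  # unsteered query
loss_before = prototype_loss(encode(W, b, q0)[0], prototype)
q_steered = steer(W, b, q0, prototype)
loss_after = prototype_loss(encode(W, b, q_steered)[0], prototype)

print("loss before:", loss_before)
print("loss after: ", loss_after)
print("steered q:  ", q_steered)
```

In this sketch only the activation is updated while the encoder stays fixed, which matches the summary's description of optimization happening during inference rather than during training. A real system would presumably operate on per-head, per-token query activations and would need a step size and stopping rule suited to that scale.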

Reasoning Fine-tuning Papers

Summary generated by Claude — human-verified
