arXiv cs.LG

Steered Generation via Gradient-Based Optimization on Sparse Query Features

Signal: 72
Hype: 18
In three lines
Prototype-Based Sparse Steering applies Sparse Autoencoders to LLM attention query activations, decomposing them into interpretable features. During inference, gradient-based optimization aligns these sparse representations with prototypes of the target behavior. The method is validated on Textualized Gridworld (planning constraints) and on an educational domain (cognitive complexity via Bloom's Taxonomy).
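
To make the steering step concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation: the ReLU encoder, the squared-distance loss, the tiny weight matrix `W`, the bias `b`, the step size, and the function names `encode`, `loss`, and `steer` are all assumptions chosen for illustration. The summary does not specify the actual Sparse Autoencoder architecture or objective. The sketch only shows the general idea of updating a query vector by gradient descent so that its sparse code moves toward a prototype.

```python
# Toy sketch: steer a query vector so its sparse code moves toward a prototype.
# All weights, sizes, and the loss below are illustrative assumptions,
# not values or definitions taken from the paper.

def encode(W, b, q):
    """Assumed ReLU encoder of a sparse autoencoder: z = max(0, W q + b)."""
    return [max(0.0, sum(w * x for w, x in zip(row, q)) + bi)
            for row, bi in zip(W, b)]

def loss(W, b, q, prototype):
    """Squared distance between the sparse code of q and the prototype."""
    z = encode(W, b, q)
    return sum((zi - pi) ** 2 for zi, pi in zip(z, prototype))

def steer(W, b, q, prototype, lr=0.1, steps=50):
    """Gradient descent on the query q (the weights stay fixed)."""
    q = list(q)
    for _ in range(steps):
        z = encode(W, b, q)
        # dL/dz_i = 2 (z_i - p_i); the ReLU passes gradient only where z_i > 0.
        dz = [2.0 * (zi - pi) if zi > 0.0 else 0.0
              for zi, pi in zip(z, prototype)]
        # dL/dq_j = sum_i W[i][j] * dz_i
        grad = [sum(W[i][j] * dz[i] for i in range(len(W)))
                for j in range(len(q))]
        q = [qj - lr * gj for qj, gj in zip(q, grad)]
    return q

# Hypothetical 2-dimensional query mapped to 3 sparse features.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -0.5]
q0 = [1.0, 0.2]
prototype = [0.2, 1.0, 0.7]  # assumed target-behavior prototype in feature space

q_steered = steer(W, b, q0, prototype)
```

In this toy setup the loss at `q_steered` is lower than at `q0`. A real system would presumably rely on an autodiff framework and a trained autoencoder rather than hand-written gradients.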
Reasoning · Fine-tuning · Papers

Summary generated by Claude — human-verified