Hierarchical Variational Policies for Reward-Guided Diffusion
Signal: 72
Hype: 18
In three lines
Hierarchical variational framework for adapting pretrained diffusion models to reward-aligned objectives. Formulates test-time adaptation as a lightweight stochastic policy that amortizes per-step control. On 4x super-resolution: better perceptual quality with 5x faster inference than the best baseline.
Summary generated by Claude — human-verified