arXiv cs.AI

ISEP: Implicit Support Expansion for Offline Reinforcement Learning via Stochastic Policy Optimization

In three lines: ISEP proposes an offline reinforcement learning method that implicitly expands action support by interpolating between in-distribution data and policy samples. A stochastic mechanism alternates between conservative cloning and optimistic expansion signals, implemented via Conditional Flow Matching with classifier-free guidance.
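To make the ingredients named in the summary concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The function names, the mixing weight `lam`, the switch probability `p_expand`, and the guidance weight `w` are all hypothetical, and the per-sample coin flip is only one plausible reading of "stochastic mechanism". The flow matching pair uses the common linear path, and the guidance rule is the common `uncond + w * (cond - uncond)` form.

```python
import random

def interpolate(a_data, a_policy, lam):
    """Blend a dataset action with a policy sample (lam=0 keeps the data action)."""
    return [(1.0 - lam) * d + lam * p for d, p in zip(a_data, a_policy)]

def pick_target(a_data, a_policy, lam, p_expand, rng):
    """Hypothetical stochastic switch between cloning and expansion.

    With probability p_expand the training target is the interpolated
    action (optimistic expansion); otherwise it is the dataset action
    itself (conservative cloning). Returns (target, expanded_flag).
    """
    if rng.random() < p_expand:
        return interpolate(a_data, a_policy, lam), True
    return list(a_data), False

def cfm_pair(x0, x1, t):
    """Linear-path conditional flow matching.

    Returns the point x_t on the straight line from noise x0 to target x1
    and the constant velocity x1 - x0 that a network would regress onto.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]
    return x_t, u_t

def cfg_velocity(v_cond, v_uncond, w):
    """Classifier-free guidance on velocities: w=0 is unconditional, w=1 conditional."""
    return [u + w * (c - u) for c, u in zip(v_cond, v_uncond)]

if __name__ == "__main__":
    rng = random.Random(0)
    target, expanded = pick_target([0.0, 0.0], [1.0, 1.0], lam=0.25, p_expand=0.5, rng=rng)
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    x_t, u_t = cfm_pair(noise, target, t=0.5)
    print(expanded, target, x_t, u_t)
```

In this reading, the `expanded` flag would serve as the conditioning signal that classifier-free guidance later amplifies or suppresses at sampling time; whether ISEP conditions on exactly this signal is an assumption.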

Summary generated by Claude — human-verified