arXiv cs.AI

Action-Gradient Monte Carlo Tree Search for Non-Parametric Continuous (PO)MDPs

Signal: 72 · Hype: 18
In three lines: Action-Gradient MCTS (AGMCTS) combines global tree search with local gradient-based action refinement for online planning in continuous spaces. It rests on three theoretical contributions: an action score gradient theorem, a Multiple Importance Sampling Tree for sample reuse, and tractable gradient computation via the Area Formula. The paper reports that it outperforms state-of-the-art sample-based solvers on continuous MDP/POMDP benchmarks.
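To make the "sample reuse" idea concrete, here is a minimal sketch of multiple importance sampling with the standard balance heuristic. It is not the paper's tree structure, which this summary does not describe. The 1D Gaussian "transition" model, the function names `normal_pdf` and `mis_estimate`, and the action values are all assumptions chosen for illustration: samples drawn under two earlier actions are reweighted to estimate an expectation under a refined action, without drawing new samples.

```python
import math
import random


def normal_pdf(x, mean, std=1.0):
    """Density of a 1D Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mis_estimate(f, target_mean, batches):
    """Balance-heuristic MIS estimate of E_p[f(x)], with p = N(target_mean, 1).

    batches: list of (proposal_mean, samples); each batch was drawn from
    N(proposal_mean, 1), e.g. under an earlier value of the action.
    """
    total = sum(len(samples) for _, samples in batches)
    acc = 0.0
    for _, samples in batches:
        for x in samples:
            # Mixture density of all proposals, weighted by sample share.
            mixture = sum(
                (len(s) / total) * normal_pdf(x, m) for m, s in batches
            )
            acc += f(x) * normal_pdf(x, target_mean) / mixture
    return acc / total


rng = random.Random(0)
# Hypothetical old batches: next states sampled under actions a=0.0 and a=1.0.
batches = [
    (0.0, [rng.gauss(0.0, 1.0) for _ in range(2000)]),
    (1.0, [rng.gauss(1.0, 1.0) for _ in range(2000)]),
]
# Reuse them to evaluate the refined action a=0.5 without new samples.
estimate = mis_estimate(lambda x: x, 0.5, batches)
print(round(estimate, 2))
```

The true mean under the refined action is 0.5 in this toy setup, so the printed estimate should land close to that value. Dividing by the mixture of all proposals, rather than by each sample's own proposal, is what keeps the weights bounded here.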
Reasoning · Reinforcement learning · Papers

Summary generated by Claude — human-verified