arXiv cs.AI · 26 May 2026

MAPLE: Multi-State Aggregated Policy Evaluation for AlphaZero in Imperfect-Information Games

In three lines: MAPLE is a tree search method that extends AlphaZero to imperfect-information games by aggregating policy and value evaluations across multiple sampled world states. Tested on Phantom Go and Dark Hex, MAPLE outperforms a PIMC-AlphaZero baseline, with Elo gains of 291 and 136.
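To put the reported Elo gains in perspective, the sketch below converts a rating difference into an expected score using the standard logistic Elo formula. It assumes the summary's figures follow the usual 400-point scale, which the summary itself does not state; the function name `expected_score` is illustrative and not taken from the paper.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (win probability, with draws counted as half a win)
    for the stronger side, under the standard Elo model on a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Elo gains quoted in the summary, in the order they are listed there.
    for gain in (291, 136):
        print(f"Elo +{gain}: expected score {expected_score(gain):.2f}")
```

Under that assumption, gains of 291 and 136 Elo correspond to expected scores of roughly 0.84 and 0.69 against the baseline.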

Reasoning · Reinforcement learning · Benchmarks

Summary generated by Claude — human-verified
