PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models
Signal: 78
Hype: 22
In three lines: PlanningBench is a framework for generating scalable and verifiable planning data. It abstracts 30+ task types and difficulty factors from real scenarios, then synthesizes problems with adaptive control and automatic verification. RL training on verified data improves performance on unseen benchmarks.
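To make "verifiable planning data" concrete, here is a minimal sketch of the generate-then-verify idea on a toy task-ordering problem. It is not PlanningBench's actual code or task set: the function names `generate_problem` and `verify_plan`, the precedence-constraint task, and the two difficulty knobs are all assumptions chosen for illustration, and the adaptive control step is left out.

```python
import random


def generate_problem(num_tasks, num_constraints, seed=0):
    """Build a toy ordering problem (hypothetical task type).

    num_tasks and num_constraints stand in for difficulty factors.
    Constraints are sampled from a hidden valid order, so every
    generated problem is solvable by construction.
    """
    rng = random.Random(seed)
    tasks = [f"t{i}" for i in range(num_tasks)]
    hidden_order = tasks[:]
    rng.shuffle(hidden_order)
    # Every (earlier, later) pair that the hidden order satisfies.
    pairs = [
        (hidden_order[i], hidden_order[j])
        for i in range(num_tasks)
        for j in range(i + 1, num_tasks)
    ]
    constraints = rng.sample(pairs, min(num_constraints, len(pairs)))
    return {"tasks": tasks, "before": constraints}, hidden_order


def verify_plan(problem, plan):
    """Return True only if the plan uses each task once and respects every constraint."""
    if sorted(plan) != sorted(problem["tasks"]):
        return False
    position = {task: index for index, task in enumerate(plan)}
    return all(position[a] < position[b] for a, b in problem["before"])


problem, reference = generate_problem(num_tasks=6, num_constraints=8, seed=1)
print(verify_plan(problem, reference))        # the hidden order passes
print(verify_plan(problem, reference[::-1]))  # the reversed order fails
```

Because the verifier is a plain program, a checker of this kind could score any model's answer automatically, which is what would make such data usable as a reward signal for RL training.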
Summary generated by Claude — human-verified