ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics
Signal: 82
Hype: 15
In three lines: ComBench is a benchmark of 100 Olympiad-level combinatorics problems for evaluating LLM mathematical reasoning. It distinguishes analysis-centric problems (rigorous proofs) from construction-centric problems (explicit constructions). Top models reach 65.4% average and 75.3% Best@4. Kimi-K2.6 outperforms GPT-4o on constructions but trails on proof grading.
Summary generated by Claude — human-verified