Enhancing Table Reasoning with Deterministic Table-State Rewards
Signal: 78
Hype: 15
In three lines
TABROUGE, a deterministic reward metric based on Longest Common Subsequence, improves LLM table reasoning without training. RE-TAB, a plug-and-play framework using TABROUGE, gains 26.7 pp across six backbones and three benchmarks, reducing test-time scaling samples by 33%.
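To make the idea of a deterministic, LCS-based reward concrete, here is a minimal Python sketch. It is not TABROUGE itself, whose exact definition is not given in this summary. It assumes a generic ROUGE-L-style F1 score computed over table cells flattened in row-major order, and the names `lcs_length`, `lcs_f1`, and `flatten_table` are hypothetical.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_f1(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """ROUGE-L-style F1 from LCS precision and recall (an assumed scoring rule)."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)


def flatten_table(rows: Sequence[Sequence[object]]) -> list[str]:
    """Flatten a table into a cell sequence, row by row (an assumed encoding)."""
    return [str(cell) for row in rows for cell in row]


# Illustrative data only: a predicted table state with one wrong cell.
reference = flatten_table([["city", "pop"], ["Oslo", "700"]])
candidate = flatten_table([["city", "pop"], ["Oslo", "650"]])
score = lcs_f1(candidate, reference)
```

Because the score depends only on the two cell sequences, the same inputs always yield the same reward, which is the property that lets a metric of this kind rank sampled outputs without a learned judge.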
Summary generated by Claude — human-verified