Review Arcade: On the Human Alignment and Gameability of LLM Reviews
Signal: 75
Hype: 25
In three lines
An empirical study of LLM-generated reviews of scientific papers, using ACL Rolling Review 2025 data.
LLM reviews show limited alignment with human reviews and vary substantially across prompts and models.
Authors can "game" LLM reviews through iterative revision workflows, increasing scores for up to 35% of the tested papers.
Summary generated by Claude — human-verified