arXiv cs.AI·25 May 2026

Co-ReAct: Rubrics as Step-Level Collaborators for ReAct Agents

In three lines: Co-ReAct integrates step-level rubrics to guide ReAct agents through multi-step, search-intensive reasoning tasks. Its rubric generator is trained with GRPO to optimize a list-wise Spearman rank correlation against multi-judge expert consensus. The authors report measured improvements on DeepResearchBench and SQA-CS-V2 for 8B/14B and frontier models.
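To make the training signal concrete, here is a minimal sketch of a list-wise Spearman reward. The summary only states that the rubric generator is optimized for list-wise Spearman rank correlation against multi-judge expert consensus; everything else here is an assumption for illustration. In particular, averaging the judges' scores to form the consensus, returning zero for a constant ranking, and the function names `average_ranks`, `spearman`, and `listwise_reward` are hypothetical and not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Rank values ascending (1-based); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        # Assumption: a constant ranking carries no signal, so reward is 0.
        return 0.0
    return cov / (vx * vy) ** 0.5


def listwise_reward(rubric_scores, judge_scores):
    """Reward for one list of candidates.

    rubric_scores: one score per candidate, produced using the generated rubric.
    judge_scores: one list per judge, each with one score per candidate.
    Assumption: consensus is the per-candidate mean across judges.
    """
    consensus = [mean(column) for column in zip(*judge_scores)]
    return spearman(rubric_scores, consensus)


if __name__ == "__main__":
    judges = [[1, 2, 3], [2, 2, 4]]
    print(listwise_reward([0.1, 0.5, 0.9], judges))
    print(listwise_reward([0.9, 0.5, 0.1], judges))
```

In a GRPO-style setup, a scalar like this would be computed for each sampled rubric and compared within its group; how the paper actually wires the reward into training is not stated in the summary.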

AI Agents, Reasoning, Reinforcement learning, Evals

Summary generated by Claude — human-verified
