Co-ReAct: Rubrics as Step-Level Collaborators for ReAct Agents
Signal: 78
Hype: 22
In three lines
Co-ReAct integrates step-level rubrics to guide ReAct agents in multi-step, search-intensive reasoning tasks. A rubric generator is trained with GRPO to optimize a list-wise Spearman rank correlation against multi-judge expert consensus. The summary reports measured improvements on DeepResearchBench and SQA-CS-V2 across 8B/14B and frontier models.
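To make the list-wise objective concrete, here is a minimal sketch of a Spearman-based reward. It is an illustration under stated assumptions, not the paper's implementation: the summary does not say how judge scores are aggregated or how rubric scores are produced, so the mean-over-judges consensus and the names `rubric_reward`, `consensus`, and `average_ranks` are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant list carries no ranking information
    return cov / (vx * vy) ** 0.5


def consensus(judge_scores):
    """ASSUMPTION: consensus is the per-candidate mean across judges."""
    return [mean(column) for column in zip(*judge_scores)]


def rubric_reward(rubric_scores, judge_scores):
    """Hypothetical reward: rank agreement between the scores a rubric
    induces over a list of candidates and the judges' consensus."""
    return spearman(rubric_scores, consensus(judge_scores))
```

Under these assumptions, a rubric that orders the candidates exactly as the judges do receives a reward of 1, and one that reverses the order receives -1. Because only ranks matter, the reward is insensitive to the scale on which the rubric or the judges score.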
Summary generated by Claude — human-verified