arXiv cs.AI

Co-ReAct: Rubrics as Step-Level Collaborators for ReAct Agents

Signal: 78 · Hype: 22
In three lines: Co-ReAct integrates step-level rubrics to guide ReAct agents in multi-step, search-intensive reasoning tasks. A rubric generator is trained with GRPO to optimize a list-wise Spearman rank correlation against multi-judge expert consensus. The paper reports measured improvements on DeepResearchBench and SQA-CS-V2 across 8B/14B and frontier models.
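
To make the training signal concrete, here is a minimal sketch of a list-wise Spearman reward in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the summary does not say how the judges are aggregated, so the per-item mean used as "consensus" is an assumption, and the names `average_ranks`, `spearman`, and `listwise_reward` are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant list carries no ranking information
    return cov / (vx * vy) ** 0.5


def listwise_reward(rubric_scores, judge_scores):
    """Reward for one list of candidates.

    rubric_scores: one score per candidate, derived from a generated rubric.
    judge_scores:  one list of judge scores per candidate.
    ASSUMPTION: consensus is the per-candidate mean across judges.
    """
    consensus = [mean(per_item) for per_item in judge_scores]
    return spearman(rubric_scores, consensus)


rubric_scores = [0.9, 0.4, 0.7, 0.1]
judge_scores = [[5, 4, 5], [2, 3, 2], [4, 4, 3], [1, 1, 2]]
reward = listwise_reward(rubric_scores, judge_scores)
```

In this toy input the rubric orders the four candidates exactly as the averaged judges do, so the reward is at its maximum of 1; a fully reversed ordering would give -1. Because only ranks matter, the rubric scores and judge scores need not share a scale.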
AI Agents · Reasoning · Reinforcement learning · Evals

Summary generated by Claude — human-verified