arXiv cs.CL

Retrieval, Reward, and Training Protocols: What Matters in Training Search Agents?

Signal: 78, Hype: 15
In three lines: A controlled empirical study of training LLM-powered search agents. The authors isolate three dimensions: (1) a data-coverage issue in the Wikipedia 2018 corpus accounts for larger gains than algorithmic differences do; (2) outcome-based rewards outperform process-based approaches; (3) an analysis of training-data diversity and search-budget scaling. Code is released.
AI Agents, RAG, Reinforcement learning, Benchmarks, Papers

Summary generated by Claude — human-verified