arXiv cs.CL

MERIT: Matching Expertise via Rubric-Informed Training for Reviewer Assignment

Signal: 78
Hype: 22
In three lines
MERIT is a two-stage framework for large-scale reviewer assignment. A 4B-parameter model trained via RL assesses submission-reviewer fit using expertise rubrics guided by an LLM judge; its predictions are then distilled into an embedding-based retriever. It outperforms larger general-purpose LLMs on LR-Bench and the CMU Gold dataset.
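To make the retrieval stage concrete, here is a minimal sketch of how an embedding-based retriever might rank reviewers for a submission. It is an illustration only, not MERIT's implementation: cosine similarity as the scoring function, the helper names `cosine` and `rank_reviewers`, and the toy three-dimensional vectors are all assumptions, and the summary does not say how the embeddings are produced or compared.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_reviewers(submission_vec, reviewer_vecs, top_k=2):
    """Return the top_k (reviewer, score) pairs, highest similarity first."""
    scored = [(name, cosine(submission_vec, vec)) for name, vec in reviewer_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; a real system would obtain these from the
# distilled retriever model rather than hand-written vectors.
reviewers = {
    "reviewer_a": [1.0, 0.0, 0.0],
    "reviewer_b": [0.0, 1.0, 0.0],
    "reviewer_c": [0.7, 0.7, 0.0],
}
submission = [0.9, 0.1, 0.0]

top = rank_reviewers(submission, reviewers)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

The appeal of this kind of design is cost: reviewer embeddings can be computed once and compared with cheap vector operations, so the more expensive fit-assessment model does not have to score every submission-reviewer pair at assignment time.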
Reinforcement learning · Papers · Benchmarks · Tools

Summary generated by Claude — human-verified