AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
Signal
75
Hype
15
In three lines
AMARIS introduces persistent evaluation memory to improve rubrics in LLM RL fine-tuning. The system accumulates evaluation diagnostics over time, uses static and dynamic retrieval to contextualize rubric modifications, and adds ~5% time overhead. Experiments show consistent gains across closed-ended and open-ended domains.
Summary generated by Claude — human-verified