arXiv cs.CL

AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning

Signal: 75
Hype: 15
In three lines: AMARIS enhances rubric-based RL by integrating persistent evaluation memory. The system accumulates evaluation diagnostics over time, retrieves them via static and semantic search, and continuously adapts reward rubrics. Experiments show performance gains with ~5% time overhead.
Reinforcement learning, Fine-tuning, Evals, Reasoning

Summary generated by Claude — human-verified