I created an LLM post-training method called RPS. Preliminary results show that it improved Qwen3-8b's program synthesis reliability. [R]
In three lines: RPS is a two-stage post-training method inspired by neuroplasticity. It first trains on easy data at a high learning rate, then on hard data at a learning rate reduced by 90%. On Qwen3-8b, RPS reaches 4% on ARC-AGI 1 and 1145/1200 error-free program executions, versus 2.4% and 870/1200 for EPS (equal learning rate).
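To make the two-stage schedule concrete, here is a minimal sketch of how the RPS and EPS learning rates could be laid out, plus the execution rates implied by the reported counts. The names `Stage`, `rps_schedule`, `eps_schedule`, and `success_rate` are hypothetical, and the base learning rate of 1e-4 is an illustrative assumption; the summary does not give the actual value or say which rate EPS uses. The sketch also assumes EPS keeps the same easy-then-hard ordering, which the summary does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    data: str             # which data split the stage trains on
    learning_rate: float  # learning rate used during the stage


def rps_schedule(base_lr: float, reduction: float = 0.9) -> list[Stage]:
    """Easy data at the base rate, then hard data at a rate cut by `reduction`."""
    return [
        Stage("easy", base_lr),
        Stage("hard", base_lr * (1.0 - reduction)),
    ]


def eps_schedule(base_lr: float) -> list[Stage]:
    """Equal-rate comparison: the same learning rate in both stages (assumed ordering)."""
    return [Stage("easy", base_lr), Stage("hard", base_lr)]


def success_rate(ok: int, total: int) -> float:
    """Fraction of program executions that finished without error."""
    return ok / total


if __name__ == "__main__":
    base_lr = 1e-4  # assumed value for illustration only
    for name, schedule in [("RPS", rps_schedule(base_lr)), ("EPS", eps_schedule(base_lr))]:
        for stage in schedule:
            print(f"{name} {stage.data}: lr={stage.learning_rate:.1e}")
    # Counts reported in the summary above.
    print(f"RPS error-free: {success_rate(1145, 1200):.1%}")
    print(f"EPS error-free: {success_rate(870, 1200):.1%}")
```

The only difference between the two schedules in this sketch is the second-stage rate, which matches the comparison as the summary describes it.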
Summary generated by Claude — human-verified