Unified Data Selection for LLM Reasoning
Signal: 72
Hype: 25
In three lines
HES (High-Entropy Sum) is a training-free metric for selecting high-quality reasoning data for LLMs.
It was tested across the SFT, RFT, and RL training paradigms.
Training on only the top 20% of HES-ranked samples matched full-dataset performance, significantly reducing computational overhead.
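The selection step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not define how HES is computed, so the sketch assumes it sums the largest per-token entropies of a sample. The function names, the `token_fraction` parameter, and its default value are hypothetical; only the 20% keep ratio comes from the summary.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def high_entropy_sum(entropies, token_fraction=0.1):
    """Score one sample by summing its highest per-token entropies.

    ASSUMPTION: the exact HES definition is not given in the summary.
    token_fraction (share of tokens counted as "high entropy") is a
    hypothetical knob chosen for illustration only.
    """
    if not entropies:
        return 0.0
    k = max(1, int(len(entropies) * token_fraction))
    return sum(sorted(entropies, reverse=True)[:k])


def select_top(samples, scores, keep_fraction=0.2):
    """Keep the highest-scoring keep_fraction of samples, in original order."""
    if not samples:
        return []
    k = max(1, int(len(samples) * keep_fraction))
    ranked = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    return [samples[i] for i in sorted(ranked[:k])]
```

In use, per-token entropies would come from a forward pass of the model over each sample (no gradient updates, which is what makes the metric training-free); the scores are then passed to `select_top` to pick the subset used for training.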
Summary generated by Claude — human-verified