Calibrated Preference Learning: The Case of Label Ranking
Signal
72
Hype
15
In three lines
Formal study of calibration for probabilistic label ranking. The authors define a hierarchy of calibration notions (full rankings, sub-rankings, top-k) and show that popular models are poorly calibrated. An application to RLHF reward models reveals that calibration and accuracy are not perfectly correlated.
Summary generated by Claude — human-verified