arXiv cs.LG

Training-Inference Kernel Contracts: Bounding Divergence in Post-Training and Deployment

Signal: 45
Hype: 15
In three lines
A theoretical paper proposing kernel contracts to bound the divergence between training and inference kernels in post-training. The framework specifies the acceptable finite-precision gap through numerical, statistical, and routing clauses. It derives bounds relating logit drift to total-variation distance and applies them to policy-gradient bias in RL.
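
To make the logit-drift-to-total-variation step concrete, here is a minimal Python sketch. It does not reproduce the paper's bounds, which this summary does not state. It assumes an elementary textbook inequality instead: if every logit moves by at most `eps` between the training and inference kernels, each softmax probability ratio stays within `[exp(-2*eps), exp(2*eps)]`, so the total-variation distance is at most `1 - exp(-2*eps)`, which is itself at most `2*eps`. The function names, the Gaussian logits, and the uniform drift model are all hypothetical choices for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def total_variation(p, q):
    """Total-variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def tv_bound_from_drift(eps):
    """Elementary bound (an assumption of this sketch, not the paper's result):
    max_i |z_i - z'_i| <= eps implies TV <= 1 - exp(-2 * eps) <= 2 * eps."""
    return 1.0 - math.exp(-2.0 * eps)


def check(num_trials=1000, vocab=32, eps=0.01, seed=0):
    """Simulate bounded logit drift and return (worst observed TV, bound)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(num_trials):
        # Hypothetical "training kernel" logits.
        z_train = [rng.gauss(0.0, 3.0) for _ in range(vocab)]
        # Hypothetical "inference kernel" logits: same values plus drift in [-eps, eps].
        z_infer = [z + rng.uniform(-eps, eps) for z in z_train]
        tv = total_variation(softmax(z_train), softmax(z_infer))
        worst = max(worst, tv)
    return worst, tv_bound_from_drift(eps)


if __name__ == "__main__":
    worst, bound = check()
    print(f"worst observed TV = {worst:.6f}, bound = {bound:.6f}")
```

Under this drift model the observed distance should never exceed the bound. A contract in the spirit described above would presumably fix `eps` per kernel and propagate the resulting distance into a bias term, but that step depends on details only the paper itself provides.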
Reinforcement learning · Papers · Alignment

Summary generated by Claude — human-verified