Self-Distilled Trajectory-Aware Boltzmann Modeling: Bridging the Training-Inference Discrepancy in Diffusion Language Models
Signal
75
Hype
15
In three lines
TABOM, a post-training method for Diffusion Language Models, aligns optimization with the multi-step easy-to-hard decoding trajectory observed at inference. Via Boltzmann modeling of unmasking preferences, it derives a tractable pairwise ranking objective that reduces the training-inference discrepancy and improves performance on new domains.
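This summary does not state TABOM's actual objective, so the following is only a generic sketch of why a Boltzmann model of preferences can yield a tractable pairwise ranking loss. Under a Boltzmann distribution over candidate positions, the probability of preferring one candidate over another depends only on their score difference, so the normalizing sum over all candidates cancels. The function names, the `temperature` parameter, and the idea of a per-position "score" are illustrative assumptions, not taken from the paper.

```python
import math

# Hypothetical illustration: "scores" stand in for whatever per-position
# quantity a model uses to decide which masked token to unmask next.

def boltzmann_probs(scores, temperature=1.0):
    """Boltzmann (softmax) distribution: p_i proportional to exp(s_i / T)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    z = sum(weights)  # partition function over all candidates
    return [w / z for w in weights]

def pairwise_ranking_loss(score_preferred, score_other, temperature=1.0):
    """Negative log-probability that the preferred candidate wins the pair.

    Equals -log(sigmoid(d)) with d = (s_pref - s_other) / T, computed as a
    numerically stable softplus(-d). No partition function is needed.
    """
    d = (score_preferred - score_other) / temperature
    return max(-d, 0.0) + math.log1p(math.exp(-abs(d)))

def pairwise_preference(score_i, score_j, temperature=1.0):
    """P(i preferred over j), restricted to the pair {i, j}."""
    return math.exp(-pairwise_ranking_loss(score_i, score_j, temperature))

if __name__ == "__main__":
    scores = [2.0, 0.5, -1.0]  # made-up scores for three masked positions
    p = boltzmann_probs(scores)
    # The pairwise probability from the full distribution matches the
    # closed form that uses only the score difference.
    print(p[0] / (p[0] + p[1]), pairwise_preference(scores[0], scores[1]))
```

The design point is that the pairwise form avoids summing over every candidate, which is what makes a ranking objective of this kind cheap to evaluate.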
Summary generated by Claude — human-verified