Multi-Token Residual Prediction
Multi-Token Residual Prediction (MRP) is a lightweight module that accelerates diffusion language models by predicting the logit residuals between consecutive denoising steps instead of re-running the backbone. Evaluated on SDAR models from 1.7B to 8B parameters, MRP achieves a 1.42× lossless speedup in speculative decoding on reasoning and code generation benchmarks.
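
To make the idea concrete, here is a minimal NumPy sketch of residual drafting followed by verification. It is an illustration under stated assumptions, not the MRP implementation: the function names, the `residual_fn` callable standing in for the lightweight module, and the per-position argmax acceptance rule are all hypothetical, and the actual module's inputs and acceptance criterion may differ.

```python
import numpy as np

def draft_next_logits(prev_logits, residual_fn):
    """Approximate the next denoising step's logits as the previous
    logits plus a predicted residual (hypothetical interface)."""
    return prev_logits + residual_fn(prev_logits)

def verify_draft(draft_logits, backbone_logits):
    """Compare drafted tokens with the backbone's tokens.

    Assumed acceptance rule: a position is accepted when the draft's
    argmax token matches the backbone's argmax token. The returned
    tokens always come from the backbone, so the output is unchanged
    by drafting (the sense in which the speedup is lossless here).
    """
    draft_tokens = draft_logits.argmax(axis=-1)
    target_tokens = backbone_logits.argmax(axis=-1)
    accepted = draft_tokens == target_tokens
    return target_tokens, accepted

# Toy example: 2 positions, vocabulary of 3 tokens.
prev_logits = np.array([[2.0, 0.0, 1.0], [0.0, 1.0, 0.5]])
toy_residual = lambda logits: np.array([[-3.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
backbone_logits = np.array([[0.0, 0.0, 3.0], [2.0, 0.0, 0.0]])

draft = draft_next_logits(prev_logits, toy_residual)
tokens, accepted = verify_draft(draft, backbone_logits)
```

In this sketch, any saving would come from positions where the draft is accepted, since those tokens were proposed without a backbone pass; rejected positions fall back to the backbone's own prediction.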