Nemotron-Labs-Diffusion from NVIDIA
Signal: 82
Hype: 25
In three lines
NVIDIA releases Nemotron-Labs-Diffusion, a tri-mode model (autoregressive, diffusion, and self-speculation) in 3B, 8B, and 14B sizes. Self-speculation drafts tokens with diffusion and verifies them autoregressively, with both modes sharing one KV cache. Reported results: 3× higher acceptance length than Qwen3-8B-Eagle3, a 2.2× speedup, and a 4× speedup on GB200 (1015 tok/sec with custom CUDA kernels).
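To make the draft-then-verify idea concrete, here is a minimal toy sketch of greedy speculative decoding. It is not NVIDIA's implementation: `toy_verify_next` and `toy_draft` are hypothetical stand-ins for the AR and diffusion modes, the drafter's errors are injected on purpose, and there is no KV cache or parallel verification pass. It only illustrates why the output matches plain AR decoding and what "acceptance length" counts.

```python
from typing import List, Tuple

Token = int


def toy_verify_next(context: List[Token]) -> Token:
    """Hypothetical stand-in for AR mode: the 'true' next token."""
    return sum(context) % 7


def toy_draft(context: List[Token], k: int) -> List[Token]:
    """Hypothetical stand-in for diffusion drafting: propose k tokens at once.

    It mostly agrees with the verifier but is deliberately wrong at some
    positions, so that some drafts get rejected.
    """
    draft: List[Token] = []
    for _ in range(k):
        seq = context + draft
        tok = toy_verify_next(seq)
        if len(seq) % 4 == 3:  # injected drafting error (assumption)
            tok = (tok + 1) % 7
        draft.append(tok)
    return draft


def speculative_step(context: List[Token], k: int) -> Tuple[List[Token], int]:
    """One draft-and-verify round.

    Returns the tokens to append and how many drafted tokens were accepted.
    A real system would score all k positions in one forward pass; this toy
    checks them one by one for clarity.
    """
    new_tokens: List[Token] = []
    for tok in toy_draft(context, k):
        target = toy_verify_next(context + new_tokens)
        if tok != target:
            # First mismatch: keep the verifier's token, discard the rest.
            new_tokens.append(target)
            return new_tokens, len(new_tokens) - 1
        new_tokens.append(tok)
    return new_tokens, len(new_tokens)


def generate(prompt: List[Token], n_new: int, k: int = 4) -> Tuple[List[Token], List[int]]:
    """Generate n_new tokens; also return the accepted length of each round."""
    out = list(prompt)
    accepted_lengths: List[int] = []
    while len(out) - len(prompt) < n_new:
        new_tokens, n_accepted = speculative_step(out, k)
        out.extend(new_tokens)
        accepted_lengths.append(n_accepted)
    return out[: len(prompt) + n_new], accepted_lengths


def greedy_ar(prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: plain one-token-at-a-time AR decoding."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(toy_verify_next(out))
    return out


if __name__ == "__main__":
    tokens, lengths = generate([1, 2], n_new=10, k=4)
    print("same as AR:", tokens == greedy_ar([1, 2], 10))
    print("accepted per round:", lengths)
```

Every appended token is either a draft the verifier agreed with or the verifier's own correction, so the result is identical to greedy AR decoding. A longer average acceptance length means fewer verification rounds per generated token, which is where the speedup in this family of methods comes from.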
Summary generated by Claude — human-verified