Reddit r/LocalLLaMA

DeepMind Just Dropped "DiffusionGemma" — Text Generation via Image-Style Diffusion Model

Signal: 78 · Hype: 35
In three lines: DeepMind releases DiffusionGemma, a 26B-parameter MoE model (3.8B active) under Apache 2.0. Instead of generating tokens sequentially, it uses diffusion to refine 256 tokens simultaneously. It achieves 1000+ tokens/s on an H100 and 700+ on an RTX 5090, with native integration with vLLM, Unsloth, and HF Transformers.
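
As a rough illustration of what refining a whole block in parallel can look like, here is a toy Python sketch of confidence-based iterative unmasking, a generic masked-diffusion-style decoding loop. It is not DiffusionGemma's actual algorithm, which the summary does not describe. The names `toy_denoiser` and `diffusion_decode`, the step count of 8, and the three-token vocabulary are all hypothetical; only the 256-token block size comes from the summary.

```python
import random

MASK = None  # marks a position that has not been committed yet


def toy_denoiser(tokens, vocab, rng):
    """Hypothetical stand-in for a model forward pass.

    Proposes one (token, confidence) pair for every position in the block.
    A real model would condition on `tokens`; this toy ignores them.
    """
    return [(rng.choice(vocab), rng.random()) for _ in tokens]


def diffusion_decode(block_size=256, steps=8, vocab=("a", "b", "c"), seed=0):
    """Fill a fully masked block over a fixed number of refinement steps.

    Returns the finished block and the number of denoiser calls made.
    """
    rng = random.Random(seed)
    tokens = [MASK] * block_size
    per_step = -(-block_size // steps)  # ceiling division: positions committed per step
    calls = 0
    for _ in range(steps):
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        if not masked:
            break
        proposals = toy_denoiser(tokens, vocab, rng)
        calls += 1
        # Commit the most confident proposals among the still-masked positions.
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in masked[:per_step]:
            tokens[i] = proposals[i][0]
    return tokens, calls


if __name__ == "__main__":
    block, n_calls = diffusion_decode()
    print(len(block), "tokens filled in", n_calls, "denoiser calls")
```

In this toy the denoiser is called 8 times to fill all 256 positions, whereas token-by-token generation would need one call per token. The real step count and commit schedule are not given in the summary.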
DeepMind · Code generation · Open source · Infrastructure

Summary generated by Claude — human-verified