Reddit r/LocalLLaMA · 10 June 2026

DeepMind Just Dropped "DiffusionGemma" — Text Generation via Image-Style Diffusion Model


In three lines: DeepMind has released DiffusionGemma, a 26B-parameter mixture-of-experts (MoE) model with 3.8B active parameters, under the Apache 2.0 license. Instead of generating tokens one at a time, it uses diffusion to refine a block of 256 tokens simultaneously. Reported throughput is 1000+ tokens/s on an H100 and 700+ tokens/s on an RTX 5090, with native integration in vLLM, Unsloth, and Hugging Face Transformers.
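To make the contrast with sequential generation concrete, here is a toy sketch of parallel block refinement. It uses confidence-based iterative unmasking, a scheme common in masked discrete diffusion. The post does not say which sampler DiffusionGemma actually uses, so treat that choice as an assumption. `toy_denoiser`, `diffusion_decode`, `autoregressive_decode`, and the step count of 8 are all hypothetical, and no real model is involved: the "denoiser" simply reads from a known target with random confidences.

```python
import random

MASK = None  # placeholder for a position that has not been decided yet


def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for one model forward pass.

    Returns a (prediction, confidence) pair for every position at once.
    A real model would predict from context; this toy reads from `target`.
    """
    return [(target[i], rng.random()) for i in range(len(tokens))]


def diffusion_decode(target, num_steps, seed=0):
    """Fill a block of len(target) tokens in at most `num_steps` passes."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    calls = 0
    for step in range(num_steps):
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        if not masked:
            break
        preds = toy_denoiser(tokens, target, rng)
        calls += 1
        # Commit an even share of the still-masked positions this step.
        remaining_steps = num_steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        # Keep the k most confident predictions, leave the rest masked.
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        for i in masked[:k]:
            tokens[i] = preds[i][0]
    return tokens, calls


def autoregressive_decode(target):
    """Baseline: one forward pass per token, left to right."""
    tokens, calls = [], 0
    for tok in target:
        tokens.append(tok)
        calls += 1
    return tokens, calls


block = list(range(256))  # a 256-token block, as in the summary
diff_tokens, diff_calls = diffusion_decode(block, num_steps=8)
ar_tokens, ar_calls = autoregressive_decode(block)
print(diff_calls, ar_calls)
```

In this toy, the block is completed in 8 passes instead of 256. Real-world speedups depend on the actual step count, per-pass cost, and hardware, none of which the post specifies.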


DeepMind · Code generation · Open source · Infrastructure

Summary generated by Claude — human-verified
