Back to feed
Reddit r/LocalLLaMA·

[OSS] dlmserve - first serving engine for diffusion language models

Signal
75
Hype
25
In three linesdlmserve is the first serving engine for diffusion language models (LLaDA, Dream-7B). Unlike autoregressive LLMs, they denoise a fully masked sentence in parallel. OpenAI-compatible API, continuous batching, 2.5x throughput vs HuggingFace at batch=4, runs in 12 GB VRAM. MIT licensed, pip install dlmserve.
Read source
Your take?
Open sourceCode generationInfrastructureTools

Summary generated by Claude — human-verified