Reddit r/LocalLLaMA · 26 May 2026

qwen 3.6 27B AR-> Diffusion - local training on 5090

In three lines: A local fine-tuning experiment that converts the autoregressive Qwen 3.6 27B into a diffusion language model on a single RTX 5090. QLoRA and NVFP4 quantization cut the VRAM requirement from 600GB to an amount that is trainable on the 5090. The work builds on open-dllm, which reported a 4x speedup on Qwen 2.5, and integrates d3LLM to reduce the number of diffusion steps. No trained model exists yet, but the forward pass has been validated.
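To make the VRAM claim concrete, here is a rough back-of-envelope sketch of why a frozen 4-bit base with small adapters could fit on a 32 GB RTX 5090 while full fine-tuning cannot. The byte counts per parameter, the 4.5 bits per weight (4-bit values plus block-scale overhead), and the 1% adapter share are all assumptions for illustration and are not taken from the post. The sketch ignores activations and does not try to reproduce the 600GB figure.

```python
# Back-of-envelope VRAM estimate. Every constant here is an assumption
# chosen for illustration, not a number reported in the post.

GB = 1e9  # decimal gigabytes


def full_finetune_gb(n_params: float, bytes_per_param: float = 16.0) -> float:
    """Weights, gradients and Adam state for mixed-precision full fine-tuning.

    16 bytes/param = 2 (bf16 weights) + 2 (bf16 grads)
                   + 4 (fp32 master weights) + 8 (two fp32 Adam moments).
    Activations are excluded.
    """
    return n_params * bytes_per_param / GB


def qlora_gb(n_params: float, bits_per_weight: float = 4.5,
             adapter_fraction: float = 0.01) -> float:
    """Frozen 4-bit base weights plus trainable low-rank adapters.

    bits_per_weight=4.5 assumes 4-bit values plus per-block scale overhead.
    adapter_fraction is a hypothetical share of parameters held in adapters,
    each costing the same 16 bytes/param as in full fine-tuning.
    """
    base = n_params * bits_per_weight / 8 / GB
    adapters = n_params * adapter_fraction * 16.0 / GB
    return base + adapters


n = 27e9  # 27B parameters
full = full_finetune_gb(n)
qlora = qlora_gb(n)
print(f"full fine-tune: ~{full:.0f} GB before activations")
print(f"QLoRA on 4-bit base: ~{qlora:.1f} GB before activations")
```

Under these assumptions the quantized setup leaves some headroom below 32 GB for activations, which in practice depend heavily on sequence length, batch size, and gradient checkpointing.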

Qwen · Fine-tuning · Open source · Code generation · Reasoning

Summary generated by Claude — human-verified
