Reddit r/LocalLLaMA · 31 May 2026

mudler/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled-APEX-MTP-GGUF just released!

In three lines: Mudler has released APEX GGUF quantizations of Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled with a bundled MTP (multi-token prediction) head. The files enable self-speculative decoding in llama.cpp without a separate draft model. They are 2.5% larger than the non-MTP version, and the MTP head is quantized at Q8_0 for high draft accuracy.
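
For readers new to the idea, here is a toy sketch of the draft-and-verify loop behind speculative decoding, assuming greedy decoding. It is not llama.cpp's implementation and does not load the released files: `main_next` and `draft_next` are hypothetical stand-ins for the main model and the bundled MTP head, and tokens are plain integers. The point it illustrates is that the draft only affects speed, because every drafted token is checked against the main model.

```python
from typing import Callable, List

# Hypothetical type: a "model" maps a token context to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(context: List[int], main_next: NextToken,
                     draft_next: NextToken, k: int) -> List[int]:
    """Run one draft-and-verify round and return the newly accepted tokens."""
    # 1. Draft k tokens with the cheap predictor.
    draft: List[int] = []
    ctx = list(context)
    for _ in range(k):
        token = draft_next(ctx)
        draft.append(token)
        ctx.append(token)

    # 2. Verify the draft against the main model. A real runtime would score
    #    all drafted positions in one batched pass; this loop is sequential
    #    only to keep the toy readable.
    accepted: List[int] = []
    ctx = list(context)
    for token in draft:
        target = main_next(ctx)
        if target != token:
            # First mismatch: keep the main model's token, drop the rest.
            accepted.append(target)
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. Every drafted token matched, so the main model adds one more token.
    accepted.append(main_next(ctx))
    return accepted


def generate_speculative(context: List[int], main_next: NextToken,
                         draft_next: NextToken, n: int, k: int = 4) -> List[int]:
    """Generate n tokens using repeated draft-and-verify rounds."""
    out: List[int] = []
    while len(out) < n:
        out.extend(speculative_step(context + out, main_next, draft_next, k))
    return out[:n]


def generate_greedy(context: List[int], main_next: NextToken, n: int) -> List[int]:
    """Baseline: generate n tokens with the main model alone."""
    out: List[int] = []
    for _ in range(n):
        out.append(main_next(context + out))
    return out


# Toy stand-ins: the "main model" counts upward modulo 10; the "draft head"
# agrees with it except right after token 2, where it guesses wrong.
def toy_main(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(generate_speculative([0], toy_main, toy_draft, n=12))
    print(generate_greedy([0], toy_main, n=12))
```

Both calls print the same sequence: a wrong draft costs wasted work but never changes the output. This is also why a more accurate draft head matters. The more drafted tokens the main model accepts per round, the fewer main-model passes are needed per generated token.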

Tags: Qwen, Code generation, Open source, Tools, Infrastructure

Summary generated by Claude — human-verified