qwen35: use post-norm hidden state for MTP by am17an · Pull Request #24025 · ggml-org/llama.cpp
Signal: 65
Hype: 15
In three lines
Pull request on llama.cpp that optimizes MTP (Multi-Token Prediction) for Qwen 3.5 by using the post-norm hidden state. The change is described as a performance improvement for multi-token prediction.
Summary generated by Claude — human-verified