Reddit r/LocalLLaMA

Latest LM Studio update killed MTP performance

Signal: 45
Hype: 25
In three lines

A user reports that updating LM Studio from 0.4.14 to 0.4.17 degraded MTP (Multi-Token Prediction) performance on an RTX 5090. With MTP enabled, throughput fell from ~100 tokens/s back to ~70 tokens/s after the update and a CUDA runtime refresh.
Tools · Infrastructure

Summary generated by Claude — human-verified