Latest b9274 Addresses MTP VRAM leak
Signal
72
Hype
15
In three linesCommit b9274 fixes a VRAM leak in MTP (Multi-Token Prediction) models. The destroy() function failed to free speculative decoder, draft context, and draft model resources, causing memory accumulation on each sleep/resume cycle. Fix explicitly resets these components before llama_init.Read source
Your take?
Summary generated by Claude — human-verified