Reddit r/LocalLLaMA

NVFP4 + MTP - voilà on llama.cpp

In three lines: NVFP4 and MTP are now available together in llama.cpp (release b9297). Combining NVFP4, NVIDIA's 4-bit floating-point quantization format, with MTP (multi-token prediction) is reported to improve inference performance on NVIDIA GPUs.
Open source · Infrastructure · Code generation

Summary generated by Claude — human-verified