Build 9254 fixes my TG regression and adds PDL for NVIDIA GPUs
In three lines
Build 9254 of llama.cpp fixes a token-generation (TG) throughput regression and adds PDL (Programmatic Dependent Launch) support for NVIDIA GPUs with compute capability 9.0 or newer (CC >= 90).
PDL lets consecutive CUDA kernels on the same stream overlap: a dependent kernel can begin launching before the preceding kernel has finished, which reduces launch overhead.
Observed gains: +3% on an RTX 5060 Ti and up to +10% on an RTX PRO 6000, depending on the model.
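For readers unfamiliar with PDL, here is a minimal sketch of the mechanism using the public CUDA runtime API. It is not llama.cpp's code: the kernel names `producer` and `consumer`, the buffer sizes, and the file name in the build command are hypothetical, and it assumes a toolkit and GPU that support PDL (built with something like `nvcc -arch=sm_90 pdl_sketch.cu`).

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical producer kernel: doubles each element, then signals that
// dependent grids on the same stream may start launching.
__global__ void producer(float* buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] *= 2.0f;
    cudaTriggerProgrammaticLaunchCompletion();
}

// Hypothetical consumer kernel: anything before the synchronize call may
// overlap with the tail of the producer, so it must not read producer output.
__global__ void consumer(const float* buf, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    cudaGridDependencySynchronize();  // producer results are visible after this
    if (i < n) out[i] = buf[i] + 1.0f;
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    if (prop.major < 9) {  // PDL needs compute capability 9.0 or newer
        std::printf("PDL not supported on CC %d.%d\n", prop.major, prop.minor);
        return 0;
    }

    const int n = 1 << 20;
    std::vector<float> host(n, 1.0f);
    float *buf = nullptr, *out = nullptr;
    cudaMalloc(&buf, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemcpy(buf, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    const dim3 block(256), grid((n + 255) / 256);

    // The producer is launched normally.
    producer<<<grid, block, 0, stream>>>(buf, n);

    // The consumer opts in to PDL through a launch attribute.
    cudaLaunchAttribute attr[1];
    attr[0].id = cudaLaunchAttributeProgrammaticStreamSerialization;
    attr[0].val.programmaticStreamSerializationAllowed = 1;

    cudaLaunchConfig_t cfg = {};
    cfg.gridDim = grid;
    cfg.blockDim = block;
    cfg.dynamicSmemBytes = 0;
    cfg.stream = stream;
    cfg.attrs = attr;
    cfg.numAttrs = 1;

    cudaError_t err = cudaLaunchKernelEx(&cfg, consumer,
                                         static_cast<const float*>(buf), out, n);
    if (err != cudaSuccess) {
        std::printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpyAsync(host.data(), out, n * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);
    std::printf("out[0] = %f\n", host[0]);

    cudaFree(buf);
    cudaFree(out);
    cudaStreamDestroy(stream);
    return 0;
}
```

Without the launch attribute, the two kernels would run strictly back to back on the stream. With it, the consumer's launch and any work placed before `cudaGridDependencySynchronize()` can overlap with the end of the producer, which is where the reduction in launch overhead comes from.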
Summary generated by Claude — human-verified