Reddit r/LocalLLaMA · 20 May 2026

Build 9254 fixes my TG regression and adds PDL for NVIDIA GPUs

In three lines: Build 9254 of llama.cpp fixes a token-generation (TG) throughput regression the poster had observed and adds PDL (Programmatic Dependent Launch) support for NVIDIA GPUs with CC >= 90 (compute capability 9.0 and newer). PDL lets a kernel begin launching while the preceding kernel on the same CUDA stream is still finishing, which hides launch overhead. Observed gains: +3% on an RTX 5060 Ti and up to +10% on an RTX PRO 6000, depending on the model.
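To put the reported gains in perspective, here is a rough sketch of the arithmetic, assuming the percentages are tokens-per-second gains (the summary does not state the metric explicitly). The function name `time_saved_fraction` is made up for this illustration. It converts a throughput gain into the share of per-token time that was removed; it does not claim that all of that time was launch overhead.

```python
def time_saved_fraction(throughput_gain_pct: float) -> float:
    """Fraction of per-token time removed for a given throughput gain (in percent).

    Assumes throughput = 1 / time_per_token, so a speedup of s
    shrinks the time per token by 1 - 1/s.
    """
    speedup = 1.0 + throughput_gain_pct / 100.0
    return 1.0 - 1.0 / speedup


# Gains quoted in the summary above; the GPU labels are just for display.
for gpu, gain in [("RTX 5060 Ti", 3.0), ("RTX PRO 6000 (best case)", 10.0)]:
    saved = time_saved_fraction(gain) * 100.0
    print(f"{gpu}: +{gain:.0f}% throughput ~ {saved:.2f}% less time per token")
```

Under that assumption, a +3% gain corresponds to roughly 2.9% less time per token and a +10% gain to roughly 9.1%, which matches what you would expect from trimming per-kernel launch gaps. It does not suggest the kernels themselves got faster.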

Infrastructure Open source Benchmarks

Summary generated by Claude — human-verified
