arXiv cs.LG · 3 June 2026

GRZO: Group-Relative Zeroth-Order Optimization for Large Language Model Fine-Tuning

In three lines: GRZO is a zeroth-order optimizer for memory-efficient LLM fine-tuning. It draws one perturbation per mini-batch example and aggregates the resulting losses via group-relative normalization, raising the number of effective gradient directions per step from one to the batch size at no additional forward-pass cost. On Llama3-8B, GRZO improves accuracy by 3.0 over MeZO while using 23% less peak GPU memory.
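The idea in the summary can be illustrated with a toy NumPy sketch. It is one plausible reading of "one perturbation per example plus group-relative normalization", not the paper's actual update rule. The function names, the standardization formula (loss minus group mean, divided by group standard deviation), and the defaults `eps`, `lr`, and `tol` are all assumptions made for illustration.

```python
import numpy as np

def group_relative_zo_step(theta, xs, ys, loss_fn, eps=1e-2, lr=1e-2,
                           rng=None, tol=1e-8):
    """One hypothetical group-relative zeroth-order step (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    batch = len(xs)
    # One random direction per example, so the batch probes `batch` directions.
    zs = rng.standard_normal((batch, theta.size))
    # One loss evaluation per example, each at its own perturbed parameters.
    losses = np.array([loss_fn(theta + eps * z, x, y)
                       for z, x, y in zip(zs, xs, ys)])
    # Assumed normalization: standardize the losses within the group.
    spread = losses.std()
    if spread < tol:
        advantages = np.zeros(batch)  # no relative signal in this group
    else:
        advantages = (losses - losses.mean()) / spread
    # Move away from directions whose loss was above the group average.
    direction = (advantages[:, None] * zs).mean(axis=0)
    return theta - lr * direction, advantages

def squared_error(theta, x, y):
    """Toy per-example loss for a linear model (stand-in for an LLM loss)."""
    return float((x @ theta - y) ** 2)

# Usage on synthetic data: 16 examples, 5 parameters.
rng = np.random.default_rng(0)
xs = rng.standard_normal((16, 5))
ys = xs @ np.ones(5)
theta = np.zeros(5)
theta, advantages = group_relative_zo_step(theta, xs, ys, squared_error, rng=rng)
```

Only loss values are used here, never backpropagated gradients, which is the property that lets zeroth-order methods avoid storing activations. In this toy version the per-example losses also differ because the examples differ, so the normalized signal is noisy. How the paper handles that is not stated in the summary.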

Fine-tuning Papers Benchmarks

Summary generated by Claude — human-verified
