GRZO: Group-Relative Zeroth-Order Optimization for Large Language Model Fine-Tuning
Signal: 78
Hype: 15
In three lines: GRZO is a zeroth-order optimizer for memory-efficient LLM fine-tuning. It draws one perturbation per mini-batch example and aggregates the resulting losses via group-relative normalization, increasing the number of effective gradient directions from one to the batch size at no additional forward cost. On Llama3-8B, GRZO achieves +3.0 accuracy over MeZO with 23% lower peak GPU memory.
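The idea in the summary can be illustrated with a toy sketch. This is not the paper's implementation: the function names (`group_relative_advantages`, `grzo_like_step`), the hyperparameters (`eps`, `lr`, `delta`), and the exact form of the update are assumptions made for illustration. It also assumes every batch slot evaluates the same toy objective, whereas a real mini-batch has a different loss per example. Each slot gets its own Gaussian perturbation, the perturbed losses are standardized within the batch, and the standardized scores weight the perturbations to form a descent direction.

```python
import numpy as np

def group_relative_advantages(losses, delta=1e-8):
    # Standardize losses within the group (batch): zero mean, roughly unit std.
    # delta is an assumed small constant that guards against division by zero.
    losses = np.asarray(losses, dtype=float)
    return (losses - losses.mean()) / (losses.std() + delta)

def grzo_like_step(theta, loss_fn, batch_size, rng, eps=1e-3, lr=0.05):
    # One independent perturbation per batch slot (assumed Gaussian).
    z = rng.standard_normal((batch_size, theta.size))
    # One loss evaluation per slot, each at its own perturbed parameters.
    losses = [loss_fn(theta + eps * z_i) for z_i in z]
    adv = group_relative_advantages(losses)
    # Perturbations that raised the loss get positive weight, so stepping
    # against the weighted average moves toward lower loss.
    direction = (adv[:, None] * z).mean(axis=0)
    return theta - lr * direction

def quadratic(theta):
    # Toy objective standing in for a fine-tuning loss.
    return float(theta @ theta)

rng = np.random.default_rng(0)
theta0 = np.array([3.0, -2.0, 1.0])
theta = theta0.copy()
for _ in range(50):
    theta = grzo_like_step(theta, quadratic, batch_size=32, rng=rng)
```

Because the scores are standardized, the step length in this sketch depends on `lr` and the batch size rather than on the raw loss scale. Whether GRZO shares that property is not stated in the summary.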
Summary generated by Claude — human-verified