arXiv cs.AI

From "Weak" Signals to Strong Models: Preference Delta Aggregation with LoRA Merging

Signal: 78
Hype: 25
In three lines: Preference Delta Aggregation (PDA) aggregates weak preference signals from model pairs (e.g., Qwen3 4B vs. 1.7B) via LoRA merging. Geometric Alignment Merging (GAM) aligns adapter subspaces before aggregation. On knowledge reasoning and agentic search benchmarks, PDA+GAM improves Qwen3 8B by +6.8 and +7.3 points, respectively.
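To give a feel for why adapter subspaces might need aligning before LoRA adapters are merged, here is a minimal sketch in plain Python. It is not the paper's PDA or GAM procedure, whose details this summary does not give. It only contrasts two naive merges of hypothetical rank-1 adapters stored as (B, A) pairs: averaging the full deltas B·A versus averaging the B and A factors separately. All function names and toy values are assumptions made for illustration.

```python
# Illustrative only: naive LoRA merging, not the paper's PDA or GAM method.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def mean_matrices(mats):
    """Element-wise mean of equally shaped matrices."""
    n = len(mats)
    return [[sum(vals) / n for vals in zip(*rows)] for rows in zip(*mats)]

def merge_deltas(adapters):
    """Average the full weight deltas B @ A of several adapters."""
    return mean_matrices([matmul(b, a) for b, a in adapters])

def merge_factors(adapters):
    """Average B and A separately, then multiply the averaged factors."""
    b_mean = mean_matrices([b for b, _ in adapters])
    a_mean = mean_matrices([a for _, a in adapters])
    return matmul(b_mean, a_mean)

# Two hypothetical rank-1 adapters for a 2x2 weight, each stored as (B, A).
adapters = [
    ([[1.0], [0.0]], [[1.0, 0.0]]),  # delta touches only the top-left entry
    ([[0.0], [1.0]], [[0.0, 1.0]]),  # delta touches only the bottom-right entry
]

delta_avg = merge_deltas(adapters)    # mean of the two full deltas
factor_avg = merge_factors(adapters)  # product of the averaged factors

print(delta_avg)
print(factor_avg)
```

In this toy case the two adapters span different subspaces, so averaging the factors introduces off-diagonal terms that neither adapter contains, while averaging the deltas does not. That mismatch is one plausible motivation for an alignment step before aggregation, though the summary does not say how GAM performs it.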
Qwen, Fine-tuning, Reinforcement learning, Papers, Benchmarks

Summary generated by Claude — human-verified