arXiv cs.CL

HEED: Density-Weighted Residual Alignment for Hybrid Vision-Language Model Distillation

Signal: 78
Hype: 15
In three lines
HEED introduces density-weighted residual alignment for distilling vision-language models (e.g., Qwen3-VL-8B) into hybrid Mamba-2/attention architectures. The method targets high-density patches (text, fine details), which experience 3.6× larger residual drift. Reported results: +8.7 points on OCRBench v2, +5.13 points on average, 4.12× throughput, and 68% memory savings.
Vision · Fine-tuning · Benchmarks · Code generation

Summary generated by Claude — human-verified