Escape the Language Prior: Mitigating Late-Stage Modality Collapse in Audio Reasoning via Modality-Aware Policy Optimization
Signal: 78
Hype: 25
In three lines: Modality-Aware Policy Optimization (MAPO) addresses late-stage modality collapse in audio-text models during RL fine-tuning. The method concentrates policy gradients on modality-critical tokens via a modality relevance mask and adds an attention penalty to sustain cross-modal grounding. MAPO achieves state-of-the-art (SOTA) results on several complex audio reasoning benchmarks.
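The summary does not give MAPO's actual equations, so the following is only an illustrative sketch of the two ideas it names: restricting the policy-gradient signal to masked tokens, and penalizing low attention on the audio input. The function name `masked_policy_loss`, the REINFORCE-style surrogate, the hinge-shaped penalty, and the parameters `attn_floor` and `penalty_weight` are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Sequence


def masked_policy_loss(
    logprobs: Sequence[float],         # log-probability of each sampled token
    advantages: Sequence[float],       # per-token advantage estimates
    relevance_mask: Sequence[int],     # 1 = modality-critical token, 0 = other
    audio_attention: Sequence[float],  # attention mass each token puts on audio
    attn_floor: float = 0.3,           # hypothetical minimum mean audio attention
    penalty_weight: float = 0.1,       # hypothetical weight of the penalty term
) -> float:
    """Hypothetical sketch: masked policy-gradient loss plus an attention penalty."""
    n = len(logprobs)
    if n == 0 or not (n == len(advantages) == len(relevance_mask) == len(audio_attention)):
        raise ValueError("all inputs must be non-empty and of equal length")

    # REINFORCE-style surrogate, -(advantage * logprob), summed over
    # masked tokens only and averaged over the number of masked tokens.
    n_masked = sum(relevance_mask)
    pg_loss = 0.0
    for lp, adv, m in zip(logprobs, advantages, relevance_mask):
        pg_loss += -adv * lp * m
    pg_loss /= max(n_masked, 1)

    # Hinge penalty: nonzero only when the mean attention on audio
    # falls below the assumed floor.
    mean_attn = sum(audio_attention) / n
    penalty = max(0.0, attn_floor - mean_attn)

    return pg_loss + penalty_weight * penalty
```

With this shape, tokens whose mask entry is 0 contribute nothing to the gradient term, and the penalty vanishes once the mean audio attention reaches the floor. How the real method builds the mask or measures attention is not stated in the summary.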
Summary generated by Claude — human-verified