MAVEN: A Multi-Agent Framework for Multicultural Text-to-Video Generation
Signal: 72
Hype: 28
In three lines
MAVEN is a multi-agent prompt refinement framework that improves cultural fidelity in text-to-video generation. It decomposes each prompt into person, action, and location dimensions, each handled by a specialized agent. The accompanying benchmark comprises 243 culturally grounded prompts and 972 videos covering Chinese, American, and Romanian cultures, evaluated with CLIP and VLM-as-judge scoring.
Summary generated by Claude — human-verified