meituan-longcat/LongCat-Video-Avatar-1.5 · Hugging Face
Signal: 75
Hype: 35
In three lines
Meituan releases LongCat-Video-Avatar 1.5, an open-source framework for audio-driven human avatar video generation. It upgrades the audio encoder from Wav2Vec2 to Whisper-Large and supports Audio-Text-to-Video and Video Continuation with 8-step inference. Human evaluation covers 508 image-audio pairs across 6 scenarios and 2 languages.
Summary generated by Claude — human-verified