Visual Agentic Memory: Enabling Online Long Video Understanding via Online Indexing, Hierarchical Memory, and Agentic Retrieval
Signal: 72
Hype: 25
In three lines
Visual Agentic Memory (VAM) is a training-free framework for long video understanding. It combines online selective indexing, hierarchical memory, and agentic retrieval. On OVO-Bench, VAM achieves 68.41 (vs 67.46 for Gemini 3 Flash alone) and 17.11% on MM-Lifelong (105.6h over 51 days).
Summary generated by Claude — human-verified