EgoBench: An Interactive Egocentric Multimodal Benchmark for Tool-Using Agents
Signal: 78
Hype: 25
In three lines
EgoBench is an interactive multimodal benchmark for tool-using agents, with 1,045 egocentric-video tasks across four daily scenarios. Of eight state-of-the-art video MLLMs evaluated, the best reaches only 30.62% accuracy and the average is 19.43%. The results expose bottlenecks in visual perception and multi-hop reasoning.
Summary generated by Claude — human-verified