arXiv cs.AI · 28 May 2026

EgoBench: An Interactive Egocentric Multimodal Benchmark for Tool-Using Agents

In three lines: EgoBench is an interactive multimodal benchmark for tool-using agents, with 1,045 egocentric-video tasks across four daily scenarios. Eight state-of-the-art video MLLMs reach only 30.62% accuracy at best and 19.43% on average, exposing bottlenecks in visual perception and multi-hop reasoning.

AI Agents Vision Benchmarks Multi-agent

Summary generated by Claude — human-verified
