arXiv cs.AI

GTA: Generating Long-Horizon Tasks for Web Agents at Scale

Signal: 78
Hype: 22
In three lines
GTA is a framework for automatically generating complex web tasks with executable trajectories. It combines crawling, retrieval, in-context generation, and quality control across 50+ websites (e-commerce, government, forums, news). The benchmark reveals a significant performance gap between humans and AI agents.
AI Agents, Benchmarks, Papers, Code generation

Summary generated by Claude — human-verified