ToolGate: Token-Efficient Pre-Call Control for Tool-Augmented Vision-Language Agents
ToolGate is a lightweight controller that decides, before each call, whether to execute or skip a tool call proposed by a vision-language agent. Across five benchmarks with Qwen3-VL, it reduces token cost to 64-69% of the ReAct baseline while preserving accuracy; with matched-domain training, it also improves accuracy by 1.65 points.
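
To make the execute-or-skip decision concrete, the following is a minimal sketch of a pre-call gate, not the paper's implementation. The names `ToolCall`, `gated_step`, `gate_score`, `run_tool`, and the `threshold` value are all hypothetical, and the scoring function is assumed to return a number in [0, 1] estimating whether the call is worth its cost.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Tuple


@dataclass
class ToolCall:
    """A tool call proposed by the agent (hypothetical structure)."""
    name: str
    args: dict = field(default_factory=dict)


def gated_step(
    call: ToolCall,
    gate_score: Callable[[ToolCall], float],
    run_tool: Callable[[ToolCall], Any],
    threshold: float = 0.5,  # assumed cutoff, chosen only for illustration
) -> Tuple[bool, Optional[Any]]:
    """Execute the proposed call only if the gate's score reaches the threshold.

    Returns (executed, observation). When the call is skipped, the tool is
    never invoked, so its output adds no tokens to the agent's context.
    """
    if gate_score(call) >= threshold:
        return True, run_tool(call)
    return False, None
```

A skipped call returns `(False, None)`, and the agent would continue from its existing context. How the score is produced and how the threshold is set are left open here.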