Firefly: Illuminating Large-Scale Verified Tool-Call Data Generation from Real APIs
Signal: 82
Hype: 18
In three lines: Firefly generates verified tool-call data for training agents from real MCP servers. The pipeline inverts standard synthesis: it explores real APIs via DAG structures, then generates tasks backward from observed outcomes. This yields 5,144 verified tasks across 240 servers and 993 tools. A 4B model trained with GRPO matches Claude Sonnet on a held-out test set.
Summary generated by Claude — human-verified