CHI-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows?
Signal: 82
Hype: 15
In three lines
CHI-Bench evaluates whether AI agents can automate complex healthcare workflows. The benchmark spans three domains (prior authorization, utilization management, care management) with 87 MCP tools and 1,290+ policy documents. Best result: 28% task resolution, and 3.8% in a single session.
Summary generated by Claude — human-verified