JobBench: Aligning Agent Work With Human Will
Signal: 78
Hype: 25
In three lines
JobBench evaluates 36 AI models (including Claude Opus at 45.9%) on 130 real professional tasks across 35 occupations. Unlike existing benchmarks focused on economic value, JobBench prioritizes workflows that experts identify as high-priority for delegation, favoring human augmentation over replacement.
Summary generated by Claude — human-verified