DataClawBench: An Agent Benchmark for Exploratory Real-World Financial Data Analysis
Signal: 78
Hype: 15
In three lines: DataClawBench is a benchmark for agents that perform exploratory analysis of real-world financial data, containing 2.06 million raw records and 492 cross-domain tasks. An evaluation of 8 advanced LLMs shows that increased exploration does not reliably translate into task progress or correct answers.
Summary generated by Claude — human-verified