arXiv cs.AI

BacktestBench: Benchmarking Large Language Models for Automated Quantitative Strategy Backtesting

Signal: 78
Hype: 25
In three lines: BacktestBench is the first large-scale benchmark for automated quantitative backtesting, containing 18,246 annotated QA pairs from 6 million real market records. AutoBacktest, a multi-agent system, translates natural-language strategies into reproducible backtests via Summarizer-Retriever-Coder coordination. An evaluation of 23 LLMs identifies key performance factors.
AI Agents, Multi-agent, Code generation, Benchmarks, Papers

Summary generated by Claude — human-verified