arXiv cs.CL

BacktestBench: Benchmarking Large Language Models for Automated Quantitative Strategy Backtesting

Signal: 78
Hype: 25
In three lines: BacktestBench is presented as the first large-scale benchmark for automated quantitative backtesting, containing 18,246 annotated QA pairs across 6 million real market records. AutoBacktest, a multi-agent system, translates natural-language strategies into reproducible backtests using three components: a Summarizer, a SQL Retriever, and a Python Coder. The paper evaluates 23 mainstream LLMs on the benchmark.
Benchmarks · Multi-agent · Code generation · AI Agents · Papers

Summary generated by Claude — human-verified