arXiv cs.CL

PaliBench: A Multi-Reference Blueprint for Classical Language Translation Benchmarks

Signal: 72
Hype: 15
In three lines: PaliBench is a benchmark for Pali-to-English translation containing 1,700 passages (345,000 tokens) aligned with three independent reference translations. The method combines LLM-assisted alignment, automated verification, and multi-metric evaluation. An evaluation of ten contemporary LLMs shows strong cross-metric concordance but substantial variation in reliability.
Benchmarks, Evals, Papers

Summary generated by Claude — human-verified