arXiv cs.CL

Model-Based Quality Assessment for Massively Multilingual Parallel Data

Signal: 78
Hype: 15
In three lines: A study of automatic quality assessment for massively multilingual bitext, decomposed into two tasks: parallelism evaluation with multilingual embeddings and reference-free quality estimation. It benchmarks 4 embedding models and 9 evaluators on FLORES-200, covering 6,654 language-pair directions. Key finding: no single model is universally reliable, so direction-aware routing and calibration are required.
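As a rough illustration of the two ideas in the summary, the sketch below scores parallelism as cosine similarity between sentence embeddings and routes each language-pair direction to whichever model scored best on held-out data. It is a minimal sketch under stated assumptions, not the paper's method: the function names, the model labels `model_a` and `model_b`, and all numbers are hypothetical, and calibration is left out.

```python
import math
from typing import Dict, Sequence, Tuple

# (source language code, target language code); direction matters.
Direction = Tuple[str, str]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two sentence embeddings (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_routing_table(
    validation_scores: Dict[Direction, Dict[str, float]]
) -> Dict[Direction, str]:
    """For each direction, pick the model with the highest validation score."""
    return {
        direction: max(scores, key=scores.get)
        for direction, scores in validation_scores.items()
    }


def route(direction: Direction, table: Dict[Direction, str], default_model: str) -> str:
    """Return the model chosen for this direction, or the default if unseen."""
    return table.get(direction, default_model)


# Made-up validation scores, for illustration only (not results from the paper).
example_scores = {
    ("eng", "fra"): {"model_a": 0.90, "model_b": 0.80},
    ("fra", "eng"): {"model_a": 0.70, "model_b": 0.85},
}
example_table = build_routing_table(example_scores)
```

Keying the table on the ordered pair rather than the unordered pair is what makes the routing direction-aware: here the hypothetical `("eng", "fra")` and `("fra", "eng")` entries resolve to different models.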
Benchmarks, Embeddings, Evals

Summary generated by Claude — human-verified