arXiv cs.CL

Hubness, Not Anisotropy, Drives Cross-Lingual Retrieval Asymmetry in Multilingual Embedding Models

Signal: 82
Hype: 15
In three lines
Study of cross-lingual retrieval asymmetry in 5 multilingual embedding models (Gemini, Mistral, OpenAI, Qwen).
Analysis covers 6,518 idiomatic expressions in English, Bengali, Hindi, and Arabic.
Finding: hubness (a few vectors turning up as nearest neighbors for disproportionately many queries) is the dominant causal driver (49.5% dominance share), far exceeding anisotropy; CSLS correction closes 63.5% of the reciprocity gap.
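For readers unfamiliar with the CSLS correction named in the summary, here is a minimal sketch of the standard formulation, CSLS(x, y) = 2·cos(x, y) − r_T(x) − r_S(y), where each r term is the mean similarity of a point to its k nearest neighbors on the other side. The function names, the value of k, and the toy similarity matrix are all assumptions made for illustration; none of them come from the paper, whose exact setup may differ.

```python
# Illustrative sketch only: helper names, k, and the toy matrix are
# hypothetical and are not taken from the paper.

def mean_top_k(values, k):
    """Mean of the k largest values in a list."""
    top = sorted(values, reverse=True)[:k]
    return sum(top) / len(top)

def csls_matrix(sim, k=2):
    """Rescore a source-by-target cosine similarity matrix with CSLS.

    csls[i][j] = 2 * sim[i][j] - r_src[i] - r_tgt[j]
    r_src[i]: mean similarity of source i to its k nearest targets.
    r_tgt[j]: mean similarity of target j to its k nearest sources.
    """
    n_src, n_tgt = len(sim), len(sim[0])
    r_src = [mean_top_k(row, k) for row in sim]
    r_tgt = [mean_top_k([sim[i][j] for i in range(n_src)], k)
             for j in range(n_tgt)]
    return [[2 * sim[i][j] - r_src[i] - r_tgt[j] for j in range(n_tgt)]
            for i in range(n_src)]

def k_occurrence(scores, k=1):
    """Count how often each target lands in some source's top-k list.

    A heavily skewed count vector is the usual symptom of hubness.
    """
    counts = [0] * len(scores[0])
    for row in scores:
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        for j in ranked[:k]:
            counts[j] += 1
    return counts

# Toy cosine similarities (made up): source i should retrieve target i,
# but target 0 is a hub that sits close to every source.
sim = [
    [0.9, 0.5, 0.4],
    [0.8, 0.7, 0.3],
    [0.8, 0.3, 0.7],
]

print("top-1 counts, raw cosine:", k_occurrence(sim, k=1))
print("top-1 counts, CSLS:      ", k_occurrence(csls_matrix(sim, k=2), k=1))
```

On this toy matrix, raw cosine sends all three sources to target 0, while the CSLS scores give each source its own target. The penalty terms subtract most from points that are close to everything, which is why the correction reduces hub-driven asymmetry between retrieval directions.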
Embeddings, Benchmarks, Multi-agent, Papers

Summary generated by Claude — human-verified