arXiv cs.CL

Discovering Lexical Gaps Using Embeddings from Multilingual LLMs

Signal: 72
Hype: 15
In three lines
An automated framework detects lexical gaps (words that are absent in certain languages) using embeddings from multilingual LLMs. Across 4,000 embedding spaces evaluated on Korean-English translation pairs, gap words show weaker cross-lingual semantic alignment than non-gap words. Logistic classifiers reach AUCs between 0.76 and 0.81 and retrieve 18 of 19 and 26 of 27 gap words.
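The general idea of scoring translation pairs by alignment and classifying gap words can be illustrated with a minimal sketch. It is not the paper's pipeline: it assumes cosine similarity as the alignment feature, uses synthetic vectors in place of real multilingual LLM embeddings, and fits a one-feature logistic classifier by plain gradient descent. All function names (`cosine`, `fit_logistic`, `auc`, `synthetic_pair`) and all numeric settings are hypothetical choices made for illustration.

```python
import math
import random
import statistics

def cosine(u, v):
    """Cosine similarity, used here as an assumed alignment measure."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """One-feature logistic regression fitted by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        errs = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errs, xs)) / n
        b -= lr * sum(errs) / n
    return w, b

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def synthetic_pair(rng, dim, noise):
    """Hypothetical stand-in for a (source word, translation) embedding pair."""
    src = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    tgt = [a + rng.gauss(0.0, noise) for a in src]
    return src, tgt

# Synthetic data (an assumption, not the paper's data):
# 80 well-aligned pairs (label 0) and 20 weakly aligned "gap" pairs (label 1).
rng = random.Random(0)
pairs = [synthetic_pair(rng, 32, 0.3) for _ in range(80)]
pairs += [synthetic_pair(rng, 32, 1.5) for _ in range(20)]
labels = [0] * 80 + [1] * 20

# Alignment feature: cosine similarity, standardized before fitting.
sims = [cosine(s, t) for s, t in pairs]
mu, sd = statistics.fmean(sims), statistics.pstdev(sims)
feats = [(x - mu) / sd for x in sims]

w, b = fit_logistic(feats, labels)
scores = [sigmoid(w * x + b) for x in feats]
auc_value = auc(scores, labels)

# Retrieval view: how many labeled gap pairs land in the top-k scores.
k = 20
top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
recall_at_k = sum(labels[i] for i in top_k) / sum(labels)

print(f"weight={w:.3f} auc={auc_value:.3f} recall@{k}={recall_at_k:.2f}")
```

In this toy setup the learned weight on similarity comes out negative, meaning lower alignment raises the predicted gap probability. The synthetic classes are far better separated than real data would be, so the printed AUC says nothing about the results reported in the paper.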
Embeddings, Benchmarks, Papers

Summary generated by Claude — human-verified