Reddit r/LocalLLaMA · 19 May 2026

Number-aware embeddings


In three lines

A researcher built number-aware embeddings by modifying a masked language model (ModernBERT). After 6 hours of training on an H100, the model reaches 59% accuracy on triplet sorting, versus 38% for ModernBERT and 34% for BGE-base-v1.5. The technique represents numbers by log-magnitude in 128 bins and uses a classification-regression head.
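To make the log-magnitude idea concrete, here is a minimal sketch of how a number could be mapped to one of 128 bins (a classification target) plus a within-bin offset (a regression target). The summary only gives the bin count and the head type; the log base, the exponent range, the handling of zero and sign, and the names `encode` and `decode` are all assumptions for illustration, not the researcher's actual implementation.

```python
import math

NUM_BINS = 128                 # bin count taken from the summary
MIN_EXP, MAX_EXP = -8.0, 8.0   # assumed log10 range; not stated in the source


def encode(x: float) -> tuple[int, float]:
    """Map |x| to (bin index, offset within the bin, in [0, 1])."""
    if x == 0:
        return 0, 0.0          # assumption: zero falls in the lowest bin
    log_mag = math.log10(abs(x))                  # sign is ignored in this sketch
    clipped = min(max(log_mag, MIN_EXP), MAX_EXP)
    # Position on a continuous 0..NUM_BINS scale.
    pos = (clipped - MIN_EXP) / (MAX_EXP - MIN_EXP) * NUM_BINS
    bin_idx = min(int(pos), NUM_BINS - 1)         # classification target
    return bin_idx, pos - bin_idx                 # offset is the regression target


def decode(bin_idx: int, offset: float) -> float:
    """Invert encode: recover the magnitude from bin index and offset."""
    pos = bin_idx + offset
    return 10.0 ** (MIN_EXP + pos / NUM_BINS * (MAX_EXP - MIN_EXP))


if __name__ == "__main__":
    for value in (0.5, 5, 50, 500):
        print(value, encode(value))
```

Under these assumptions, larger magnitudes always land in the same or a higher bin, which is the kind of ordering a triplet-sorting task would reward; the offset lets a regression output refine the coarse bin prediction.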


Embeddings Fine-tuning Open source

Summary generated by Claude — human-verified
