Improving BM25 Code Retrieval Under Fixed Generic Tokenization: Adaptive q-Log Odds as a Drop-In BM25 Fix
Signal: 72
Hype: 15
In three lines: A BM25 improvement for code retrieval that applies a q-logarithmic transformation to the RSJ-odds IDF. On CoIR CodeSearchNet Go, NDCG@10 rises from 0.2575 to 0.4874 (+89.3%). It is a drop-in fix with no latency cost, parameterized by corpus hapax density.
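The summary does not spell out the formula, so the following is only a minimal sketch of what a q-logarithmic RSJ-odds IDF could look like. It assumes the standard Tsallis q-logarithm, ln_q(x) = (x^(1-q) - 1) / (1 - q), applied to the usual RSJ odds (N - n + 0.5) / (n + 0.5). The function names are hypothetical, and because the mapping from hapax density to q is not given here, q is left as a plain parameter.

```python
import math


def q_log(x: float, q: float) -> float:
    """Tsallis q-logarithm; falls back to the natural log when q is 1."""
    if x <= 0:
        raise ValueError("q_log is only defined for x > 0")
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def rsj_odds(n_docs: int, doc_freq: int) -> float:
    """Robertson-Sparck Jones odds with the usual 0.5 smoothing."""
    return (n_docs - doc_freq + 0.5) / (doc_freq + 0.5)


def q_idf(n_docs: int, doc_freq: int, q: float) -> float:
    """Assumed form: q-logarithm of the RSJ odds (not confirmed by the source)."""
    return q_log(rsj_odds(n_docs, doc_freq), q)


if __name__ == "__main__":
    # Compare the classic log-odds IDF (q = 1) with a q < 1 variant
    # for a rare term and a common term in a 1000-document corpus.
    for df in (1, 500):
        print(df, q_idf(1000, df, 1.0), q_idf(1000, df, 0.5))
```

With q = 1 this reduces to the classic log-odds IDF, which is what makes such a change drop-in: only the per-term weight computed at index time differs, so query-time cost is unchanged.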
Summary generated by Claude — human-verified