arXiv cs.AI

Capturing LLM Capabilities via Evidence-Calibrated Query Clustering

Signal: 72
Hype: 18
In three lines: ECC, a query clustering algorithm, calibrates semantic embeddings through model comparisons to align surface semantics with latent LLM capabilities. Using a Bradley-Terry model, it improves capability ranking by 17.64 points over human-labeled baselines and 18.02 points over embedding-based baselines, with applications to query routing.
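
The summary does not say how ECC applies the Bradley-Terry model, so the following is only a generic sketch of how such a model turns pairwise comparison outcomes into a ranking. It uses the standard iterative minorization-maximization update. The function name `fit_bradley_terry`, the model names, and the win/loss data are all hypothetical and not taken from the paper.

```python
from collections import defaultdict


def fit_bradley_terry(comparisons, n_iter=200, tol=1e-9):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Returns a dict mapping each model to a strength; strengths sum to 1.
    Under the model, P(i beats j) = s_i / (s_i + s_j).
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # comparisons per unordered pair
    models = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))
    models = sorted(models)

    # Start from uniform strengths.
    strength = {m: 1.0 / len(models) for m in models}
    for _ in range(n_iter):
        updated = {}
        for i in models:
            denom = 0.0
            for j in models:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            # Keep the old value for a model that was never compared.
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        updated = {m: v / total for m, v in updated.items()}
        delta = max(abs(updated[m] - strength[m]) for m in models)
        strength = updated
        if delta < tol:
            break
    return strength


# Hypothetical outcomes on one cluster of queries: (winner, loser).
comparisons = (
    [("model_a", "model_b")] * 3 + [("model_b", "model_a")] * 1
    + [("model_b", "model_c")] * 3 + [("model_c", "model_b")] * 1
    + [("model_a", "model_c")] * 3 + [("model_c", "model_a")] * 1
)
strengths = fit_bradley_terry(comparisons)
ranking = sorted(strengths, key=strengths.get, reverse=True)
print(ranking)
```

Fitting one such model per query cluster would give a per-cluster ranking that a router could consult, though whether ECC does this is an assumption here. A model with zero wins is driven toward zero strength by this update, so real use would likely need some regularization.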
Evals, Benchmarks, Reasoning

Summary generated by Claude — human-verified