arXiv cs.CL

From Documents to Segments: A Contextual Reformulation for Topic Assignment

Signal: 72
Hype: 18
In three lines
A novel topic modeling approach (SBTA) assigns topics to text segments rather than to entire documents, reducing topic contamination. The authors construct SemEval-STM, a dataset annotated with an LLM and then refined by humans, and report improved clustering quality and interpretability across multiple models.
Papers, Benchmarks, RAG

Summary generated by Claude — human-verified