
arXiv cs.AI

HINT-SD: Targeted Hindsight Self-Distillation for Long-Horizon Agents

HINT-SD proposes targeted self-distillation for training long-horizon LLM agents. The method uses full-trajectory hindsight to identify failure-relevant actions and applies feedback-conditioned distillation only on targeted action spans. On BFCL v3 and AppWorld, it improves over dense per-turn feedback baselines by up to 18.80% while achieving 2.26× lower time per training step.

AI Agents · Reinforcement learning · Reasoning
SIG 75 · HYP 15

arXiv cs.AI

StreamPro: From Reactive Perception to Proactive Decision-Making in Streaming Video

StreamPro introduces StreamPro-Bench, a benchmark that evaluates proactive understanding of streaming video along three dimensions: perception, temporal reasoning, and proactive agency. The framework proposes CB-Stream Loss to address supervision imbalance and applies GRPO with multi-grained rewards. Reported results: 41.5 on StreamPro-Bench (versus 10.4 previously) and 78.9 on StreamingBench-RTVU.

Vision · Reasoning · Reinforcement learning
SIG 75 · HYP 25

arXiv cs.CL

Self-Distilled Trajectory-Aware Boltzmann Modeling: Bridging the Training-Inference Discrepancy in Diffusion Language Models

TABOM, a post-training method for Diffusion Language Models, aligns optimization with the multi-step easy-to-hard decoding trajectory observed at inference. Via Boltzmann modeling of unmasking preferences, it derives a tractable pairwise ranking objective that reduces training-inference discrepancy and improves performance on new domains.

Fine-tuning · Reasoning · Papers
SIG 75 · HYP 15

arXiv cs.AI

Detecting Verbatim LLM Copy-Paste in Homework

SteganoPrompt, an open-source web tool, detects verbatim copies of assignment prompts submitted to LLMs. It encodes an invisible instruction in the prompt via the Unicode Tags block (U+E0000–U+E007F), creating a detectable signature in the model's response. Tested across 7 LLM families, the approach bypasses limitations of post-hoc detectors and requires no cooperation from model providers.
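
The encoding step can be illustrated with a minimal sketch. It relies only on the fact that tag characters U+E0020 through U+E007E mirror printable ASCII at a fixed offset of 0xE0000. The function names, the sample hidden instruction, and the marker-word check are hypothetical illustrations and are not taken from SteganoPrompt itself, whose actual encoding and detection logic may differ.

```python
# Sketch only: names and the marker-word check are assumptions,
# not SteganoPrompt's actual implementation.
TAG_BASE = 0xE0000  # start of the Unicode Tags block (U+E0000 to U+E007F)


def to_tags(text: str) -> str:
    """Map printable ASCII to the matching (invisible) tag characters."""
    out = []
    for ch in text:
        cp = ord(ch)
        if not 0x20 <= cp <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
        out.append(chr(TAG_BASE + cp))
    return "".join(out)


def from_tags(text: str) -> str:
    """Recover any ASCII payload carried by tag characters in the text."""
    return "".join(
        chr(ord(ch) - TAG_BASE)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


def watermark_prompt(prompt: str, hidden_instruction: str) -> str:
    """Append an invisible instruction to a visible assignment prompt."""
    return prompt + to_tags(hidden_instruction)


def response_has_signature(response: str, marker: str) -> bool:
    """Hypothetical check: did the response include the requested marker?"""
    return marker.lower() in response.lower()
```

A student who copies the watermarked prompt verbatim also copies the hidden characters, so a model that follows the hidden instruction leaves a marker that `response_has_signature` can look for. Calling `from_tags` on a pasted prompt shows whether a payload is present.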

Evals · AI safety · Prompt engineering
SIG 75 · HYP 15

arXiv cs.AI

Beyond Accuracy: Robustness, Interpretability and Expressiveness of EEG Foundation Models

A comparative study evaluates 6 EEG foundation models across 8 datasets on criteria beyond clean accuracy. It covers robustness (noise, channel dropout), interpretability via Attention-Aware Layer-Wise Relevance Propagation, and expressiveness via block-wise probing. Findings: no single model dominates across all failure modes, and models focus on task-appropriate brain regions but decode corrupted content poorly.

Benchmarks · Evals · AI safety
SIG 75 · HYP 15