arXiv cs.AI

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

VideoDR is the first benchmark for open-domain video question answering that combines cross-frame visual extraction, iterative web retrieval, and multi-hop reasoning. An evaluation of closed- and open-source multimodal models shows that the Agentic paradigm is not consistently superior to the Workflow paradigm; the key challenges are goal drift and long-horizon consistency.

AI Agents · Vision · Reasoning
SIG 72 · HYP 28
arXiv cs.AI

Missing Old Logits in Asynchronous Agentic RL: Semantic Mismatch and Repair Methods for Off-Policy Correction

Asynchronous RL pipelines for LLM agents lose the old logits required for PPO off-policy correction, which entangles discrepancy repair with staleness correction. The paper proposes three strategies for acquiring those logits (snapshot, dedicated model, interruption) and a revised PPO-EWMA method that preserves the decoupled correction semantics.

AI Agents · Reinforcement learning · Reasoning
SIG 72 · HYP 15
arXiv cs.AI

SIPO: Stabilized and Improved Preference Optimization for Aligning Diffusion Models

SIPO stabilizes diffusion model alignment to human preferences by addressing training instability and off-policy bias. The method introduces DPO-C&M to clip uninformative timesteps and applies timestep-aware importance reweighting. Experiments on SD1.5, SDXL, CogVideoX-2B/5B, and Wan2.1-1.3B demonstrate improvements over Diffusion-DPO.

Image generation · Video generation · Reinforcement learning
SIG 72 · HYP 18
arXiv cs.AI

Mitigating Extrinsic Gender Bias for Bangla Classification Tasks

A study of extrinsic gender bias in Bangla pretrained language models. The authors construct four manually annotated, task-specific datasets (sentiment analysis, toxicity detection, hate speech detection, sarcasm detection) with minimal-pair gender perturbations. They propose RandSymKL, a debiasing strategy that combines symmetric KL divergence with cross-entropy loss and reduces bias while maintaining competitive accuracy.

Benchmarks · AI safety · Alignment
SIG 72 · HYP 15
arXiv cs.CL

Taming "Zombie" Agents: A Markov State-Aware Framework for Resilient Multi-Agent Evolution

AgentRevive introduces a Markov state-aware framework for resilient evolution of multi-agent LLM systems. Instead of aggressively pruning failing agents, the method uses soft state transitions (Active/Standby/Terminated) guided by a hallucination risk estimator. It outperforms baselines on general reasoning, domain-specific tasks, and hallucination challenges while reducing token consumption.

Multi-agent · AI Agents · Reasoning
SIG 72 · HYP 25
arXiv cs.AI

Agents for Experiments, Experiments for Agents: A Design Grammar for AI-Enabled Experimental Science

SEED is a framework that represents experimental conditions as typed actor-flow graphs in order to study multi-agent systems and human-AI workflows. It supports describing conditions, evaluating structural novelty, and generating candidate designs under constraints. An empirical test on a medical-triage task shows that SEED-guided designs state interaction changes, assumptions, and governance checks more clearly.

AI Agents · Multi-agent · Evals
SIG 72 · HYP 18
arXiv cs.AI

Domain Incremental Learning for Pandemic-Resilient Chest X-Ray Analysis

The paper presents a replay-based continual learning method for adapting pneumonia detection models across clinical domain variations without catastrophic forgetting. The method incorporates class-aware balanced replay and a dynamically reweighted class-imbalance loss. It achieves 88.66% accuracy on PneumoniaMNIST with 5 simulated domains, outperforming the Experience Replay and Fine-Tuning baselines.

Reinforcement learning · Vision · Benchmarks
SIG 72 · HYP 15