Archives

May 2026

3148 articles

arXiv cs.AI·

SIPO: Stabilized and Improved Preference Optimization for Aligning Diffusion Models

SIPO stabilizes diffusion model alignment to human preferences by addressing training instability and off-policy bias. The method introduces DPO-C&M to clip uninformative timesteps and applies timestep-aware importance reweighting. Experiments on SD1.5, SDXL, CogVideoX-2B/5B, and Wan2.1-1.3B demonstrate improvements over Diffusion-DPO.

Image generation · Video generation · Reinforcement learning
SIG 72 · HYP 18
arXiv cs.LG·

When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

Study of adversarial action removal attacks in self-play reinforcement learning, in which an attacker selectively masks legal actions from the victim's action set. Experiments on poker (6 to 5,531 states) and two non-poker domains show that learned masking causes substantially more damage than random masking, persists across Q-learning, PPO, NFSP, and DQN, transfers between agents, and is amplified by self-play.

Reinforcement learning · AI safety · Benchmarks
SIG 75 · HYP 15
arXiv cs.LG·

Reducing Credit Assignment Variance via Counterfactual Reasoning Paths

Researchers introduce IBPO (Implicit Behavior Policy Optimization), a credit assignment method for reinforcement learning with LLMs. By comparing multiple reasoning trajectories, the framework transforms sparse terminal rewards into step-sensitive learning signals, reducing gradient variance and improving stability on mathematical and code reasoning benchmarks.

Reinforcement learning · Reasoning · Code generation
SIG 78 · HYP 25
arXiv cs.LG·

Mirror Descent-Type Algorithms for the Variational Inequality Problem with Functional Constraints

Mirror descent-type algorithms for variational inequality problems with functional constraints. Proposed methods alternate between productive and non-productive steps based on constraint values, with optimal convergence rates for bounded monotone operators and Lipschitz convex constraints. Applicable to GANs, reinforcement learning, and adversarial training.

Reinforcement learning · Papers · Alignment
SIG 72 · HYP 15
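
The alternation between productive and non-productive steps described in the mirror descent entry above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: it uses the Euclidean mirror map (so each update is a plain normalized step), a fixed step size, and a toy operator and constraint. The names `switching_mirror_descent`, `toy_operator`, `toy_constraint`, and `toy_subgradient` are hypothetical.

```python
import math


def switching_mirror_descent(F, g, g_subgrad, x0, eps, step, n_iters):
    """Switching scheme with a Euclidean mirror map (illustrative assumption).

    F is the operator of the variational inequality, g the functional
    constraint (feasible when g(x) <= 0). Returns the average of the
    productive iterates and the number of productive steps.
    """
    x = list(x0)
    productive_sum = [0.0] * len(x)
    productive_count = 0
    for _ in range(n_iters):
        if g(x) <= eps:
            # Productive step: the constraint is (nearly) satisfied,
            # so record the point and move against the operator F.
            productive_sum = [s + xi for s, xi in zip(productive_sum, x)]
            productive_count += 1
            direction = F(x)
        else:
            # Non-productive step: the constraint is violated,
            # so move against a subgradient of g to restore feasibility.
            direction = g_subgrad(x)
        norm = math.sqrt(sum(d * d for d in direction))
        if norm == 0.0:
            break
        x = [xi - step * d / norm for xi, d in zip(x, direction)]
    if productive_count == 0:
        raise ValueError("no productive step was taken")
    avg = [s / productive_count for s in productive_sum]
    return avg, productive_count


# Toy problem (assumption): F(x) = x - c is monotone, g(x) = x0 + x1 - 2.
def toy_operator(x):
    return [x[0] - 2.0, x[1] - 2.0]


def toy_constraint(x):
    return x[0] + x[1] - 2.0


def toy_subgradient(x):
    return [1.0, 1.0]


avg, count = switching_mirror_descent(
    toy_operator, toy_constraint, toy_subgradient,
    x0=[0.0, 0.0], eps=0.05, step=0.05, n_iters=2000,
)
print(avg, count)
```

Because only productive iterates enter the average and the toy constraint is linear, the returned point violates the constraint by at most `eps`.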
arXiv cs.AI·

The Capability Paradox: How Smarter Auditors Make Multi-Agent Systems Less Secure

Study on multi-agent systems: 'semantic hijacking' attacks exploit agent confidence. Paradox identified: increasing Worker capability raises attack success rate from 18.4% to 63.9%. Mediation analysis reveals 'linguistic certainty' of stronger agents drives vulnerability. Proposed solution: heterogeneous ensemble verification reduces attack success rate to 2%.

Multi-agent · AI Agents · AI safety
SIG 82 · HYP 15
arXiv cs.CL·

Self-Distilled Trajectory-Aware Boltzmann Modeling: Bridging the Training-Inference Discrepancy in Diffusion Language Models

TABOM, a post-training method for Diffusion Language Models, aligns optimization with the multi-step easy-to-hard decoding trajectory observed at inference. Via Boltzmann modeling of unmasking preferences, it derives a tractable pairwise ranking objective that reduces training-inference discrepancy and improves performance on new domains.

Fine-tuning · Reasoning · Papers
SIG 75 · HYP 15
arXiv cs.CL·

NaviRAG: Towards Active Knowledge Navigation for Retrieval-Augmented Generation

NaviRAG introduces a RAG framework shifting from passive segment retrieval to active knowledge navigation. The system structures documents into semantic hierarchies and uses an LLM agent to iteratively navigate, identify information gaps, and retrieve content at appropriate granularity levels. Results show improved retrieval recall and QA performance on long-document benchmarks over conventional RAG.

RAG · AI Agents · Reasoning
SIG 72 · HYP 28
arXiv cs.CL·

CounterRefine: Answer-Conditioned Counterevidence Retrieval for Inference-Time Knowledge Repair in Factual Question Answering

CounterRefine adds a lightweight repair layer for RAG: after an initial answer, the system issues answer-conditioned queries to retrieve candidate-specific counterevidence, then applies a deterministically-validated KEEP/REVISE refinement step. On SimpleQA, improves baseline by up to 5.8 correct-rate points; modifies 5.6% of outputs with 180 beneficial changes versus 8 harmful ones.

RAG · Reasoning · Evals
SIG 78 · HYP 15
arXiv cs.CL·

Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models

LLMs exhibit highly anisotropic internal representations with massive activations. Rather than treating them as artifacts, the authors identify them as interpretable functional units using a magnitude-based criterion. Steering applied to these critical dimensions outperforms conventional whole-dimension steering in domain adaptation and jailbreaking scenarios.

AI safety
SIG 72 · HYP 18
arXiv cs.CL·

Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents

STING is an automated red-teaming framework measuring multi-turn illicit assistance in LLM agents. It constructs step-by-step illicit plans grounded in benign personas and uses judge agents to track completion. Multilingual evaluation across six non-English languages shows attack success does not consistently increase in lower-resource languages, diverging from chatbot findings.

AI Agents · AI safety · Evals
SIG 78 · HYP 25
arXiv cs.CL·

Large Language Models and Impossible Language Acquisition: "False Promise" or an Overturn of our Current Perspective towards AI

Experimental study testing Chomsky's critique of LLMs: GPT-2 small and an LSTM are trained on syntactically impossible languages (reversed sentences, parity-based negations). GPT-2 shows lower perplexity on natural language (loss ratios up to 2.25× on reversed conditions), while the LSTM shows minimal differences. The authors propose a functionalist paradigm in place of Chomsky's rationalist perspective.

Papers · Reasoning · Benchmarks
SIG 65 · HYP 25
arXiv cs.CL·

"The Whole Is Greater Than the Sum of Its Parts": A Compatibility-Aware Multi-Teacher CoT Distillation Framework

COMPACT, a multi-teacher CoT distillation framework, adaptively fuses supervisions from multiple LLMs into compact student models. It dynamically weights teacher gradients using three metrics: graph-based consensus, mutual-information-based adaptability, and loss-based difficulty. Achieves SOTA results across benchmarks while mitigating catastrophic forgetting.

Reasoning · Fine-tuning · Papers
SIG 72 · HYP 25
arXiv cs.CL·

Rethinking Table Pruning in TableQA: From Sequential Revisions to Gold Trajectory-Supervised Parallel Search

TabTrim, a table pruning framework for TableQA, replaces sequential revisions with gold trajectory-supervised parallel search. The system uses intermediate sub-tables from gold SQL queries to train a pruner and a verifier. TabTrim-8B achieves 73.5% average accuracy, outperforming the strongest baselines by 3.2% (79.4% on WikiTQ, 61.2% on TableBench).

Benchmarks · Reasoning · Papers
SIG 78 · HYP 25
arXiv cs.CL·

QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

QuCo-RAG proposes dynamic RAG grounded in pre-training corpus statistics rather than model-internal signals. It identifies low-frequency entities and verifies their co-occurrence in 4 trillion tokens using Infini-gram. On multi-hop QA benchmarks, it gains 5–12 EM points over baselines with OLMo-2, and up to 14 points on Llama-3, Qwen2.5, GPT-4.

RAG · Reasoning · Benchmarks
SIG 78 · HYP 18
arXiv cs.CL·

Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models

Guided Topology Diffusion (GTD) uses graph diffusion models to dynamically generate optimal communication topologies for multi-agent LLM systems. The iterative framework, guided by a proxy model predicting multi-objective rewards (accuracy, utility, cost), adapts topologies to tasks without gradient-based optimization, outperforming static approaches.

Multi-agent · AI Agents · Benchmarks
SIG 75 · HYP 25
arXiv cs.CL·

AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning

AdaSwitch proposes a cloud-local collaborative paradigm where a local agent (small LLM) handles simple tasks and requests assistance from a cloud agent (large LLM) for complex reasoning. The adaptive mechanism detects local errors and dynamically switches. Evaluation on 7 benchmarks (mathematical reasoning, complex QA) shows performance improvement with reduced computational overhead.

AI Agents · Multi-agent · Reasoning
SIG 72 · HYP 25
arXiv cs.AI·

The Loupe: A Plug-and-Play Attention Module for Amplifying Discriminative Features in Vision Transformers

The Loupe is a lightweight spatial gating module for hierarchical Vision Transformers designed for fine-grained visual classification. Inserted at an intermediate feature stage, it predicts a single-channel spatial mask via a small CNN and reweights activations. On CUB-200-2011, it improves Swin-Base from 88.36% to 91.72% and Swin-Tiny from 85.14% to 88.61% with <0.1% additional parameters.

Vision · Benchmarks
SIG 72 · HYP 18
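
The spatial gating described in the Loupe entry above (a single-channel mask that reweights activations) can be sketched as follows. This is a minimal illustration under assumptions: the mask logits are passed in directly rather than predicted by a small CNN, and the reweighting is assumed to be an element-wise multiplication by a sigmoid mask shared across channels. The function name `apply_spatial_gate` is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def apply_spatial_gate(features, mask_logits):
    """Reweight a [C][H][W] feature map with one [H][W] spatial mask.

    The same mask value is applied to every channel at a given location
    (assumed form of the reweighting).
    """
    mask = [[sigmoid(v) for v in row] for row in mask_logits]
    gated = [
        [
            [value * mask[i][j] for j, value in enumerate(row)]
            for i, row in enumerate(channel)
        ]
        for channel in features
    ]
    return gated, mask


# Two channels over a 2x2 spatial grid; zero logits give a mask of 0.5 everywhere.
features = [
    [[2.0, 4.0], [6.0, 8.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
mask_logits = [[0.0, 0.0], [0.0, 0.0]]
gated, mask = apply_spatial_gate(features, mask_logits)
print(gated)
```

With zero logits every activation is halved; a trained mask would instead push values toward 1 at discriminative locations and toward 0 elsewhere.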
arXiv cs.AI·

Toward Robust Multilingual Adaptation of LLMs for Low-Resource Languages

LiRA, a lightweight fine-tuning framework, improves multilingual LLM adaptation for low-resource languages. It combines Arca (anchor-based alignment to English) and LaSR (language-aware semantic head) to stabilize representations and improve cross-lingual consistency. It reports positive results on retrieval, ranking, QA, and reasoning. A multilingual dataset (7 Asian languages) and the code are released as open source.

Fine-tuning · RAG · Embeddings
SIG 75 · HYP 20
arXiv cs.AI·

PyHealth 2.0: A Comprehensive Open-Source Toolkit for Accessible and Reproducible Clinical Deep Learning

PyHealth 2.0 is an open-source clinical deep learning toolkit reducing barriers to medical AI research. It unifies 15+ datasets, 20+ clinical tasks, 25+ models, and 5+ interpretability methods in a single framework supporting signals, imaging, and electronic health records. Delivers 39x speedup and 20x memory reduction, with 400+ community members.

Open source · Code generation · Evals
SIG 78 · HYP 25
arXiv cs.CL·

Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks

OverEager-Gen is a benchmark measuring out-of-scope actions by autonomous coding agents on benign tasks. On Claude Code, removing the consent declaration raises the overeager rate from 0% to 17.1% (p=2.4×10⁻⁴). The benchmark comprises 500 validated scenarios and tests 4 products (Claude Code, OpenHands, Codex CLI, Gemini CLI): overeager rates are 5.4–27.7% in permissive mode versus 0.2–4.5% under the ask-to-continue framework.

AI Agents · Code generation · AI safety
SIG 78 · HYP 15
arXiv cs.LG·

ReTAMamba: Reliability-Aware Temporal Aggregation with Mamba for Irregular Clinical Time Series Prediction

ReTAMamba proposes a Mamba-based architecture for predicting irregular clinical time series. The model estimates observation reliability from missingness and elapsed time, integrates short- and long-term information via Chronological Weaving, and uses a budgeted token router. On MIMIC-IV, eICU, and PhysioNet 2012, it reports AUPRC gains of 7.51%, 7.80%, and 10.15% respectively.

Benchmarks · Reasoning · Papers
SIG 78 · HYP 15
arXiv cs.CL·

CodeBind: Decoupled Representation Learning for Multimodal Alignment with Unified Compositional Codebook

CodeBind introduces a multimodal alignment framework using shared-specific compositional codebooks. The method decomposes representations into semantic shared components and modality-unique components, validated across 9 modalities (text, image, video, audio, depth, thermal, tactile, 3D point cloud, EEG) achieving state-of-the-art performance in classification and retrieval tasks.

Embeddings · Vision · Robotics
SIG 72 · HYP 25
arXiv cs.CL·

Scalable Environments Drive Generalizable Agents

Position paper arguing that generalizable agents require environment scaling—expanding the distribution of executable rule-sets agents interact with—beyond trajectory or task scaling within fixed benchmarks. Proposes unified taxonomy separating trajectory, task, and environment scaling; synthesizes construction paradigms (programmatic generators vs generative world models) for scalable environments.

AI Agents · Reasoning · Benchmarks
SIG 45 · HYP 25
arXiv cs.AI·

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

An arXiv study challenges the assumption that Mixture-of-Experts models achieve domain specialization through sparse routing. The COMMITTEEAUDIT framework reveals a domain-invariant "Standing Committee": a compact coalition of experts that captures most routing mass across domains, layers, and budgets. Peripheral experts handle only domain-specific knowledge.

Benchmarks · Papers
SIG 75 · HYP 15
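
The idea of a compact coalition of experts capturing most routing mass, from the Mixture-of-Experts entry above, can be illustrated with a small sketch. This is not the COMMITTEEAUDIT procedure, whose details the summary does not give. It assumes routing mass is available as a per-expert total for each domain, picks the smallest top-ranked set covering a chosen share in each domain, and intersects those sets across domains. The function names, the 0.8 coverage threshold, and the toy numbers are all hypothetical.

```python
def top_mass_experts(routing_mass, coverage=0.8):
    """Smallest top-ranked set of experts whose mass reaches the coverage share."""
    total = sum(routing_mass.values())
    ranked = sorted(routing_mass.items(), key=lambda kv: kv[1], reverse=True)
    selected, covered = [], 0.0
    for expert, mass in ranked:
        if covered >= coverage * total:
            break
        selected.append(expert)
        covered += mass
    return selected, covered / total


def domain_invariant_committee(mass_by_domain, coverage=0.8):
    """Experts that appear in the top-mass set of every domain."""
    per_domain = [
        set(top_mass_experts(mass, coverage)[0])
        for mass in mass_by_domain.values()
    ]
    return set.intersection(*per_domain)


# Toy routing totals (made-up numbers for illustration only).
mass_by_domain = {
    "code": {"e0": 40, "e1": 30, "e2": 15, "e3": 10, "e4": 5},
    "math": {"e0": 35, "e1": 35, "e2": 5, "e3": 20, "e4": 5},
}
selected, share = top_mass_experts(mass_by_domain["code"])
committee = domain_invariant_committee(mass_by_domain)
print(selected, share, sorted(committee))
```

In this toy data, `e0` and `e1` sit in the top-mass set of both domains, while `e2` and `e3` each matter in only one, which mirrors the split between a shared committee and peripheral experts.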
arXiv cs.AI·

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

VideoDR is the first benchmark for open-domain video question answering, combining cross-frame visual extraction, iterative web retrieval, and multi-hop reasoning. Evaluation of closed- and open-source multimodal models shows that the Agentic paradigm is not consistently superior to the Workflow paradigm; the key challenges are goal drift and long-horizon consistency.

AI Agents · Vision · Reasoning
SIG 72 · HYP 28