Archives

May 2026

3148 articles

arXiv cs.AI·

"The Whole Is Greater Than the Sum of Its Parts": A Compatibility-Aware Multi-Teacher CoT Distillation Framework

COMPACT, a multi-teacher CoT distillation framework, adaptively fuses supervisions from multiple LLMs into compact student models. It dynamically weights teacher gradients using three metrics: graph-based consensus, mutual-information-based adaptability, and loss-based difficulty. Achieves SOTA results without catastrophic forgetting.

ReasoningFine-tuningPapers
SIG
72
HYP
25
arXiv cs.CL·

Reducing Credit Assignment Variance via Counterfactual Reasoning Paths

Researchers introduce IBPO (Implicit Behavior Policy Optimization), a credit assignment method for reinforcement learning with LLMs. By comparing multiple reasoning trajectories, the framework transforms sparse terminal rewards into step-sensitive learning signals, reducing gradient variance and improving stability on mathematical and code reasoning benchmarks.

Reinforcement learningReasoningCode generation
SIG
75
HYP
25
arXiv cs.CL·

Protection Is (Nearly) All You Need: Structural Protection Dominates Scoring in Globally Capped KV Eviction

Study of KV cache eviction policies (LRU, H2O, SnapKV, StreamingLLM, Ada-KV, QUEST, Random) under global cap. Without structural boundary protection, all collapse to F1≤0.064. Reserving 10% cache at each boundary recovers 69–90% quality on LongBench at C=256 (13% retention). Position-0 holds ~75% attention mass; protecting structurally critical tokens dominates over scoring differences.

ReasoningBenchmarksPapers
SIG
78
HYP
15
arXiv cs.AI·

SuReNav: Superpixel Graph-based Constraint Relaxation for Navigation in Over-constrained Environments

SuReNav proposes a superpixel graph-based navigation method for over-constrained environments. The system combines constraint map generation, relaxation via GNN trained on human demonstrations, and interleaved execution. Evaluated on 2D/3D OpenStreetMap maps and Spot quadruped robot, it achieves highest human-likeness score while balancing safety and efficiency.

AI AgentsRoboticsPapers
SIG
72
HYP
25
arXiv cs.AI·

Perception-based Image Denoising via Generative Compression

Paper proposes generative compression framework for perception-based image denoising. Two approaches: conditional WGAN-based denoiser explicitly controlling rate-distortion-perception trade-off, and conditional diffusion-based iterative reconstruction guided by compressed latents. Theoretical guarantees and perceptual improvements demonstrated on synthetic and real-noise benchmarks.

Image generationPapersBenchmarks
SIG
72
HYP
18
arXiv cs.AI·

No Plan, Yet Human: A Reactive Robotics Model Predicts Human Planning Failures on a Clinical Task

AICON, a reactive robotics model using gradient descent, better predicts human planning failures on the Tower of London cognitive test than planning baselines. Without lookahead, it reproduces the difficulty ordering of 24 problems and fails similarly to Parkinson's patients, suggesting reduced planning capacity shifts behavior toward reactive modes.

RoboticsReasoningPapers
SIG
72
HYP
15
arXiv cs.AI·

Strategic Over-Parameterization for Generalizable Low-Rank Adaptation

LoRA-Over improves parameter-efficient fine-tuning (PEFT) by enriching the optimization landscape during training via auxiliary over-parameterization, then collapsing this enrichment into standard LoRA structure at inference. Evaluated on GLUE, MT-Bench, GSM8K, and HumanEval with LLaMA 2-7B and 3.1-8B, the framework consistently outperforms vanilla LoRA with no additional inference cost.

Fine-tuningLlamaBenchmarks
SIG
78
HYP
18
arXiv cs.AI·

Conservative AI for Safety-Sensitive Medical Image Restoration: Residual-Bounded CT-CTA Enhancement for Intracranial Aneurysm-Relevant Signal Recovery

2.5D residual-bounded image restoration framework for enhancing intracranial CT/CTA without uncontrolled modification of clinically sensitive regions. Model adds learned residual via edit-control map limiting magnitude and spatial extent. On 50 out-of-distribution cases: PSNR 37.51 dB, iatrogenic-edit rate 4.0%, net positive in 85.4% of 1,000 Monte Carlo runs.

VisionAI safetyEvals
SIG
72
HYP
15
arXiv cs.AI·

Edge-AI-Driven Learning-to-Rank for Decentralized Task Allocation in Circular Smart Manufacturing

Decentralized task allocation framework for circular manufacturing using Edge-AI and ranking-aware learning. Each machine evaluates tasks using local information (processing capability, queue state, resource contention). Results: reduced delays, improved deadline adherence, enhanced energy efficiency in discrete-event simulation.

AI AgentsReinforcement learningInfrastructure
SIG
65
HYP
15
arXiv cs.AI·

Deep Reinforcement Learning Framework for Diversified Portfolio Management Across Global Equity Markets

Deep reinforcement learning framework for dynamic portfolio allocation across global equity markets. Soft Actor-Critic optimizes continuous weights with transaction costs and diversification constraints. Evaluation on Nasdaq-100, Nikkei 225, Euro Stoxx 50 (2003-2026): significant abnormal returns on Euro Stoxx 50, but no statistically significant outperformance vs Buy and Hold across all markets.

Reinforcement learningBenchmarksPapers
SIG
72
HYP
25
arXiv cs.AI·

Agentic Pipeline for Self-Synchronized Multiview Joint Angle Monitoring in Uncalibrated Environments

Agentic pipeline for multi-view joint angle monitoring without calibration in uncalibrated environments. Uses two cameras, automatic synchronization via multimodal LLM, 2D pose detection and agent-based selection to identify target subject. Validation against Vicon system: MAE 5.97° ± 2.36°, Pearson correlation 0.962 ± 0.014. Application: spinal cord injury rehabilitation.

AI AgentsVisionReasoning
SIG
72
HYP
18
arXiv cs.AI·

Overcoming the Intrinsic Performance Limitations of MEMS IMU via Diffusion-Based Generative Learning

A conditional diffusion model based on U-Net architecture synthesizes high-fidelity virtual IMU data from low-cost IMU measurements. Trained with high-grade IMU measurements as ground-truth priors, the model significantly improves positioning and attitude estimation accuracy, and produces thinner, more consistent point clouds in airborne mapping experiments.

VisionRobotics
SIG
72
HYP
28
arXiv cs.AI·

StreamPro: From Reactive Perception to Proactive Decision-Making in Streaming Video

StreamPro introduces StreamPro-Bench, a benchmark evaluating proactive video streaming understanding across three dimensions: perception, temporal reasoning, and proactive agency. The framework proposes CB-Stream Loss to address supervision imbalance and applies GRPO with multi-grained rewards. Results: 41.5 on StreamPro-Bench vs 10.4 previously, 78.9 on StreamingBench-RTVU.

VisionReasoningReinforcement learning
SIG
75
HYP
25
arXiv cs.AI·

LARGER: Lexically Anchored Repository Graph Exploration and Retrieval

LARGER is a context retrieval framework for repository-level coding agents combining lexical search with structural graph exploration (imports, call chains, type hierarchies) without external databases. On LocBench, it improves file-level Acc@5 by +13.9 points (or +11.8 with fixed hyperparameters) and shows consistent gains on test generation and codebase QA benchmarks.

AI AgentsCode generationBenchmarks
SIG
78
HYP
15
arXiv cs.AI·

Systematic Evaluation of Vision Transformers for Automated Cervical Cancer Classification: Optimization, Statistical Validation, and Clinical Interpretability

Systematic optimization of Vision Transformers (ViT-Tiny) for cervical cancer screening on Herlev dataset (917 images). Optimal configuration: 94.9%-95.2% cross-validation accuracy with horizontal flipping and class weighting (0.7 x 1.3). Grad-CAM validates clinical interpretability: attention on nuclei, cell boundaries, and chromatin texture.

VisionBenchmarksEvals
SIG
72
HYP
25
arXiv cs.AI·

Detecting Verbatim LLM Copy-Paste in Homework

SteganoPrompt, an open-source web tool, detects verbatim copies of assignment prompts submitted to LLMs. It encodes an invisible instruction in the prompt via the Unicode Tags block (U+E0000–U+E007F), creating a detectable signature in the model's response. Tested across 7 LLM families, the approach bypasses limitations of post-hoc detectors and requires no cooperation from model providers.

EvalsAI safetyPrompt engineering
SIG
75
HYP
15
arXiv cs.AI·

Phase Transitions in Driven Informational Systems: A Two-Field Perspective on Learning Theory and Non-Equilibrium Chemistry

Theoretical paper proposing a unified framework for phase transitions in deep learning (grokking, emergent capabilities) and non-equilibrium chemistry. Introduces two gradient fields (entropy production rate and information quasi-potential) and two order parameters (adversarial breakdown threshold α†, self-referential coupling threshold κc) to describe driven informational systems.

ReasoningAlignmentPapers
SIG
45
HYP
25
arXiv cs.AI·

AdaGraph: A Graph-Native Clustering Algorithm That Overcomes the Curse of Dimensionality and Enables Scientific Discovery

AdaGraph is a graph-native clustering algorithm that overcomes the curse of dimensionality by operating on kNN topology rather than Euclidean metrics. Without specifying k a priori, it identifies gene modules in genomics (GSE14520, 10k genes), achieves ARI=0.751 on text clustering (20NG-6cat vs HDBSCAN 0.464), and outperforms Silhouette/Davies-Bouldin on 10 benchmarks up to d=5000.

BenchmarksPapers
SIG
72
HYP
28
arXiv cs.AI·

When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

Study of adversarial attacks via action removal in self-play reinforcement learning. An attacker selectively removes legal actions from the victim's available set. Across poker games (6 to 5,531 states) and two non-poker domains, learned masking causes more damage than random masking. The attack persists across Q-learning, PPO, NFSP, DQN and shows no recovery under extended masked training.

Reinforcement learningAI safetyBenchmarks
SIG
72
HYP
18
arXiv cs.AI·

MusicSynth: An Automated Pipeline for Generating Violin Fingerboard Animations from Sheet Music Using Optical Music Recognition

MusicSynth is an open-source web tool that automatically converts violin sheet music (photo or file) into animated videos showing finger positioning on the fingerboard. The system combines optical music recognition (OMR), MusicXML parsing, and video rendering. Tested on 110 scores: 91.2% note recognition accuracy on printed music, 99.1% finger position accuracy on digital files.

VisionCode generationOpen source
SIG
72
HYP
25
arXiv cs.AI·

Task-Level AI Readiness Assessment for Business Process Management:The T-IPO Model and LARA Matrix in Financial-Services IT Operations

arXiv paper introducing T-IPO and LARA, tools to assess LLM agent readiness for business tasks. LARA is a 5-dimension rubric scoring tasks into 4 levels (L1-L4), with 1.5× weight on compliance sensitivity. Validated on 127 tasks (κ=0.80), replicated across 3 institutions (κ=0.73). Auto-completion decays from 95% (L1) to 40% (L3).

AI AgentsEvalsPapers
SIG
72
HYP
15
arXiv cs.AI·

AI4BayesCode: From Natural Language Descriptions to Validated Modular Stateful Bayesian Samplers

AI4BayesCode translates natural-language Bayesian model descriptions into validated, modular MCMC samplers. The system decomposes models into sampling blocks mapped to built-in components, with pre- and post-generation validation. A novel recursively stateful architecture enables coherent composition of independently developed sampling components.

Code generationAI AgentsReasoning
SIG
72
HYP
28
arXiv cs.AI·

Evolutionary Extreme Learning Machine of ab-initio Energy Landscapes for Crystal Structure Prediction using Manta Ray Optimization with Levy Flight

Manta Ray Foraging Optimization algorithm enhanced with Lévy Flight to train Extreme Learning Machines (ELMs) for predicting crystal formation energies. EELM-MRFO-LF uses MRFO-Lévy for input weight selection and Moore-Penrose generalized inverse for analytical output weight determination, improving population diversity and avoiding local optima.

BenchmarksPapers
SIG
35
HYP
15
arXiv cs.AI·

From Reactive to Proactive: A Multi-Regulatory Empirical Analysis of 480 AI Incidents and a Data-Driven Governance Compliance Framework

Analysis of 480 real-world AI incidents from AIID against EU AI Act, NIST AI Risk Management Framework, and GDPR post-deployment provisions. Reveals substantial governance gaps in post-deployment accountability. Proposes Proactive AI Governance Compliance Framework (PAGCF), a four-phase lifecycle methodology shifting from reactive incident response to pre-deployment compliance assurance.

RegulationAI safetyAlignment
SIG
72
HYP
18
arXiv cs.AI·

Harnessing AI for Inverse Partial Differential Equation Problems: Past, Present, and Prospects

Comprehensive review of AI methods for solving inverse partial differential equation (PDE) problems. Covers three categories: inverse problems, inverse design, and control. Applications: medical imaging, geophysics, aerodynamics, thermal systems. Challenges: physics-informed architectures, limited real-world data, uncertainty quantification, inverse foundation models.

PapersReasoningBenchmarks
SIG
65
HYP
25