Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning

New IBAL method to strengthen MARL robustness against disruptions of inter-agent interaction. The framework uses an information-theoretic approach to construct attacks that degrade coordination by perturbing observations and actions, then trains agents to remain reliable under those attacks. Shows improvement over existing baselines, including in agent-missing scenarios.

Multi-agent Reinforcement learning

arXiv cs.AI·May 19

VolTA-3D: Self-Supervised Learning for Brain MRI using 3D Volumetric Token Alignment

VolTA-3D is a self-supervised 3D Vision Transformer framework for brain MRI. It aligns global and local tokens in a student-teacher paradigm and enforces fine-grained structural reconstruction. Evaluated on hippocampal segmentation and classification tasks (sex, Alzheimer's), it outperforms random baselines and demonstrates improved transferability across domain shifts.

Vision Papers

arXiv cs.AI·May 19

Distinguishable Deletion: Unifying Knowledge Erasure and Refusal for Large Language Model Unlearning

Distinguishable Deletion (D²) unifies knowledge deletion and refusal for LLM unlearning. The method uses an energy index to erase undesirable knowledge in latent representations rather than specific tokens, avoiding biased deletion and re-emergence of harmful content. Energy-based Unlearning Alignment (EUA) applies this mechanism at training and inference.

AI safety Alignment Papers

arXiv cs.AI·May 19

Cross-Domain Molecular Relational Learning: Leveraging Chemical Structure-Activity Analysis

DisTrans, a domain adversarial training network, optimizes cross-domain molecular relational learning by integrating topological structures and visual modalities. Using gradient reversal and semantic alignment of functional groups, the method outperforms 16 baselines across two cross-domain strategies.

Papers Benchmarks Vision

arXiv cs.AI·May 19

Observation-Aligned Mask Priors for Learning Physical Dynamics from Authentic Occlusions

A framework learns authentic occlusion mask distributions using Bayesian Flow Networks to train diffusion-based reconstruction models on incomplete observations. Tested on oceanographic satellite data (256×256), it improves MSE and PSNR over diffusion baselines by preventing zero-query dead zones.

Papers Benchmarks Vision

arXiv cs.AI·May 19
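
For reference, PSNR is a deterministic function of MSE, so the two metrics reported above move together. A minimal sketch, assuming values scaled to the range [0, max_val]; the function name and the default range are illustrative and not taken from the paper.

```python
import math

def psnr_from_mse(mse, max_val=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error.

    max_val is the largest possible signal value (assumed 1.0 here).
    """
    if mse <= 0:
        return float("inf")  # identical signals: PSNR is unbounded
    return 10.0 * math.log10((max_val ** 2) / mse)

# Lower MSE always means higher PSNR for a fixed max_val.
coarse = psnr_from_mse(0.01)
fine = psnr_from_mse(0.001)
```

Because the mapping is monotone, a method that lowers MSE at a fixed value range necessarily raises PSNR.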

Echoes in Filter Bubble: Diagnosing and Curing Popularity Bias in Generative Recommenders

Study of popularity bias in Generative Recommenders (GRs). The authors find that the bias stems from a token-level optimization flaw and undifferentiated item tokenization. They propose Ghost, a GR with asymmetric unlikelihood optimization and skeleton-founded tokenization, validated across 3 datasets.

Papers Benchmarks Alignment

arXiv cs.AI·May 19

Learning Relative Representations for Fine-Grained Multimodal Alignment with Limited Data

Post-hoc multimodal alignment method using relative representations at token level to match separately pre-trained encoders with limited paired data. Learns anchors in each modality space to induce consistent cross-modal similarity patterns. Outperforms existing methods on zero-shot classification, cross-modal retrieval, and zero-shot segmentation.

Embeddings Vision RAG

arXiv cs.AI·May 19
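
As a rough illustration of the relative-representation idea mentioned above, the sketch below re-expresses an embedding as its cosine similarities to a set of anchors. The anchors are fixed toy vectors and all function names are hypothetical; the paper's learned, token-level anchors are not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def relative_representation(x, anchors):
    """Describe x by its similarity to each anchor instead of raw coordinates."""
    return [cosine(x, a) for a in anchors]

def rotate90(v):
    """Rotate a 2D vector by 90 degrees (stands in for a different encoder space)."""
    return [-v[1], v[0]]

anchors = [[1.0, 0.0], [0.0, 1.0]]
x = [3.0, 4.0]

rel_original = relative_representation(x, anchors)
# Rotating both the point and the anchors leaves the similarities unchanged.
rel_rotated = relative_representation(rotate90(x), [rotate90(a) for a in anchors])
```

The rotation check shows why this representation is attractive for matching separately trained encoders: it depends only on angles to the anchors, not on the absolute orientation of the embedding space.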

Pedestrian-Aware LLM-Driven Behavioral Planning for Autonomous Vehicles

LLM-based behavioral planning framework for autonomous vehicles to anticipate pedestrian behavior. Evaluated on SUMO: 68% collision-free success rate zero-shot (vs 17.7% deep RL), 96% with few-shot episodic memory. Interpretable decisions with cross-behavior transfer across scenarios.

Reasoning Reinforcement learning AI safety

arXiv cs.AI·May 19

GRID: Graph Representation of Intelligence Data for Security Text Knowledge Graph Construction

GRID is an end-to-end framework for constructing security knowledge graphs from cyber threat intelligence articles. Using Qwen3-4B-Instruct, it combines graph extraction, text revision, and a task bank (multi-choice questions + regex) to generate stable rewards. On 249 CTI articles, the Task-bank Reward model achieves 84.62% precision, 64.91% recall, and 68.53% Avg F1.

Reinforcement learning Benchmarks

arXiv cs.AI·May 19

The Lattice Representation Hypothesis of Large Language Models

A hypothesis proposes that LLMs encode concept lattices in their embedding geometry. The framework unifies the Linear Representation Hypothesis with Formal Concept Analysis (FCA), showing that linear attribute directions induce lattices via half-space intersections. Experiments on WordNet validate that embeddings capture logical and hierarchical structures.

Reasoning Papers Embeddings

arXiv cs.AI·May 19

Voices in the Loop: Mapping Participatory AI

Study of an open-source interactive atlas mapping 200+ participatory AI initiatives. Reproducible protocol for discovery, vetting, and harmonization of cases. Findings: initiatives are concentrated in a few countries; participation occurs mostly in problem formulation and evaluation, rarely in model development.

AI safety Regulation

arXiv cs.AI·May 19

Sketch Then Paint: Hierarchical Reinforcement Learning for Diffusion Multi-Modal Large Language Models

HT-GRPO, a hierarchical reinforcement learning method for diffusion multi-modal models, organizes optimization into three stages (global, structure, refinement). It solves multiple unmasking sequences and assigns differentiated rewards based on token importance. Tests on MMaDA and Lumina-DiMOO show gains on GenEval and DPG benchmarks.

Reinforcement learning Image generation Benchmarks

arXiv cs.AI·May 19

NGM: A Plug-and-Play Training-Free Memory Module for LLMs

NGM is a training-free memory module for LLMs using a Causal N-Gram Encoder and Cosine-Gated Memory Injector. Tested on Qwen3 (0.6B-14B), it improves average performance by 0.5-1.2 points, with notable gains on code generation (+3.0 LiveCodeBench) and knowledge-intensive tasks (+3.03 GPQA).

Qwen Code generation Reasoning

arXiv cs.AI·May 19

How do Humans Process AI-generated Hallucination Contents: a Neuroimaging Study

EEG study of 27 participants analyzing neural mechanisms for detecting AI hallucinations. Researchers recorded brain activity during verification of image descriptions generated by an MLLM. Results show that misjudged hallucinations fail to trigger standard fact-verification neural pathways.

Vision AI safety Alignment

arXiv cs.AI·May 19

Prefix-Adaptive Block Diffusion for Efficient Document Recognition

PA-BDM improves Block Diffusion Models for document recognition by replacing bidirectional denoising with causal prefix-to-suffix denoising. Using Confidence-gated Structural Loss and Progressive Prefix Commitment, the 3B model achieves 71.6% higher inference throughput than MinerU-Diffusion 2.5B.

Papers Code generation Benchmarks

arXiv cs.AI·May 19

DriveSafe: A Framework for Risk Detection and Safety Suggestions in Driving Scenarios

DriveSafe is a framework for risk detection in autonomous driving scenarios. It generates spatially grounded captions enriched with motion and depth cues, then assesses risks using a fine-tuned adapter module on caption-risk pairs. Achieves SOTA on DRAMA benchmark.

Vision Reasoning AI safety

arXiv cs.AI·May 19

Towards Human-Level Book-Writing Capability

Researchers present a framework for book-scale creative writing. Starting from public-domain novels, they build a multi-resolution scaffold (summary → chapters → scenes → full text) and train a long-context model on prompt-to-book trajectories. Goal: generate human literary prose rather than generic assistant-style text.

Fine-tuning Reasoning Code generation

arXiv cs.AI·May 19

Effort as Ceiling, Not Dial: Reasoning Budget Does Not Modulate Cognitive Cost Alignment Between Humans and Large Reasoning Models

Large Reasoning Models generate traces aligned with human reaction times, but this alignment persists regardless of inference-time reasoning budget. Study across GPT-OSS-20B and GPT-OSS-120B: three effort levels, six cognitive tasks. Token allocation tracks fine-grained human difficulty patterns and reflects a structure crystallized at training time, not modulated in real-time.

Reasoning Benchmarks Papers

arXiv cs.AI·May 19

RAGA: Reading-And-Graph-building-Agent for Autonomous Knowledge Graph Construction and Retrieval-Augmented Generation

RAGA is an LLM-based autonomous agent for knowledge graph construction and retrieval-augmented generation. It combines CRUD operations, a ReAct loop with Read-Search-Verify-Construct constraint, and KG-vector synchronization for hybrid retrieval. QASPER experiments show gains in answer and evidence quality.

AI Agents RAG Reasoning

arXiv cs.AI·May 19

Scientific Logicality Enriched Methodology for LLM Reasoning: A Practice in Physics

Systematic investigation of logicality in LLM scientific reasoning. Authors develop a logicality-enriched methodology with assessment criteria and data sampling methods for logicality-guided training. Experiments on three backbone LLMs using physics problems extracted from academic literature. Code released.

Reasoning Fine-tuning Papers

arXiv cs.AI·May 19

Capturing LLM Capabilities via Evidence-Calibrated Query Clustering

ECC, a query clustering algorithm, calibrates semantic embeddings through model comparisons to align surface semantics with latent LLM capabilities. Using a Bradley-Terry model, it improves capability ranking by 17.64 points over human-labeled baselines and 18.02 points over embedding-based baselines, with applications to query routing.

Evals Benchmarks Reasoning

arXiv cs.AI·May 19
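
The Bradley-Terry model referenced above turns pairwise win counts into per-model strength scores. The summary does not say how ECC fits or uses it, so the following is only a generic sketch of the standard iterative (minorization-maximization) fit; the function name, the toy win counts, and the iteration count are all assumptions.

```python
def fit_bradley_terry(wins, n_items, iters=200):
    """Estimate Bradley-Terry strengths from pairwise results.

    wins[(i, j)] is the number of times item i beat item j.
    Returns strengths normalized to sum to 1.
    """
    p = [1.0] * n_items
    for _ in range(iters):
        new_p = []
        for i in range(n_items):
            total_wins = sum(w for (a, _), w in wins.items() if a == i)
            denom = 0.0
            for j in range(n_items):
                if j == i:
                    continue
                games = wins.get((i, j), 0) + wins.get((j, i), 0)
                if games:
                    denom += games / (p[i] + p[j])
            # Keep the old value if item i never played anyone.
            new_p.append(total_wins / denom if denom else p[i])
        total = sum(new_p)
        p = [v / total for v in new_p]
    return p

# Toy data: three hypothetical models, ten comparisons per pair.
toy_wins = {(0, 1): 8, (1, 0): 2, (1, 2): 7, (2, 1): 3, (0, 2): 9, (2, 0): 1}
strengths = fit_bradley_terry(toy_wins, 3)
```

Under the model, the probability that item i beats item j is p[i] / (p[i] + p[j]), so the fitted strengths give a ranking directly.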

Latent Heuristic Search: Continuous Optimization for Automated Algorithm Design

Automated heuristic discovery via continuous optimization in latent space. Encoder maps discrete programs to continuous embeddings, differentiable surrogate model predicts performance, invertible normalizing flow regularizes optimization trajectory. Evaluation on TSP, CVRP, KSP, and Online Bin Packing shows competitive results against evolutionary baselines.

AI Agents Reasoning Benchmarks

arXiv cs.AI·May 19

Dynamics of collective creativity in AI art competitions

Analysis of 130,882 images from 368 remix parties on Artbreeder (13 months). Images converged toward common thematic attractors (steampunk, alien architecture) while becoming simpler. Paradox: more novel parents produced more complex, liked children, yet users preferred remixing less novel images.

Image generation Papers Evals

arXiv cs.AI·May 19

From Imitation to Interaction: Mastering Game of Schnapsen with Shallow Reinforcement Learning

Shallow neural network agents master the card game Schnapsen through reinforcement learning. RLBot, trained via asynchronous Monte Carlo updates, outperforms MLPBot (supervised imitation) and achieves statistically significant wins against RdeepBot, a search-based baseline. Combining learned value functions with deeper lookahead during gameplay improves performance.

Reinforcement learning Benchmarks Papers

arXiv cs.AI·May 19

CAREBench: Evaluating LLMs' Emotion Understanding by Assessing Cognitive Appraisal Reasoning

CAREBench is a benchmark evaluating LLMs' emotion understanding through cognitive appraisal reasoning. Tested on 6 models with complete inferential chain annotations (first/third-person perspectives), it shows stronger models match humans on some tasks but fall short on appraisal reasoning and positive emotion recognition.

Benchmarks Evals Reasoning

arXiv cs.AI·May 19

CatalyticMLLM: A Graph-Text Multimodal Large Language Model for Catalytic Materials

QE-Catalytic-V2 is a unified graph-text multimodal LLM for catalytic materials. It integrates property prediction and inverse design in a single shared representation space, eliminating distribution shifts between decoupled models. Demonstrates superior performance on relaxed-energy prediction and inverse design tasks.

Papers Benchmarks Vision

arXiv cs.AI·May 19

Response-free item difficulty modelling for multiple-choice items with fine-tuned transformers: Component-wise representation and multi-task learning

Response-free item difficulty modelling for multiple-choice questions using fine-tuned transformers. End-to-end approach on item wording eliminates manual feature engineering. Multi-task variant with auxiliary QA objective delivers significant improvements in small-sample regimes.

Fine-tuning Benchmarks

arXiv cs.AI·May 19

Reasoning Before Diagnosis: Physician-Inspired Structured Thinking for ECG Classification

CardioThink, a physician-inspired MLLM framework, structures ECG diagnosis through explicit reasoning stages (rhythm, conduction, morphology, impression) to enhance interpretability. Structured Set Policy Optimization (SSPO) aligns clinical reasoning without manual annotations, outperforming direct prediction approaches across ECG benchmarks.

Reasoning Vision Reinforcement learning

arXiv cs.AI·May 19

QQJ: Quantifying Qualitative Judgment for Scalable and Human-Aligned Evaluation of Generative AI

QQJ is an evaluation framework for generative AI combining expert-designed multi-dimensional rubrics and LLM evaluator calibration on small high-quality annotation sets. Tested on text and image generation, QQJ shows stronger alignment with human judgment than traditional automatic metrics and unconstrained LLM-based evaluators.

Evals Benchmarks Alignment

arXiv cs.AI·May 19

Multi-Party Multi-Objective Optimization as Consensus Search: Runtime Analysis of Cross-Party Recombination

Theoretical study of multi-objective evolutionary algorithms for multi-party optimization (MPMOP). On MP-JCG benchmark, payoff-guided mutation requires Θ(n²) fitness evaluations to cross a gap region, while CPR-NSGA-II achieves O(n log n) via cross-party recombination. Runtime analysis on BPBOMST (multi-party minimum spanning tree) with instance-parameterized bounds.

Multi-agent Benchmarks Papers

arXiv cs.AI·May 19

NeuSymMS: A Hybrid Neuro-Symbolic Memory System for Persistent, Self-Curating LLM Agents

NeuSymMS is a hybrid neuro-symbolic memory system for LLM agents. It couples neural fact extraction from dialogue with a CLIPS-based expert system that classifies, deduplicates, and reconciles facts. Knowledge is stored as subject-relation-value triples in a relational database, with short/long-term memory and access-based promotion.

AI Agents RAG Reasoning

arXiv cs.AI·May 19

Multimodal Cultural Heritage Knowledge Graph Extension with Language and Vision Models

Novel approach to extend Knowledge Graphs for French cultural heritage. Authors introduce WJoconde, a multimodal KG integrating text and images, with three variants and a benchmark for Knowledge Graph Completion. They propose a framework combining LLMs and Vision-Language Models for automated data extraction and validation, improving KG reliability.

Vision RAG Benchmarks

arXiv cs.AI·May 19

Divergence-Suppressing Couplings for Rectified Flow

Authors identify that trajectory entanglement in Rectified Flow stems from nonzero divergence regions in the learned velocity field. They propose an offline correction that attenuates the divergent component during coupling generation, with no deployment overhead. Improvements validated on 2D benchmarks and image generation.

Image generation Papers Benchmarks

arXiv cs.AI·May 19

LLM-Guided Communication for Cooperative Multi-Agent Reinforcement Learning

LMAC leverages LLM reasoning to design communication protocols in MARL, enabling agents to reconstruct the underlying state uniformly and accurately. The approach iteratively refines protocols using an explicit state-awareness criterion. Experiments on MARL benchmarks demonstrate substantial performance gains over prior baselines.

Multi-agent Reinforcement learning Reasoning

arXiv cs.AI·May 19

LAST-RAG: Literature-Anchored Stochastic Trajectory Retrieval-Augmented Generation for Knowledge-Conditioned Degradation Model Selection

LAST-RAG is a method for selecting stochastic degradation models to estimate remaining useful life (RUL). It combines observed trajectories and domain context via retrieval from a local evidence bank, with an RCRUS mechanism to prevent premature model elimination. In experiments it outperforms statistical and prognostic baselines.

RAG Reasoning Benchmarks

arXiv cs.AI·May 19

Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs

GVG (Generative Visual Grounding) uses an EEG-to-image generative model to translate brain activity into visual images, bypassing text-only alignment. Tested on GVG-X-Omni (170M tuned params) and GVG-Janus (trimodal), the framework improves EEG understanding and visual generation by leveraging MLLMs' visual priors.

Vision Multi-agent Embeddings

arXiv cs.AI·May 19

Efficient Lookahead Encoding and Abstracted Width for Learning General Policies in Classical Planning

New approach for learning generalized policies in classical planning using Relational Graph Neural Networks (R-GNNs). Authors introduce efficient lookahead search encoding and relational abstraction to improve scalability on the IPC 2023 benchmark. The reported results outperform the classical planner LAMA.

Reasoning Benchmarks Papers

arXiv cs.AI·May 19

New Insight of Variance reduce in Zero-Order Hard-Thresholding: Mitigating Gradient Error and Expansivity Contradictions

New zeroth-order hard-thresholding algorithm with variance reduction for ℓ0-constrained optimization. Addresses SZOHT's limitation on random directions by mitigating conflict between ZO gradient deviation and hard-thresholding expansivity. Improved convergence rates validated on ridge regression and black-box adversarial attacks.

Reinforcement learning

arXiv cs.AI·May 19
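
For readers unfamiliar with hard thresholding, the operator itself is simple: keep the k largest-magnitude entries of a vector and zero the rest, which enforces an ℓ0 constraint. The sketch below shows only that operator, under a hypothetical function name; it does not implement the zeroth-order gradient estimate or the variance reduction the paper proposes.

```python
def hard_threshold(x, k):
    """Keep the k entries of x with largest absolute value; zero out the others."""
    if k <= 0:
        return [0.0] * len(x)
    # Indices sorted by magnitude, largest first; ties keep the earlier index.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]

sparse = hard_threshold([0.5, -3.0, 2.0, 0.1], 2)
```

The operator is not a contraction in general, which is the "expansivity" the summary refers to: a noisy gradient step followed by thresholding can select the wrong support.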

DocOS: Towards Proactive Document-Guided Actions in GUI Agents

DocOS is a benchmark evaluating GUI agents capable of proactively searching online documentation to solve long-tailed tasks. Experiments reveal two bottlenecks: difficulty reliably locating relevant information and faithfully grounding retrieved instructions into precise GUI actions.

AI Agents Benchmarks Reasoning

arXiv cs.AI·May 19

Learning to Solve Compositional Geometry Routing Problems

Study of Compositional Geometry Routing Problem (CGRP), a generalization of routing problems covering points, lines, areas, and hybrid geometries. Proposes DiCon, a solver with differential attention and contrastive learning to handle asymmetry and enlarged action spaces. Results show strong performance, versatility, and superior generalization across diverse instances.

Papers Reasoning
