arXiv cs.AI

Experiment-as-Code Labs: A Declarative Stack for AI-Driven Scientific Discovery

The paper proposes Experiment-as-Code Labs, a paradigm in which scientific experiments are encoded as declarative configurations that compile to instrument APIs. AI agents formulate hypotheses, a systems layer performs program analysis and orchestration, and the experiments then execute through physical equipment control. The stack is presented as general-purpose: independent of science domain, lab type, and instrument.

AI Agents · Papers · Reasoning
SIG 72 · HYP 28
arXiv cs.AI

Missing Old Logits in Asynchronous Agentic RL: Semantic Mismatch and Repair Methods for Off-Policy Correction

Asynchronous RL pipelines for LLM agents lose the old logits that PPO requires for off-policy correction, which entangles discrepancy repair with staleness correction. The paper proposes three strategies for acquiring those logits (snapshot, dedicated model, interruption) and a revised PPO-EWMA method that preserves decoupled correction semantics.

AI Agents · Reinforcement learning · Reasoning
SIG 72 · HYP 15
arXiv cs.AI

How Wrong Can Your Counterfactual Be? Quantifying Confounding Bias for Continuous Treatments without a Control Group

The paper presents a causal inference framework for financial stress testing on panel data with a continuous treatment and no control group. It proposes a closed-form confounding envelope governed by two sensitivity parameters and combines partial identification with importance-weighted conformal prediction. On US unemployment data, it shows that standard predictive models remain causally biased.

Reasoning · Benchmarks · Papers
SIG 72 · HYP 15
arXiv cs.CL

Effort as Ceiling, Not Dial: Reasoning Budget Does Not Modulate Cognitive Cost Alignment Between Humans and Large Reasoning Models

Large Reasoning Models generate traces that align with human reaction times, and this alignment persists regardless of the inference-time reasoning budget. In a study across GPT-OSS-20B and GPT-OSS-120B, token allocation tracks human difficulty patterns and stays invariant across effort levels, suggesting that cognitive cost alignment is crystallized at training time.

Reasoning · Benchmarks · Papers
SIG 72 · HYP 15
arXiv cs.CL

Taming "Zombie" Agents: A Markov State-Aware Framework for Resilient Multi-Agent Evolution

AgentRevive introduces a Markov state-aware framework for resilient evolution of multi-agent LLM systems. Instead of aggressively pruning failing agents, the method uses soft state transitions (Active/Standby/Terminated) guided by a hallucination risk estimator. It is reported to outperform baselines on general reasoning, domain-specific tasks, and hallucination challenges while reducing token consumption.

Multi-agent · AI Agents · Reasoning
SIG 72 · HYP 25
arXiv cs.AI

EmergentBridge: Improving Zero-Shot Cross-Modal Transfer in Unified Multimodal Embedding Models

EmergentBridge improves unified multimodal embedding models on unpaired modality pairs (audio↔depth, infrared↔audio). The method learns a mapping that produces a 'noisy bridge anchor' and enforces alignment in the orthogonal subspace, preserving the existing anchor-alignment structure. Across 9 datasets, it outperforms baselines on zero-shot classification and retrieval.

Embeddings · Vision · Multi-agent
SIG 72 · HYP 18
arXiv cs.CL

AMATA: Adaptive Multi-Agent Trajectory Alignment for Knowledge-Intensive Question Answering

AMATA uses six specialized agents that collaboratively perform structured actions to improve factual consistency and reduce hallucinations in knowledge-intensive question answering. The system formalizes multi-agent collaboration as a trajectory preference alignment problem with intra-trajectory and inter-agent dependency learning.

AI Agents · Multi-agent · Reasoning
SIG 72 · HYP 28