arXiv cs.AI

EmoMind: Decoding Affective Captions from Human Brain fMRI

EmoMind decodes affective captions directly from fMRI signals. The system first retrieves a neutral scene description from brain-decoded visual features, then rewrites it using a continuous 34-dimensional emotion vector extracted from the same fMRI recording. Evaluated on two independent emotion fMRI datasets, EmoMind outperforms a GPT-4 baseline that uses discrete emotion labels on every validation axis.

Vision · Reasoning · Evals
SIG 75 · HYP 25

arXiv cs.AI

CheckSupport: A Local LLM-Powered Tool for Automated Manuscript Submission Checklist Selection and Completion

CheckSupport is an open-source system that uses locally deployed LLMs to automate the recommendation and completion of reporting checklists for scientific manuscripts. Evaluated on peer-reviewed manuscripts, it achieves 90% accuracy on checklist recommendation and 88% on item-level completion, processing each manuscript in 12.5 seconds on CPU-only hardware.

Llama · Prompt engineering · Evals
SIG 75 · HYP 15

arXiv cs.LG

When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

A study of adversarial action-removal attacks in self-play reinforcement learning, in which an attacker selectively masks legal actions from the victim's action set. In experiments on poker games (6 to 5,531 states) and two non-poker domains, learned masking causes substantially more damage than random masking, persists across Q-learning, PPO, NFSP, and DQN, transfers between agents, and is amplified by self-play.
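The contrast between random and targeted masking can be illustrated with a toy example. This is only a sketch under stated assumptions: the paper's attacker is learned, whereas `targeted_mask` below is a hand-written stand-in that simply removes the victim's highest-valued actions, and the action names, Q-values, and all function names are hypothetical.

```python
import random

def check_budget(legal, k):
    """The mask must leave the victim at least one legal action."""
    if not 0 <= k < len(legal):
        raise ValueError("mask must leave at least one legal action")

def random_mask(legal, k, rng):
    """Baseline attacker: remove k legal actions uniformly at random."""
    check_budget(legal, k)
    return legal - set(rng.sample(sorted(legal), k))

def targeted_mask(q_values, legal, k):
    """Stand-in for a learned attacker: remove the victim's k best actions."""
    check_budget(legal, k)
    best_first = sorted(legal, key=lambda a: q_values[a], reverse=True)
    return legal - set(best_first[:k])

def victim_choice(q_values, legal):
    """Victim acts greedily over whatever actions remain legal."""
    return max(legal, key=lambda a: q_values[a])

# Hypothetical action values for a single poker decision point.
q_values = {"fold": 0.0, "call": 0.4, "raise": 1.0}
legal = set(q_values)
rng = random.Random(0)

after_random = random_mask(legal, 1, rng)
after_targeted = targeted_mask(q_values, legal, 1)

# With "raise" removed, the greedy victim falls back to "call".
print(victim_choice(q_values, after_targeted))
```

A random mask only sometimes removes the best action, while the targeted mask always does, which is one intuition for why a selective attacker could do more damage than a random one.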

Reinforcement learning · AI safety · Benchmarks
SIG 75 · HYP 15

arXiv cs.AI

Beyond Imperfect Alternatives with Rulemapping: A Neuro-Symbolic Case Study on Online Hate Speech

A neuro-symbolic study comparing LLMs constrained by deterministic logic scaffolds (Rulemapping) with unconstrained prompting for hate speech moderation under the German Criminal Code (§130). Rulemapping achieves precision of 0.80–0.86 and recall of 0.82–0.89, versus 0.34–0.49 with unconstrained prompting, and eliminates the conflation of moral offense with illegality.

Reasoning · AI safety · Regulation
SIG 75 · HYP 15

arXiv cs.AI

CounterRefine: Answer-Conditioned Counterevidence Retrieval for Inference-Time Knowledge Repair in Factual Question Answering

CounterRefine is a lightweight repair layer for RAG that treats the first answer as a hypothesis to test. The system issues answer-conditioned expansion queries to retrieve candidate-specific evidence, then applies a deterministically validated KEEP/REVISE refinement step. On SimpleQA, it improves on a matched one-pass RAG baseline by up to 5.8 points in correct rate.
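One way to picture a deterministically validated KEEP/REVISE step is a strict parser that falls back to the first answer whenever the model's verdict is malformed or unsupported. The sketch below is an assumption, not the paper's implementation: the `KEEP` / `REVISE: <answer>` output format, the rule that a revision must appear in a retrieved passage, and the name `apply_decision` are all hypothetical.

```python
import re

# Hypothetical verdict format: "KEEP" or "REVISE: <new answer>" on one line.
DECISION = re.compile(r"^(KEEP|REVISE)(?::\s*(.+))?$")

def apply_decision(first_answer, model_output, evidence):
    """Return the final answer after a validated KEEP/REVISE step."""
    match = DECISION.match(model_output.strip())
    if match is None:
        return first_answer  # malformed verdict: keep the original answer
    verdict, revised = match.group(1), match.group(2)
    if verdict == "KEEP" or not revised:
        return first_answer
    revised = revised.strip()
    # Assumed validation rule: accept a revision only if at least one
    # retrieved passage literally mentions it.
    if any(revised.lower() in passage.lower() for passage in evidence):
        return revised
    return first_answer

evidence = ["The capital of France is Paris."]
print(apply_decision("Lyon", "REVISE: Paris", evidence))  # Paris
print(apply_decision("Lyon", "REVISE: Nice", evidence))   # Lyon
```

Because every path other than a well-formed, evidence-supported revision returns the first answer, the refinement step in this sketch cannot replace an answer on the strength of free-form model text alone.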

RAG · Reasoning · Evals
SIG 75 · HYP 15

arXiv cs.AI

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

The study challenges the assumption that Mixture-of-Experts models achieve domain specialization through sparse routing. The COMMITTEEAUDIT framework reveals a domain-invariant "Standing Committee": a compact coalition of experts that captures most of the routing mass across domains, layers, and budgets. Domain-specific knowledge is handled only by peripheral experts.
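The idea of a domain-invariant coalition can be made concrete with a toy calculation. This is a sketch only: it is not the COMMITTEEAUDIT procedure, the routing masses are made-up numbers, and the names `top_k_experts`, `standing_committee`, and `committee_share` are hypothetical. It treats the committee as the set of experts that rank in the top k by routing mass in every domain.

```python
def top_k_experts(mass, k):
    """Indices of the k experts with the largest routing mass."""
    ranked = sorted(range(len(mass)), key=lambda e: mass[e], reverse=True)
    return set(ranked[:k])

def standing_committee(routing_mass, k):
    """Experts that rank in the top k for every domain."""
    per_domain = [top_k_experts(mass, k) for mass in routing_mass.values()]
    return set.intersection(*per_domain)

def committee_share(mass, committee):
    """Fraction of a domain's routing mass captured by the committee."""
    return sum(mass[e] for e in committee) / sum(mass)

# Made-up per-expert routing mass for two domains and five experts.
routing_mass = {
    "code": [0.40, 0.35, 0.05, 0.15, 0.05],
    "law":  [0.38, 0.37, 0.15, 0.05, 0.05],
}

committee = standing_committee(routing_mass, k=3)
print(sorted(committee))  # [0, 1]
for domain, mass in routing_mass.items():
    print(domain, round(committee_share(mass, committee), 2))
```

In this toy data, experts 0 and 1 take three quarters of the routing mass in both domains, while experts 2 and 3 each matter in only one domain, mirroring the committee-versus-periphery picture the summary describes.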

Benchmarks · Papers
SIG 75 · HYP 15

arXiv cs.AI

Herding CATs: ALARA for Agent Harness Engineering in Portable Composable Multi-Agent Teams

The paper introduces CAT (Context-Agent-Tool), a data layer for managing multi-agent teams, and applies the ALARA principle (exposure as low as reasonably achievable) to context. It evaluates 22 models (0.6B–35B parameters) on 115 practical tasks via npcsh, a CLI shell. Roughly 2,500 executions cover file operations, web search, multi-step scripting, tool chaining, and inter-agent delegation.

Multi-agent · AI Agents · Tools
SIG 75 · HYP 15