arXiv cs.AI

Action-Gradient Monte Carlo Tree Search for Non-Parametric Continuous (PO)MDPs

Action-Gradient MCTS (AGMCTS) combines global tree search with local gradient-based action refinement for online planning in continuous spaces. It makes three theoretical contributions: an action score gradient theorem, a Multiple Importance Sampling Tree for sample reuse, and tractable gradients via the Area Formula. It outperforms state-of-the-art sample-based solvers on continuous MDP/POMDP benchmarks.

Reasoning · Reinforcement learning · Papers
SIG 72 · HYP 18
arXiv cs.AI

Unveiling Memorization-Generalization Coexistence: A Case Study on Arithmetic Tasks with Label Noise

A study of memorization-generalization coexistence in over-parameterized neural networks. With 80% label noise on arithmetic tasks, models memorize the noisy labels while maintaining an internal generalization structure. Frequency-based extraction achieves near-perfect accuracy. The authors also propose a task-agnostic partitioning into generalization and memorization components.

Papers · Evals · Alignment
SIG 72 · HYP 15
arXiv cs.CL

RAGA: Reading-And-Graph-building-Agent for Autonomous Knowledge Graph Construction and Retrieval-Augmented Generation

RAGA is an LLM-based autonomous agent for knowledge graph construction and retrieval-augmented generation. It replaces stateless batch pipelines with a ReAct loop supporting full CRUD operations, hybrid KG-vector synchronization, and evidence-anchored verification linked to source text. Experiments on QASPER show measurable gains in answer and evidence quality.

AI Agents · RAG · Reasoning
SIG 72 · HYP 28
arXiv cs.AI

An Empirical Study of Privacy Leakage Chains via Prompt Injection in Black-Box Chatbot Environments

Empirical study of privacy-leakage chains via prompt injection in black-box chatbot environments. Researchers analyze how attackers can hijack LLM agent tasks by injecting malicious content into external sources. They introduce the 'exemplification' technique and demonstrate a functional data-exfiltration chain combining prompt injection, jailbreaking, and web-tool invocation.

AI Agents · Prompt engineering · AI safety
SIG 72 · HYP 25
arXiv cs.AI

Peak-Detector: Explainable Peak Detection via Instruction-Tuned Large Language Models in Physiological Sign

Peak-Detector uses instruction-tuned LLMs for explainable peak detection across physiological signals (ECG, PPG, BCG, BSG). A "peak-representation" technique compresses time series while preserving critical events. The model is optimized via supervised fine-tuning followed by multi-objective reinforcement learning, and is evaluated on 7 datasets (6 public benchmarks + 1 real-world cohort).

Reasoning · Fine-tuning · Reinforcement learning
SIG 72 · HYP 25