5498 articles
arXiv cs.LG

ReTAMamba: Reliability-Aware Temporal Aggregation with Mamba for Irregular Clinical Time Series Prediction

ReTAMamba is a Mamba-based architecture for prediction on irregular clinical time series. The model estimates observation reliability from missingness and elapsed time, integrates short- and long-term information via Chronological Weaving, and uses a budgeted token router. On MIMIC-IV, eICU, and PhysioNet 2012, it reports AUPRC gains of 7.51%, 7.80%, and 10.15%, respectively.

Benchmarks · Reasoning · Papers
SIG 78 · HYP 15
arXiv cs.AI

DSPR: Dual-Stream Physics-Residual Networks for Trustworthy Industrial Time Series Forecasting

DSPR (Dual-Stream Physics-Residual Networks) is a forecasting framework that decouples stable temporal patterns from regime-dependent residual dynamics in industrial time series. Using an Adaptive Window module and a Physics-Guided Dynamic Graph, it achieves 99% Mean Conservation Accuracy and a 97.2% Total Variation Ratio across four industrial benchmarks.

Benchmarks · Reasoning · Infrastructure
SIG 78 · HYP 25
arXiv cs.AI

Attractor-Vascular Coupling Theory: Formal Grounding and Empirical Validation for AAMI-Standard Cuffless Blood Pressure Estimation from Smartphone Photoplethysmography

Attractor-Vascular Coupling Theory (AVCT) is a mathematical framework showing that cardiac attractor geometry encodes blood pressure information. A calibrated LightGBM model on smartphone PPG achieves an MAE of 2.05 mmHg (SBP) and 1.67 mmHg (DBP) under strict leave-one-subject-out cross-validation (46 subjects, 29,684 windows), meeting AAMI/IEEE SP10 criteria. A PPG-only ablation matches ECG+PPG within 0.05 mmHg.

Papers · Benchmarks · Evals
SIG 78 · HYP 15
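
Leave-one-subject-out cross-validation, as named in the summary above, holds out every window from one subject per fold so that no subject appears in both training and test data. The following is a minimal sketch of the splitting step only, assuming each window carries a subject label; the function name and the toy labels are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)

    for held_out, test_indices in by_subject.items():
        # Training data is every window that does not belong to the held-out subject.
        train_indices = [
            i
            for subject, indices in by_subject.items()
            if subject != held_out
            for i in indices
        ]
        yield held_out, train_indices, test_indices


# Hypothetical toy data: six windows recorded from three subjects.
window_subjects = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(window_subjects))
for held_out, train, test in folds:
    print(held_out, train, test)
```

Splitting by subject rather than by window matters here because windows from the same person are strongly correlated, so a per-window split would tend to overstate accuracy.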
arXiv cs.CL

BELIEF: Structured Evidence Modeling and Uncertainty-Aware Fusion for Biomedical Question Answering

BELIEF combines structured evidence modeling and uncertainty-aware fusion for biomedical question answering. The framework converts retrieved documents into evidence objects (clinical attributes, source quality, relevance, support strength) and fuses two reasoning paths: a symbolic path (Dempster-Shafer theory) and a neural path (LLM). It reports state-of-the-art results on PubMedQA, MedQA, and MedMCQA across five LLM backbones.

RAG · Reasoning · Evals
SIG 78 · HYP 15
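
For readers unfamiliar with Dempster-Shafer theory, the sketch below shows Dempster's rule of combination, the standard operator for merging two bodies of evidence in that theory. It is an illustration only: the summary does not say how BELIEF assigns masses or combines its paths, so the function name, the yes/no frame, and all mass values are assumptions made up for the example.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        overlap = set_a & set_b
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {hypothesis: mass / (1.0 - conflict) for hypothesis, mass in combined.items()}


# Hypothetical frame of discernment for a yes/no question.
YES, NO = frozenset({"yes"}), frozenset({"no"})
EITHER = YES | NO  # mass here means "uncommitted"

# Made-up masses from two independent evidence sources.
source_a = {YES: 0.6, EITHER: 0.4}
source_b = {YES: 0.5, NO: 0.3, EITHER: 0.2}

fused = combine_dempster(source_a, source_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

Mass left on the full set `EITHER` represents ignorance rather than a split vote, which is the main way this formalism differs from averaging two probability estimates.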
arXiv cs.CL

Systematic Evaluation of the Quality of Synthetic Clinical Notes Rephrased by LLMs at Million-Note Scale

This study systematically evaluates synthetic clinical notes rephrased by LLMs at million-note scale from the MIMIC databases. Synthetic notes preserve core clinical information for coarse-grained tasks but lose fine-grained details needed for ICD coding. Chunk-based rephrasing mitigates the detail loss but reduces factual precision under incomplete context.

Benchmarks · Evals · AI safety
SIG 78 · HYP 15
arXiv cs.CL

PROTEA: Offline Evaluation and Iterative Refinement for Multi-Agent LLM Workflows

PROTEA is an interface for offline debugging and refinement of multi-agent LLM workflows. It evaluates intermediate outputs with configurable rubrics, localizes bottlenecks via workflow graph visualization, and generates targeted prompt revisions. On two production-adjacent workflows, PROTEA improves document-inspection accuracy from 64.3% to 83.9% and recommendation Hit@5 from 0.30 to 0.38.

Multi-agent · AI Agents · Prompt engineering
SIG 78 · HYP 18
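
The Hit@5 figure quoted above is a ranking metric. A minimal sketch of one common definition follows: the fraction of queries for which at least one relevant item appears in the top k recommendations. Whether PROTEA's evaluation uses exactly this definition is an assumption, and the function name and toy data are hypothetical.

```python
def hit_at_k(ranked_lists, relevant_sets, k=5):
    """Fraction of queries with at least one relevant item in the top-k ranking."""
    if not ranked_lists:
        raise ValueError("need at least one query")
    hits = 0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        if any(item in relevant for item in ranked[:k]):
            hits += 1
    return hits / len(ranked_lists)


# Hypothetical toy data: four queries, six ranked recommendations each.
recommendations = [
    ["a", "b", "c", "d", "e", "f"],
    ["g", "h", "i", "j", "k", "l"],
    ["m", "n", "o", "p", "q", "r"],
    ["s", "t", "u", "v", "w", "x"],
]
relevant = [{"c"}, {"l"}, {"z"}, {"s", "y"}]

score = hit_at_k(recommendations, relevant, k=5)
print(score)  # 0.5: queries 1 and 4 hit; query 2's relevant item is ranked sixth
```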
arXiv cs.AI

Reliability and Effectiveness of Autonomous AI Agents in Supply Chain Management

This study evaluates autonomous AI agents in multi-echelon supply chains using the MIT Beer Game. Reasoning models reduce costs by 67% vs. human teams but reveal an 'agent bullwhip effect': decision unreliability amplifies across echelons. A GRPO-based reinforcement-learning post-training framework using system-level rewards improves reliability and reduces tail events.

AI Agents · Multi-agent · Reasoning
SIG 78 · HYP 25
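
As background for the 'agent bullwhip effect' mentioned above, the classic bullwhip effect is usually quantified as the ratio of order variance at an echelon to the variance of end-customer demand, with values above 1 indicating amplification. The sketch below computes that classic ratio only; how the paper measures its agent-specific variant is not stated in the summary, and the order series are made-up numbers.

```python
from statistics import pvariance


def bullwhip_ratio(orders, demand):
    """Variance of an echelon's orders divided by variance of customer demand."""
    demand_variance = pvariance(demand)
    if demand_variance == 0:
        raise ValueError("demand is constant; the ratio is undefined")
    return pvariance(orders) / demand_variance


# Hypothetical weekly series: demand steps up once, upstream orders overreact.
customer_demand = [4, 4, 8, 8, 8, 8]
retailer_orders = [4, 4, 12, 10, 8, 8]
wholesaler_orders = [4, 4, 16, 14, 6, 8]

retailer_ratio = bullwhip_ratio(retailer_orders, customer_demand)
wholesaler_ratio = bullwhip_ratio(wholesaler_orders, customer_demand)
print(round(retailer_ratio, 2), round(wholesaler_ratio, 2))
```

In this toy series the ratio grows from the retailer to the wholesaler, which is the upstream amplification pattern the Beer Game is designed to expose.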
arXiv cs.AI

MADP: A Multi-Agent Pipeline for Sustainable Document Processing with Human-in-the-Loop

MADP is a multi-agent architecture for enterprise document automation, combining deep learning classification and LLM extraction with human validation. Deployed on 955 real documents, it achieves 97% full-pipeline automation and reduces FTE requirements by 70%. With human-in-the-loop validation, it reaches 98.5% document-level accuracy, and it reports a 69% CO2 reduction vs. manual processing.

Multi-agent · AI Agents · Code generation
SIG 78 · HYP 25
arXiv cs.AI

CyberCorrect: A Cybernetic Framework for Closed-Loop Self-Correction in Large Language Models

CyberCorrect formalizes LLM self-correction as a closed-loop control system. A tri-modal error detector (self-consistency, verbalized confidence, logic-chain verification) and a type-directed correction controller achieve 79.8% accuracy on CyberCorrect-Bench (440 reasoning tasks), 6.2 percentage points above existing methods, while convergence control reduces overshoot by 41%.

Reasoning · Evals · Papers
SIG 78 · HYP 25