arXiv cs.AI·

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture

Survey paper proposing the Intelligent Computing Architecture Model (ICAM), a six-layer framework for model-native computing. Maps classical computer architecture concepts to LLM systems (cache management, context, agents). Introduces three design laws: the Semantic Locality Law, the Context Budget Law, and the Agent Speedup Law. Distinguishes a probabilistic execution plane from a deterministic control plane.

AI Agents · Multi-agent · Reasoning
SIG 72 · HYP 25
arXiv cs.CL·

TCAR-Gen: Temporal Graph Retrieval with Evidence Fusion for Knowledge-Grounded Generation

TCAR-Gen combines query-conditioned graph neural networks, temporal evidence fusion, and chain-of-trees reasoning for retrieval-augmented generation. Achieves 0.3738 Recall@5 on the Victorian Crime Diaries benchmark, outperforming Vanilla RAG, Temporal RAG, and GraphRAG variants. Cross-model evaluation, from GPT-OSS 20B down to TinyLlama 1.1B, shows robust retrieval coverage at smaller scales.

RAG · Reasoning · Benchmarks
SIG 72 · HYP 18
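
For readers unfamiliar with the Recall@5 figure quoted in the TCAR-Gen summary, here is a minimal sketch of one common definition: the fraction of relevant items that appear among the top k retrieved results. The function name, the document IDs, and this exact definition are illustrative assumptions; the summary does not state how the benchmark computes the metric.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of relevant items found among the top-k retrieved items.

    `retrieved` is a ranked list (best first); `relevant` is any iterable
    of gold item IDs. Both are hypothetical inputs for illustration.
    """
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant must contain at least one item")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical ranking: two of the three relevant documents are in the top 5.
ranked = ["d3", "d1", "d7", "d2", "d9", "d4"]
gold = {"d1", "d2", "d4"}
score = recall_at_k(ranked, gold, k=5)
print(f"Recall@5 = {score:.4f}")
```

Under this definition, a benchmark-level score such as 0.3738 would be the average of per-query values, though the averaging scheme is also an assumption here.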
arXiv cs.CL·

AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection

AEyeDE introduces an attention-based attribution framework for detecting AI-generated text using attention matrices from a proxy Transformer model. A lightweight CNN learns discriminative representations from these attribution maps. The method outperforms text-only baselines, shows strong generator-specific detection, and demonstrates robustness under cross-dataset transfer and spelling perturbations.

Papers · AI safety · Evals
SIG 72 · HYP 18
arXiv cs.CL·

Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval

DOPA, a demonstration retrieval framework, uses an out-of-distribution (OOD) proxy to approximate the inaccessible target domain and guide the selection of relevant demonstrations. A Mahalanobis distance-based global diversity constraint ensures sufficient variety among retrieved examples. Reports positive results across multiple LLMs and tasks under severe distribution shift.

Prompt engineering · Benchmarks · Papers
SIG 72 · HYP 18
arXiv cs.AI·

Product-Aware Deep Autoencoders for Robust Process Monitoring in Multi-Product Cyber-Physical Systems

Academic paper proposing product-aware autoencoders for anomaly detection in multi-product cyber-physical systems. Traditional global models create blind spots where attacks can evade detection. On the Tennessee Eastman Process benchmark, the product-aware model achieves 100% detection accuracy in attack scenarios, versus 22.2% for the global baseline.

Benchmarks · AI safety · Evals
SIG 72 · HYP 15
arXiv cs.CL·

BOUTEF: A Multilingual Corpus for FakeNews in North Africa -- Language as a Weapon

BOUTEF is a multilingual corpus from two countries (Algeria and Tunisia) covering fake news, authentic narratives, comments, and debunking. Includes MSA, Algerian and Tunisian dialects, Arabizi, French, English, and code-switching. Analysis shows fake news relies on emotionally charged narratives and sensational framing, while debunking adopts a factual, verification-oriented style.

Papers · Benchmarks · AI safety
SIG 72 · HYP 18
arXiv cs.LG·

Adversarially Robust Control of Conditional Value-at-Risk via Rockafellar-Uryasev Conformal Inference

Online, distribution-free framework for controlling Conditional Value-at-Risk (CVaR) in non-stationary and adversarial environments. Combines conformal tail risk control, online learning, and the Rockafellar-Uryasev variational representation of CVaR. Provides provable safety guarantees for nonlinear tail risk under arbitrary data-generating processes. Applications: portfolio risk management and LLM toxicity mitigation.

Papers · AI safety · Reasoning
SIG 72 · HYP 15
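
As background on the Rockafellar-Uryasev representation named in the CVaR summary, here is a minimal sketch of the standard identity CVaR_alpha(L) = min over t of { t + E[max(L - t, 0)] / (1 - alpha) }, evaluated on an empirical sample. This is not the paper's online conformal procedure; the function name and the sample losses are illustrative assumptions.

```python
def cvar_rockafellar_uryasev(losses, alpha):
    """Empirical CVaR at level alpha via the Rockafellar-Uryasev minimization.

    The objective t + mean(max(loss - t, 0)) / (1 - alpha) is convex and
    piecewise linear in t, so on a finite sample its minimum is attained at
    one of the sample values. Searching over the samples is therefore exact.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)

    def objective(t):
        mean_excess = sum(max(loss - t, 0.0) for loss in losses) / n
        return t + mean_excess / (1.0 - alpha)

    return min(objective(t) for t in losses)


# Hypothetical sample: losses 1..10. At alpha = 0.8 the worst 20% of
# outcomes are {9, 10}, so the result should match their average.
sample = [float(x) for x in range(1, 11)]
cvar_80 = cvar_rockafellar_uryasev(sample, alpha=0.8)
print(f"CVaR at 0.8 = {cvar_80:.2f}")
```

The appeal of this form is that it turns a tail expectation into a minimization over a single scalar threshold, which is the kind of quantity an online method can update step by step.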
Reddit r/MachineLearning·

How much of MLE-Bench's gains are the algorithm vs. better models + more search? [R]

MLE-Bench shows 80% gains over two years, but new research (FML-Bench) finds that little of this comes from real algorithmic progress. At an equal step budget and with identical models, the two-year-old AIDE algorithm matches modern agent and evolutionary search systems. FML-Bench unifies code-editing agents, step definitions, and val/test splits to benchmark algorithmic efficiency.

Benchmarks · AI Agents · Evals
SIG 72 · HYP 25