arXiv cs.AI

Neural Estimation of Pairwise Mutual Information in Masked Discrete Sequence Models

A neural method for estimating pairwise conditional mutual information (MI) in masked diffusion models (MDMs). The framework uses hidden states from pretrained MDMs, supervised by ground-truth MI computed from the model's conditional distributions. Applied to Sudoku and protein sequence generation (ESM-C), MI-guided parallel decoding reduces inference forward passes by 3-5x while outperforming entropy-based methods.

Papers · Reasoning · Code generation
SIG 72 · HYP 18
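
For readers unfamiliar with the quantity the entry above refers to, here is a minimal sketch of the textbook definition of mutual information between two discrete variables, computed from a joint probability table. It is not the paper's neural estimator. The function name and the toy tables are hypothetical, and how such a table would be extracted from an MDM is not described in the summary.

```python
import math

def mutual_information(joint):
    """Return I(X;Y) in nats for a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]        # marginal over rows (X)
    py = [sum(col) for col in zip(*joint)]  # marginal over columns (Y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (px[i] * py[j]))
    return mi

# Toy tables (assumed for illustration only).
independent = [[0.25, 0.25], [0.25, 0.25]]  # X and Y unrelated
coupled = [[0.5, 0.0], [0.0, 0.5]]          # Y is a copy of X

print(mutual_information(independent))
print(mutual_information(coupled))
```

One plausible reading of MI-guided parallel decoding is that low MI between two positions makes them safer to decode in the same step, but the summary does not spell out the actual selection rule.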
arXiv cs.CL

Ishigaki-IDS-Bench: A Benchmark for Generating Information Delivery Specification from BIM Information Requirements

Ishigaki-IDS-Bench is a benchmark for evaluating generation of Information Delivery Specification (IDS) XML files from BIM requirements. On 166 expert-validated examples in English/Japanese, the 10 best LLMs reach 65.6% macro F1 for content agreement, but only 27.7% pass the IDS Content audit. Models struggle to generate XML conforming to IDS standards and IFC vocabulary constraints.

Benchmarks · Code generation · Papers
SIG 72 · HYP 15
arXiv cs.CL

Evaluation of Chunking Strategies for Effective Text Embedding in Low-Resource Language on Agricultural Documents

Comparative study of four chunking strategies (Recursive, Khmer-Aware, Sentence-Based, LLM-Based) for RAG on Khmer agricultural documents. Recursive chunking with 300 characters achieves best performance: L2 distance 0.4295, Answer Relevance 0.8663, Khmer IoU 0.6441. Statistically significant improvement over Sentence-Based (p=0.0121).

RAG · Embeddings · Benchmarks
SIG 72 · HYP 15
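
The entry above does not say how its "Recursive" strategy is implemented. The sketch below shows one common form of recursive character chunking with a 300-character limit: try coarse separators first, then fall back to finer ones for pieces that are still too long. The function name, the separator list, and the sample text are assumptions for illustration, not details from the study.

```python
def recursive_chunks(text, max_chars=300, separators=("\n\n", "\n", " ", "")):
    """Split text into pieces of at most max_chars characters.

    Separators are tried from coarsest to finest; the last one must be ""
    so that a hard fixed-width cut is always available as a fallback.
    """
    if len(text) <= max_chars:
        return [text] if text else []
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: cut at fixed width.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= max_chars:
            current = candidate  # keep packing parts into the open chunk
            continue
        if current:
            chunks.append(current)  # close the open chunk
            current = ""
        if len(part) <= max_chars:
            current = part
        else:
            # The part alone is too long: split it with finer separators.
            chunks.extend(recursive_chunks(part, max_chars, rest))
    if current:
        chunks.append(current)
    return chunks

sample = "word " * 200  # hypothetical 1000-character input
for chunk in recursive_chunks(sample, max_chars=300):
    print(len(chunk))
```

For scripts that do not separate words with spaces, the separator list would likely need adapting; which separators the study used is not stated in the summary.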
arXiv cs.CL

Broadening Access to Transportation Safety Data with Generative AI: A Schema-Grounded Framework for Spatial Natural Language Queries

A framework uses an LLM to translate natural language queries into deterministic spatial operations against a PostGIS database. Tested on Massachusetts transportation safety data (crash records, roadway attributes, schools, bus stops), the system validates 29% of erroneous queries through a rule-based layer, preserving reproducibility while democratizing data access.

RAG · AI Agents · Evals
SIG 72 · HYP 25
arXiv cs.LG

PeakFocus: Bridging Peak Localization and Intensity Regression via a Unified Multi-Scale Framework for Electricity Load Forecasting

PeakFocus is a unified framework for electricity load peak forecasting (ELPF) that predicts peak timing and intensity simultaneously. It combines a peak-aware pipeline with a triple hybrid loss, a multi-scale peak locator, and a location-aware decoder to overcome the limitations of two-stage approaches. Evaluated on the ELC and WLEL datasets.

Benchmarks · Papers
SIG 72 · HYP 18
arXiv cs.LG

Equilibrium Propagation and Hamiltonian Inference in the Diffusive Fitzhugh-Nagumo Model

Extends the Equilibrium Propagation framework to skew-gradient systems and demonstrates an equivalence between deep Energy-Based Models and Hamiltonian neural networks. Applied to diffusively coupled Fitzhugh-Nagumo neuron networks, it shows that stationary solutions admit a spatial Hamiltonian structure, enabling Hamiltonian Echo Backpropagation methods.

Papers · Reasoning · Reinforcement learning
SIG 72 · HYP 15
arXiv cs.AI

VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals

VBFDD-Agent is a large-language-model agent for fault detection and diagnosis of electric vehicle batteries. The system converts lithium-ion battery signals into natural language descriptions and integrates historical case retrieval with local maintenance manuals to generate structured, interpretable diagnostic results and maintenance recommendations.

AI Agents · RAG · Reasoning
SIG 72 · HYP 28
arXiv cs.AI

Declarative Data Services: Structured Agentic Discovery for Composing Data Systems

DDS (Declarative Data Services) is an architecture for structured agentic discovery of data-system compositions. Addressing unbounded agentic discovery failures, the framework decomposes search into typed sub-searches via four contracts (intent, operator DAG, skills, runtime attribution). Tested on a trading-backend workload, DDS converges where unbounded approaches fail.

AI Agents · Multi-agent · Papers
SIG 72 · HYP 18
arXiv cs.LG

I-SAFE: Wasserstein Coherence Metrics for Structural Auditing of Scientific AI Models

I-SAFE is a post-hoc auditing framework for scientific AI models based on the Wasserstein Coherence Metric (WCM). It evaluates whether model predictions reflect domain structure or exploit statistical shortcuts. Tested on drug-target interaction prediction (DeepConvDTI, DeepDTA, TAPB), it reveals distinct distributional response profiles invisible to accuracy metrics.

Evals · AI safety · Alignment
SIG 72 · HYP 15