5589 articles
GitHub Trending·

NVlabs / Sana

NVIDIA Labs releases Sana, a linear diffusion transformer for efficient high-resolution image synthesis. Architecture reduces computational complexity while maintaining visual quality.

Image generation · Open source · Papers
SIG 75 · HYP 25
arXiv cs.AI·

UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities

UniversalRAG is a multi-modal RAG framework that retrieves and integrates knowledge from heterogeneous sources (text, images, videos) at variable granularities. It introduces modality-aware routing to avoid intra-modal bias and organizes each modality into granularity levels. Validated on 10 benchmarks, it outperforms single-modality and unified baselines.

RAG · Vision · Video generation
SIG 75 · HYP 25
arXiv cs.AI·

The Alpha Illusion: Reported Alpha from LLM Trading Agents Should Not Be Treated as Deployment Evidence

Critical study of LLM-based trading agents (FinCon, FinMem, TradingAgents, FinAgent, QuantAgent, FLAG-Trader). Reported Sharpe ratios do not constitute deployment evidence: temporal contamination, unmodeled frictions, and insufficient predictive calibration invalidate claims. Proposes P1-P6 protocol and modular architecture with LLM as audit interface.

AI Agents · Benchmarks · Evals
SIG 75 · HYP 15
arXiv cs.AI·

Data Presentation Over Architecture: Resampling Strategies for Credit Risk Prediction with Tabular Foundation Models

Comparative study of tabular foundation models (TFMs) vs classical models on credit default prediction. On Home Credit and Lending Club datasets, context construction strategy (balanced vs uniform sampling) explains more AUC-ROC variance than model choice: +3-4 AUC points. With 5K-10K balanced examples, TFMs match classical GBDTs while improving default-class recall.

Benchmarks
SIG 75 · HYP 15
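
The balanced-versus-uniform context construction described in the item above can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the function names `uniform_context` and `balanced_context`, the equal per-class quota, and the use of plain index lists are all assumptions made for the example.

```python
import random
from collections import defaultdict


def uniform_context(labels, k, seed=0):
    """Draw k row indices uniformly at random; class ratio follows the data."""
    rng = random.Random(seed)
    return rng.sample(range(len(labels)), k)


def balanced_context(labels, k, seed=0):
    """Draw k // n_classes indices per class (hypothetical equal quota),
    capped by the number of rows available in each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    per_class = k // len(by_class)
    picked = []
    for idxs in by_class.values():
        picked.extend(rng.sample(idxs, min(per_class, len(idxs))))
    rng.shuffle(picked)  # avoid grouping the context by class
    return picked


if __name__ == "__main__":
    # Toy data: 10% positive (default) class.
    labels = [0] * 90 + [1] * 10
    ctx = balanced_context(labels, 10)
    print(sum(labels[i] for i in ctx), "defaults out of", len(ctx))
```

With a rare default class, the balanced variant shows the model far more positive examples per context than uniform sampling would, which is one plausible reading of why default-class recall improves.
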
arXiv cs.AI·

Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation

RAT (Randomized Advantage Transformation) estimates Tikhonov-regularized natural policy gradients via direct backpropagation without explicit Fisher matrix construction. The method applies the Woodbury formula and randomized block Kaczmarz iterations on on-policy mini-batches. Results match or exceed established natural-gradient methods on continuous and visual control benchmarks.

Reinforcement learning · Reasoning · Papers
SIG 75 · HYP 15
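
For readers unfamiliar with the Woodbury step mentioned in the RAT item above, the standard identity for a Tikhonov-regularized natural gradient looks like this. The notation is assumed for illustration and may differ from the paper's: $J$ is an $n \times d$ matrix of per-sample score gradients from a mini-batch, $g$ is the policy gradient, and $\lambda > 0$ is the regularization strength.

```latex
% Assumed notation (not taken from the paper): J is n x d, g is in R^d, lambda > 0.
\[
F = \tfrac{1}{n} J^\top J, \qquad
(F + \lambda I_d)^{-1} g
= \frac{1}{\lambda}\Bigl[\, g - J^\top \bigl(J J^\top + n\lambda I_n\bigr)^{-1} J g \Bigr].
\]
```

The right-hand side only requires solving an $n \times n$ system in the batch dimension, so the $d \times d$ Fisher matrix never has to be formed; an iterative solver such as the block Kaczmarz scheme named in the summary could be applied to that smaller system.
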
arXiv cs.AI·

Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap

Six modern tabular foundation models form a highly redundant ensemble (mean Q-statistic 0.961). On 153 OpenML classification tasks, the best ensemble (two-level cascade stacking) gains +0.18% accuracy at 253× compute cost. Friedman-Nemenyi analysis places three ensembles and the best single model in the same equivalence group. Greedy selection is recommended as practical default.

Benchmarks · Papers
SIG 75 · HYP 15
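
The Q-statistic cited in the ensembling item above is presumably Yule's Q, a pairwise diversity measure computed from which test examples two classifiers get right. The sketch below assumes that definition and a simple mean over all model pairs; the function names are hypothetical and the paper's exact aggregation may differ.

```python
from itertools import combinations


def q_statistic(correct_a, correct_b):
    """Yule's Q for two classifiers, given per-example correctness flags.

    Values near 1 mean the two models succeed and fail on the same
    examples (redundant); 0 means independent; -1 means complementary.
    """
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1
        elif not a and not b:
            n00 += 1
        elif a:
            n10 += 1
        else:
            n01 += 1
    denom = n11 * n00 + n01 * n10
    if denom == 0:
        return float("nan")  # undefined when both products are zero
    return (n11 * n00 - n01 * n10) / denom


def mean_pairwise_q(correct_matrix):
    """Average Q over all unordered pairs of models (rows)."""
    pairs = list(combinations(correct_matrix, 2))
    return sum(q_statistic(a, b) for a, b in pairs) / len(pairs)
```

A mean close to 1, such as the 0.961 reported, indicates the models mostly err on the same examples, which is consistent with the small accuracy gain from ensembling.
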
arXiv cs.CL·

Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models

Researchers train KinGPT (25M parameters) on chess data and demonstrate that high benchmark scores of chess-trained LLMs stem primarily from pattern-matching rather than genuine rule understanding. LLM-Modulo, a verifier-in-the-loop framework, improves RedPajama 3B from 1.2% to 21.2% best-move accuracy. Training code, datasets, and model checkpoints open-sourced.

Benchmarks · Evals · Fine-tuning
SIG 75 · HYP 25
arXiv cs.AI·

CheckSupport: A Local LLM-Powered Tool for Automated Manuscript Submission Checklist Selection and Completion

CheckSupport is an open-source system using locally-deployed LLMs to automate reporting checklist recommendation and completion for scientific manuscripts. Evaluated on peer-reviewed manuscripts, it achieves 90% accuracy for checklist recommendations and 88% for item-level completion, processing each manuscript in 12.5 seconds on CPU-only hardware.

Llama · Prompt engineering · Evals
SIG 75 · HYP 15
arXiv cs.CL·

Self-Distilled Trajectory-Aware Boltzmann Modeling: Bridging the Training-Inference Discrepancy in Diffusion Language Models

TABOM, a post-training method for Diffusion Language Models, aligns optimization with the multi-step easy-to-hard decoding trajectory observed at inference. Via Boltzmann modeling of unmasking preferences, it derives a tractable pairwise ranking objective that reduces training-inference discrepancy and improves performance on new domains.

Fine-tuning · Reasoning · Papers
SIG 75 · HYP 15
arXiv cs.AI·

Beyond Imperfect Alternatives with Rulemapping: A Neuro-Symbolic Case Study on Online Hate Speech

Neuro-symbolic study comparing LLMs constrained by deterministic logic scaffolds (Rulemapping) versus unconstrained prompting for hate speech moderation under the German Criminal Code (§130). Rulemapping achieves precision 0.80-0.86 and recall 0.82-0.89 versus 0.34-0.49 with unconstrained prompting, eliminating conflation of moral offense with legal illegality.

Reasoning · AI safety · Regulation
SIG 75 · HYP 15
arXiv cs.AI·

EmoMind: Decoding Affective Captions from Human Brain fMRI

EmoMind decodes affective captions directly from brain fMRI signals. The system first retrieves a neutral scene description from brain-decoded visual features, then rewrites it using a continuous 34-dimensional emotion vector extracted from the same fMRI recording. Evaluated on two independent emotion fMRI datasets, EmoMind outperforms a GPT-4 baseline that uses discrete emotion labels across all validation axes.

Vision · Reasoning · Evals
SIG 75 · HYP 25
arXiv cs.CL·

Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models

Guided Topology Diffusion (GTD) uses graph diffusion models to dynamically generate optimal communication topologies for multi-agent LLM systems. The iterative framework, guided by a proxy model predicting multi-objective rewards (accuracy, utility, cost), adapts topologies to tasks without gradient-based optimization, outperforming static approaches.

Multi-agent · AI Agents · Benchmarks
SIG 75 · HYP 25