
Reddit r/LocalLLaMA

I was a Data Scientist for 10 years before becoming a quadriplegic. For the past 3 months, I built VibeETL from scratch: A lightning-fast, visual Alteryx alternative powered by Polars & React Flow.

VibeETL: open-source visual ETL platform built in 3 months by former data scientist. Polars + Rust backend, React Flow frontend with native BFS layout algorithm. Zero external dependencies, sandboxed Python execution (30s timeout). Lightweight Alteryx alternative.
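The summary names a BFS layout algorithm but does not describe it. As a rough illustration only, the sketch below shows one way a breadth-first pass could place the nodes of a pipeline graph into columns by depth. The function name `bfs_layout`, the spacing parameters, and the sample node names are hypothetical and are not taken from VibeETL.

```python
from collections import deque


def bfs_layout(edges, roots, x_gap=200, y_gap=100):
    """Assign (x, y) positions by BFS depth: column = depth, row = visit order.

    Hypothetical sketch, not VibeETL's implementation.
    """
    # Build an adjacency list from (source, target) pairs.
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])

    # Standard breadth-first search starting from the root nodes.
    depth = {}
    queue = deque()
    for root in roots:
        if root not in depth:
            depth[root] = 0
            queue.append(root)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)

    # Nodes at the same depth share a column and stack vertically.
    rows_used = {}
    positions = {}
    for node, d in depth.items():  # dict order equals BFS visit order
        row = rows_used.get(d, 0)
        rows_used[d] = row + 1
        positions[node] = (d * x_gap, row * y_gap)
    return positions


edges = [("read", "filter"), ("read", "join"), ("filter", "join"), ("join", "write")]
print(bfs_layout(edges, roots=["read"]))
```

Because BFS depth is the shortest distance from a root, `join` lands in the same column as `filter` here even though an edge connects them; a production layout would likely need extra handling for such cases.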

Open source · Tools · Infrastructure
SIG 72 · HYP 45

arXiv cs.LG

Unicorn: Scaling High-Dimensional Time Series Forecasting via Universal Correlation Modeling

Unicorn, a multi-dataset pretraining framework, bridges the trade-off between channel-independent models (scalable but ignoring dependencies) and channel-dependent models (expressive but dimension-bounded). Using a latent prototype codebook, it projects heterogeneous channels into a shared space to learn identity-agnostic, reusable correlation patterns transferable across domains.

Papers · Benchmarks · Fine-tuning
SIG 72 · HYP 28

arXiv cs.AI

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

COLLEAGUE.SKILL is an automated trace-to-skill distillation system for generating person-grounded AI skills via expert knowledge extraction. The system produces versioned packages with two coordinated tracks: capability (practices, mental models, decision heuristics) and bounded behavior (communication style, interaction rules). 18.5k GitHub stars, 215 skills from 165 contributors.

AI Agents · Prompt engineering · Open source
SIG 72 · HYP 25

arXiv cs.AI

Healthcare Mechanisms from Policy-as-Code Search under Strategic Provider Response

Researchers reframe healthcare mechanism design as program synthesis for LLMs. Medi-Sim, a multi-agent simulator, evaluates rule programs against strategic provider responses (coding, selection, delay, effort, triage). LLM-guided evolutionary code search synthesizes a mixed-objective program that eliminates up-coding, halves rejections, and retains baseline profitability.

AI Agents · Multi-agent · Code generation
SIG 72 · HYP 25

arXiv cs.LG

Gait2Hip-60: A Unified Deep Learning Benchmark for Predicting Hip Muscle Forces and Joint Moments from Multi-Cadence Gait Kinematics

Unified Gait2Hip-60 benchmark comparing LSTM, Transformer, and Mamba to predict hip muscle forces and joint moments from gait kinematics. Transformer outperforms other models (R²=0.819 for forces, R²=0.862 for moments). External validation on 9 femoral head osteonecrosis patients shows moderate generalization (R²=0.537–0.569).

Benchmarks · Reasoning
SIG 72 · HYP 18

arXiv cs.AI

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

SCALE is a self-improving framework for web agents using MLLMs. It employs three adversarial roles (Selector, Predictor, Judger) to autonomously explore agent limitations and expand cognitive boundaries. SCALE-Hop optimizes global planning via graph exploration. A SCALE-20k dataset from 19 real websites with 20k structured demonstrations validates the approach across multiple MLLMs.

AI Agents · Vision · Reinforcement learning
SIG 72 · HYP 35

arXiv cs.AI

Gradient-Free Training of Spiking Neural Networks via Low-Rank Evolution Strategies

EGGROLL, a low-rank factorization of Evolution Strategies perturbations, reduces memory complexity from O(mn) to O(r(m+n)) for gradient-free training of Spiking Neural Networks. On N-MNIST, the method achieves 79.21% test accuracy with 2.23× speedup versus full-rank ES, enabling on-chip learning on neuromorphic hardware without surrogate gradients.
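The memory claim can be illustrated with a small sketch. Assuming a weight matrix of shape m × n and a perturbation stored as two factors A (m × r) and B (n × r), the perturbation costs r(m + n) numbers instead of mn, and the perturbed layer output can be computed without ever forming the product of A and the transpose of B. The function names, the Gaussian sampling, and the absence of any scaling factor are assumptions made for illustration, not details from the paper.

```python
import random


def full_rank_size(m, n):
    """Numbers stored for a dense m x n perturbation."""
    return m * n


def low_rank_size(m, n, r):
    """Numbers stored for factors A (m x r) and B (n x r)."""
    return r * (m + n)


def sample_factors(m, n, r, rng):
    """Draw Gaussian factors; the sampling scheme is an assumption."""
    A = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(m)]
    B = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(n)]
    return A, B


def perturbed_forward(W, A, B, x, sigma):
    """Compute (W + sigma * A B^T) x as W x + sigma * A (B^T x).

    The dense m x n perturbation is never materialized.
    """
    r = len(A[0])
    # Project the input through B first: a length-r vector.
    Btx = [sum(B[j][k] * x[j] for j in range(len(x))) for k in range(r)]
    out = []
    for i in range(len(W)):
        base = sum(W[i][j] * x[j] for j in range(len(x)))
        correction = sum(A[i][k] * Btx[k] for k in range(r))
        out.append(base + sigma * correction)
    return out


m, n, r = 512, 256, 4
print(full_rank_size(m, n))    # 131072
print(low_rank_size(m, n, r))  # 3072
```

In an evolution-strategies loop, each population member would carry only its own small factor pair, which is where the memory saving would come from under these assumptions.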

Papers · Benchmarks · Reinforcement learning
SIG 72 · HYP 15

arXiv cs.AI

When LLM Reward Design Fails: Diagnostic-Driven Refinement for Sparse Structured RL

arXiv study on iterative refinement of LLM-generated reward functions for sparse structured RL. Authors identify two dominant failure modes (reward flooding, semantic misunderstanding) and propose diagnostic-driven refinement guided by failure-mode taxonomy. Results: DoorKey-8x8 improves from 2.3% to 97.6%, KeyCorridor from 31.2% to 86.7%. Limitations: method restricted to PPO and sparse structured tasks.

Reinforcement learning · Llama · Prompt engineering
SIG 72 · HYP 18

arXiv cs.AI

Diagnosing Failure Modes of Shared-State Collaboration in Resource-Constrained Visual Agents

CoSee, an auditing framework, analyzes failure modes of modular visual reasoning systems using shared working memory. On 4B–8B models, two dominant failure modes emerge: Noise Reinforcement (reusing ungrounded notes) and Policy Collapse (under-specified answers). The study shows naive shared workspaces amplify hallucinations without explicit verification.

Vision · AI Agents · Multi-agent
SIG 72 · HYP 18

arXiv cs.LG

Scientific Machine Learning for Engine Health Management and Remaining Useful Life Prediction

Scientific ML framework for turbine Remaining Useful Life (RUL) prediction. Shared encoder (CNN + bidirectional LSTM + attention pooling) with task-specific heads predicts turbine gas temperature, Delta TGT, and RUL with quantified uncertainty intervals. Evaluated on heterogeneous real-world fleet data using MAE, PICP, MPIW, and coverage-width criterion metrics.
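For readers unfamiliar with the metrics named above, here is a minimal sketch using their common definitions: MAE is the mean absolute error of point predictions, PICP (prediction interval coverage probability) is the fraction of true values that fall inside the predicted interval, and MPIW (mean prediction interval width) is the average interval width. The function names and sample numbers are illustrative assumptions; the summary does not state which exact variants the paper uses, and the coverage-width criterion is omitted here for that reason.

```python
def mae(y_true, y_pred):
    """Mean absolute error of point predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def picp(y_true, lower, upper):
    """Fraction of true values covered by [lower, upper]."""
    covered = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return covered / len(y_true)


def mpiw(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Made-up values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
lower = [0.0, 2.5, 2.0, 3.0]
upper = [2.0, 3.5, 4.0, 5.0]

print(mae(y_true, y_pred))         # 0.25
print(picp(y_true, lower, upper))  # 0.75
print(mpiw(lower, upper))          # 1.75
```

The two interval metrics pull in opposite directions: widening every interval raises PICP but also raises MPIW, which is why they are usually reported together.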

Reasoning · Multi-agent · Benchmarks
SIG 72 · HYP 15

arXiv cs.AI

HADT: A Heterogeneous Multi-Agent Differential Transformer for Autonomous Earth Observation Satellite Cluster

Novel transformer-based architecture for autonomous resource management in heterogeneous satellite clusters (optical and SAR). Uses model-free reinforcement learning for real-time decision-making in Earth Observation missions. Demonstrates significant performance improvements and transferability across varying cluster sizes.

Multi-agent · Reinforcement learning · Reasoning
SIG 72 · HYP 15