5942 articles
arXiv cs.AI

Multi-Party Multi-Objective Optimization as Consensus Search: Runtime Analysis of Cross-Party Recombination

A theoretical study of multi-objective evolutionary algorithms for multi-party multi-objective optimization problems (MPMOPs). On the MP-JCG benchmark, payoff-guided mutation requires Θ(n²) fitness evaluations to cross a gap region, while CPR-NSGA-II achieves O(n log n) via cross-party recombination. The paper also gives a runtime analysis on BPBOMST (multi-party minimum spanning tree) with instance-parameterized bounds.

Multi-agent · Benchmarks · Papers
SIG 72 · HYP 08
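
To give a feel for the gap between the Θ(n²) and O(n log n) bounds quoted in the consensus-search item, the following sketch compares only the leading-order terms. The constant factors are set to 1 and the natural logarithm is used, both of which are assumptions made for illustration; the paper's actual constants are not given in the summary.

```python
import math


def quadratic_evals(n: int) -> float:
    """Leading-order term of a Theta(n^2) bound (constant factor assumed to be 1)."""
    return float(n * n)


def nlogn_evals(n: int) -> float:
    """Leading-order term of an O(n log n) bound (constant factor assumed to be 1)."""
    return n * math.log(n)


# The ratio simplifies to n / ln(n), so the gap widens as n grows.
for n in (100, 1_000, 10_000):
    ratio = quadratic_evals(n) / nlogn_evals(n)
    print(f"n={n:>6}: n^2 / (n ln n) = {ratio:,.1f}")
```

Because the ratio grows like n / ln n, the advantage attributed to cross-party recombination becomes more pronounced on larger instances, whatever the hidden constants turn out to be.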
arXiv cs.AI

Peak-Detector: Explainable Peak Detection via Instruction-Tuned Large Language Models in Physiological Sign

Peak-Detector uses instruction-tuned LLMs for explainable peak detection across physiological signals (ECG, PPG, BCG, BSG). A "peak-representation" technique compresses time series while preserving critical events. The model is optimized via supervised fine-tuning followed by multi-objective reinforcement learning, and is evaluated on 7 datasets (6 public benchmarks + 1 real-world cohort).

Reasoning · Fine-tuning · Reinforcement learning
SIG 72 · HYP 25
arXiv cs.AI

From Imitation to Interaction: Mastering Game of Schnapsen with Shallow Reinforcement Learning

Shallow neural network agents master the card game Schnapsen through reinforcement learning. RLBot, trained via asynchronous Monte Carlo updates, outperforms MLPBot (supervised imitation) and achieves statistically significant wins against RdeepBot, a search-based baseline. Combining learned value functions with deeper lookahead during gameplay improves performance.

Reinforcement learning · Benchmarks · Papers
SIG 72 · HYP 15
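
The Schnapsen item mentions Monte Carlo updates. As a rough illustration of that idea, here is a minimal every-visit Monte Carlo value update. It is a tabular stand-in, not the paper's shallow network or its asynchronous training loop; the state names, `alpha`, and `gamma` are hypothetical.

```python
from collections import defaultdict


def monte_carlo_update(values, episode, alpha=0.1, gamma=1.0):
    """Move each visited state's value toward the return observed after it.

    episode: list of (state, reward) pairs, where reward is received
    after acting in that state. Walking the episode backwards lets the
    return G be accumulated in a single pass.
    """
    g = 0.0
    for state, reward in reversed(episode):
        g = reward + gamma * g                      # return from this state onward
        values[state] += alpha * (g - values[state])  # step toward the observed return
    return values


# Hypothetical episode: no intermediate reward, +1 for winning at the end.
values = defaultdict(float)
episode = [("opening", 0.0), ("midgame", 0.0), ("endgame", 1.0)]
monte_carlo_update(values, episode)
print(dict(values))
```

Unlike temporal-difference methods, this update waits for the full outcome of a game before adjusting any estimate, which suits short card games with a clear terminal result.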
arXiv cs.AI

Distinguishable Deletion: Unifying Knowledge Erasure and Refusal for Large Language Model Unlearning

Distinguishable Deletion (D²) unifies knowledge deletion and refusal for LLM unlearning. The method uses an energy index to erase undesirable knowledge in latent representations rather than specific tokens, avoiding biased deletion and re-emergence of harmful content. Energy-based Unlearning Alignment (EUA) applies this mechanism at training and inference.

AI safety · Alignment · Papers
SIG 72 · HYP 25
arXiv cs.CL

"The Whole Is Greater Than the Sum of Its Parts": A Compatibility-Aware Multi-Teacher CoT Distillation Framework

COMPACT, a multi-teacher CoT distillation framework, adaptively fuses supervision from multiple LLMs into compact student models. It dynamically weights teacher gradients using three metrics: graph-based consensus, mutual-information-based adaptability, and loss-based difficulty. It reports state-of-the-art results across benchmarks while mitigating catastrophic forgetting.

Reasoning · Fine-tuning · Papers
SIG 72 · HYP 25
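
The COMPACT item describes weighting teacher gradients by three per-teacher metrics. The summary does not say how the metrics are computed or combined, so the sketch below is only one plausible shape: it assumes each metric is already a scalar score per teacher, sums them, and normalizes with a softmax. The function names and the sum-then-softmax rule are assumptions, not the paper's method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_teacher_gradients(gradients, consensus, adaptability, difficulty):
    """Combine per-teacher gradient vectors into one weighted gradient.

    gradients: one gradient vector (list of floats) per teacher.
    consensus, adaptability, difficulty: one scalar score per teacher.
    The combination rule (sum, then softmax) is assumed for illustration.
    """
    scores = [c + a + d for c, a, d in zip(consensus, adaptability, difficulty)]
    weights = softmax(scores)
    dim = len(gradients[0])
    fused = [sum(w * g[i] for w, g in zip(weights, gradients)) for i in range(dim)]
    return weights, fused


# Two hypothetical teachers with identical scores get equal weight,
# so the fused gradient is the plain average of their gradients.
weights, fused = fuse_teacher_gradients(
    gradients=[[1.0, 0.0], [0.0, 1.0]],
    consensus=[0.5, 0.5],
    adaptability=[0.2, 0.2],
    difficulty=[0.1, 0.1],
)
print(weights, fused)
```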
Reddit r/MachineLearning

We built a tool that installs frameworks like ComfyUI, Ollama, OpenWebUI etc on any cloud GPU in one command and saves your whole setup between sessions [R]

swm is an open-source tool automating framework installation (ComfyUI, Ollama, OpenWebUI, vLLM) on cloud GPUs in one command. It aggregates pricing across 10+ providers (RunPod, Vast.ai, Lambda), syncs workspaces via S3, and auto-terminates idle instances after 30 min to cut costs.

Tools · Open source · Infrastructure
SIG 72 · HYP 35
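
The swm item mentions auto-terminating instances after 30 minutes of idleness. The check behind such a policy can be sketched as follows. This is not swm's code or API; `should_terminate` and `IDLE_LIMIT` are hypothetical names, and how swm actually measures activity is not stated in the summary.

```python
from datetime import datetime, timedelta

# 30-minute idle limit, taken from the item summary.
IDLE_LIMIT = timedelta(minutes=30)


def should_terminate(last_activity: datetime, now: datetime,
                     idle_limit: timedelta = IDLE_LIMIT) -> bool:
    """Return True once the instance has been idle for at least idle_limit."""
    return now - last_activity >= idle_limit


# Hypothetical timestamps for illustration.
last_seen = datetime(2024, 1, 1, 12, 0)
print(should_terminate(last_seen, datetime(2024, 1, 1, 12, 29)))
print(should_terminate(last_seen, datetime(2024, 1, 1, 12, 30)))
```

Passing `now` explicitly rather than calling `datetime.now()` inside the function keeps the check easy to test.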
Reddit r/MachineLearning

Scaling LLMs horizontally: hidden-state coupling without weight modification [R]

Residual Coupling (RC) connects frozen language models in parallel via lightweight learned linear projections, without modifying their weights. Linear bridges read hidden states from one model and inject additive updates into another's residual stream. On medical data, RC reaches a perplexity of 11.02 versus 56.80 for MoE (reported as an 80.7% improvement), and improves TruthfulQA by 9.1 percentage points.

Llama · Multi-agent · Fine-tuning
SIG 72 · HYP 28
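
The Residual Coupling item describes linear bridges that read one model's hidden state and add an update to another model's residual stream. A minimal sketch of that single step, in plain Python, might look like the following. The dimensions, the bridge values, and the zero-initialization remark are assumptions for illustration; the post's actual layer placement and training procedure are not given in the summary.

```python
def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def couple(target_hidden, source_hidden, bridge):
    """Add a linear projection of the source state to the target's residual stream.

    bridge has shape (d_target x d_source). Both models stay frozen;
    in this sketch the bridge is the only thing that would be learned.
    """
    update = matvec(bridge, source_hidden)
    return [t + u for t, u in zip(target_hidden, update)]


source_hidden = [1.0, 2.0, 3.0]   # hypothetical d_source = 3
target_hidden = [0.5, -0.5]       # hypothetical d_target = 2

# A zero bridge leaves the target model's state unchanged (an assumed,
# commonly used starting point for additive adapters).
zero_bridge = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
unchanged = couple(target_hidden, source_hidden, zero_bridge)

# A non-zero bridge injects information from the source model.
bridge = [[0.1, 0.0, 0.0], [0.0, 0.0, -0.1]]
coupled = couple(target_hidden, source_hidden, bridge)
print(unchanged, coupled)
```

Because the update is purely additive, removing the bridge recovers the original frozen model exactly, which is consistent with the "without weight modification" claim in the item.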