Reddit r/LocalLLaMA·

SkillOpt treats markdown skill files as trainable parameters with proper optimization machinery

SkillOpt treats markdown skill files as trainable parameters: a frontier model proposes bounded edits (add/delete/replace), and each edit is validated against a held-out test set. The best skills converge within 1–4 accepted edits from ~920 tokens. A skill optimized on Codex transfers to Claude Code without modification (+59.7 on SpreadsheetBench).
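
The propose-and-validate loop in the summary can be sketched roughly as follows. This is a minimal illustration, not SkillOpt's actual implementation: the `Edit` type, `apply_edit`, `optimize_skill`, and the `propose` and `score` callbacks are hypothetical names. The sketch assumes edits operate on lines and that an edit is accepted only when the held-out score strictly improves.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Edit:
    """One bounded edit to a skill file (hypothetical representation)."""
    op: str          # "add", "delete", or "replace"
    index: int       # line position the edit targets
    text: str = ""   # new line content for "add" and "replace"


def apply_edit(lines: List[str], edit: Edit) -> List[str]:
    """Return a new list of lines with the edit applied; the input is not mutated."""
    out = list(lines)
    if edit.op == "add":
        out.insert(edit.index, edit.text)
    elif edit.op == "delete":
        del out[edit.index]
    elif edit.op == "replace":
        out[edit.index] = edit.text
    else:
        raise ValueError(f"unknown edit op: {edit.op}")
    return out


def optimize_skill(
    skill: List[str],
    propose: Callable[[List[str]], Optional[Edit]],  # stands in for the frontier model
    score: Callable[[List[str]], float],             # stands in for held-out evaluation
    max_steps: int = 10,
) -> Tuple[List[str], int]:
    """Greedy loop: keep an edit only if the held-out score strictly improves."""
    best, best_score, accepted = list(skill), score(skill), 0
    for _ in range(max_steps):
        edit = propose(best)
        if edit is None:  # the proposer has nothing more to suggest
            break
        candidate = apply_edit(best, edit)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
            accepted += 1
    return best, accepted
```

In practice `propose` would call a model and `score` would run the agent on held-out tasks; both are left as plain callables here so the acceptance rule stays visible.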

AI Agents · Prompt engineering · Code generation
SIG 78 · HYP 25
arXiv cs.LG·

Interdomain Attention: Beyond Token-Level Key-Value Memory

Interdomain Attention merges transformers and state space models via kernel methods: attention features are projected onto basis functions maintained by an SSM, enabling query-conditioned attention over a fixed-size state. On FineWeb-Edu (125M–1.3B), it outperforms softmax baselines at 1.3B on validation perplexity and commonsense tasks, with length-flat behavior up to 3.5× the training context.

Reasoning · Benchmarks · Papers
SIG 78 · HYP 15
arXiv cs.CL·

Document Classification Pattern Recognition via Information Fusion: A Systematic Review of Multimodal and Multiview Representation Approaches

Systematic review of 139 studies on information fusion for document classification. A meta-analysis shows that multimodal fusion improves accuracy by +5.28 percentage points (p=0.0016) and multiview fusion improves it by +4.67%. Critical finding: only 11.8% of multimodal and 23.3% of multiview studies use statistical validation, which undermines reproducibility.

Benchmarks · Evals · Papers
SIG 78 · HYP 15
arXiv cs.AI·

Methods for Formal Verification of Agent Skills: Three Layers Toward a Mechanically Checkable Capability-Containment Proof

Formal verification paper for LLM agent skills. It presents three composable methods: a sound static capability-containment analysis via abstract interpretation, a refinement type system for tool-call envelopes, and SMT-bounded model checking. Open-source JavaScript implementation (enclawed framework) with 53 unit tests and an end-to-end CLI demo.

AI Agents · AI safety · Reasoning
SIG 78 · HYP 15
arXiv cs.LG·

Spectral Probe-Circuits: A Three-Step Recipe for Identifying Attention-Head Circuits in Pretrained Transformers

Method to identify attention-head circuits in pretrained transformers using a spectral signal (time-integrated participation ratio), task-pattern filtering, and group ablation against a matched-random control. Validated across 51M to 7B parameters, two architectures, and four pretraining pipelines. Finding: an induction circuit of 2–6 heads is causally necessary in all models tested (94–100% drop after ablation).
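
For readers unfamiliar with the term, the sketch below computes the standard participation ratio of a non-negative spectrum, (Σλ)² / Σλ². It is an assumption that the paper's spectral signal builds on this quantity; the time integration mentioned in the summary is not attempted here, and the function name is illustrative.

```python
from typing import Sequence


def participation_ratio(spectrum: Sequence[float]) -> float:
    """Standard participation ratio of a non-negative spectrum.

    Roughly counts how many components carry the mass: n for a flat
    spectrum of length n, 1 when a single component dominates.
    """
    total = sum(spectrum)
    sum_of_squares = sum(x * x for x in spectrum)
    if sum_of_squares == 0:
        raise ValueError("spectrum must contain at least one non-zero value")
    return (total * total) / sum_of_squares
```

A flat spectrum of four equal values gives 4.0, while a spectrum with a single non-zero value gives 1.0.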

Papers · Reasoning · Evals
SIG 78 · HYP 15
arXiv cs.CL·

Direct Preference Optimization for English-Mandarin Code-Switching Speech Recognition in Audio LLMs

Researchers apply Direct Preference Optimization (DPO) to improve English-Mandarin code-switching transcription in Audio LLMs. They identify three failure modes: language omission, translation instead of transcription, and hallucination. Training on 100K pairs (570 hours) reduces MER by up to 89.6% (in-distribution) and 20.0% (out-of-distribution).
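
The summary does not spell out the training objective, so the sketch below shows the standard DPO loss for a single preference pair. It assumes the chosen response is a faithful code-switched transcript and the rejected response exhibits one of the failure modes; the function name, the log-probability inputs, and `beta=0.1` are illustrative rather than taken from the paper.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the paper's setting is not given in the summary
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # How much more the policy prefers chosen over rejected than the reference does.
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log 2 when the policy matches the reference and falls as the policy assigns relatively more probability to the chosen transcript.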

Reinforcement learning · Alignment · Voice
SIG 78 · HYP 15
Reddit r/MachineLearning·

DCGAN inference on a microcontroller: 12.6M parameters, 512KB SRAM, 26-second generation, pure C [P]

DCGAN with 12.6M parameters runs on a RISC-V CH32H417 microcontroller (512KB SRAM). It generates 64×64 cat faces in 26 seconds using a pure C inference engine with int8 per-channel quantization. Weights are streamed from an SD card via double buffering. The Z vector is seeded with 200 bytes of quantum random data (ANU QRNG). No existing frameworks (TFLite, CMSIS-NN) were used; the engine was built from scratch.
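
As a rough illustration of int8 per-channel quantization, the sketch below gives each output channel its own scale. The project itself is written in C and the post's summary does not say whether the scheme is symmetric, so the symmetric [-127, 127] mapping and the function names here are assumptions.

```python
from typing import List, Tuple


def quantize_per_channel(
    weights: List[List[float]],
) -> Tuple[List[List[int]], List[float]]:
    """Symmetric int8 quantization with one scale per channel (assumed scheme)."""
    quantized, scales = [], []
    for channel in weights:
        max_abs = max((abs(w) for w in channel), default=0.0)
        # An all-zero channel gets scale 1.0 to avoid division by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        quantized.append(
            [max(-127, min(127, round(w / scale))) for w in channel]
        )
        scales.append(scale)
    return quantized, scales


def dequantize_per_channel(
    quantized: List[List[int]], scales: List[float]
) -> List[List[float]]:
    """Recover approximate float weights from int8 values and per-channel scales."""
    return [[q * s for q in channel] for channel, s in zip(quantized, scales)]
```

Per-channel scales keep small-magnitude channels from being crushed by a single global scale: the reconstruction error in each channel is bounded by half of that channel's own scale.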

Code generation · Benchmarks · Open source
SIG 78 · HYP 25
Reddit r/LocalLLaMA·

Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps

RTPurbo transforms full-attention LLMs into sparse models in hundreds of training steps. The method exploits three observations: only certain heads require full attention, long-range retrieval uses a 16D subspace, and token selection is query-dependent. Results: 9.36× prefill speedup at 1M context and 2.01× decode speedup, with accuracy preserved.

Reasoning · Benchmarks · Infrastructure
SIG 78 · HYP 25