Sharing my 'Local-LLM-Toolkit' repo
The 'Local-LLM-Toolkit' GitHub repo documents optimization techniques for running local LLMs on a Mac Studio M4 Max with 128GB. It includes C and Swift code for performance improvements.
Call for papers for the U&ME (Unlearning & Model Editing) workshop at ECCV 2026. Organizers seek submissions on unlearning, model editing, model merging, compression, and lifelong learning. Work-in-progress and exploratory ideas welcome.
Gstack: 23 opinionated Claude Code tools configured from Garry Tan's setup, covering CEO, designer, engineering manager, release manager, doc engineer, and QA roles.
OpenBB is an open-source financial data platform for analysts, quants and AI agents. It provides unified access to market data through a GitHub-hosted repository.
TrendRadar is an AI-driven trend monitor aggregating multi-platform news via RSS with smart alerts. Filters by keywords, translates and analyzes articles via AI, supports MCP for natural language dialogue, Docker deployment with local/cloud data, integrations with WeChat/Feishu/DingTalk/Telegram/Slack.
George Hotz warns that AI coding agents will be "one of the most costly mistakes" in software development. After six months of testing, he concludes LLMs deliver fast prototypes but generate bugs that become increasingly hard to spot. His stance reflects deep divisions in the AI community over LLMs' role.
User reports 1000 tokens/s aggregate generation on Qwen 3.6 27B with V100s at batch 128, and 80 t/s for a single user (batch 1) without MTP. Prompt-processing throughput reaches 3000 t/s.
RAG4Outcome is a multimodal RAG framework for prognostic prediction in chronic osteomyelitis. It integrates PET-CT imaging reports, surgical records, and follow-up notes into a unified pipeline with domain-specific retrieval corpus and expert-guided prompting. Preliminary results on real-world cases demonstrate effectiveness and clinical alignment.
World Machine is a transformer-based generative world-modeling architecture with latent states for time series. It reduces the quadratic complexity of standard transformers and adapts to varying amounts of observed data. Validated on synthetic dataset Toy1D.
Theoretical paper proposing a Cognitive Kardashev Scale to quantify AI compute capacity civilisations could sustain. Based on four parameters (total power, cognition share, energy efficiency, brain reference), the study estimates current humanity at K≈0.73 (Type I). At Type I with 1% power allocation, each human would have access to one personal AI's worth of cognition.
Theoretical work adapting rare switching to linear bandits with Gaussian noise for privacy. Standard determinant-based rules fail due to loss of design matrix monotonicity. Proposed solution: generalized Rayleigh quotient-based rule, validated by Codex.
Foundation Protocol introduces a coordination layer for interacting autonomous agents. The system unifies agents, tools, resources, humans, and institutions through a graph structure, supports multi-agent collaboration and economic primitives (metering, settlement). Designed to integrate with existing protocols while ensuring auditability and accountability.
Hugging Face clarifies AI agent terminology: distinguishing harness (execution infrastructure), scaffold (coordination structure), and agent (autonomous system). Essential definitions to avoid confusion in the ecosystem.
MergeNB is a VS Code extension to resolve merge conflicts in Jupyter notebooks. Built as an alternative to nbdime, it features a web UI and will be expanded to work as a git mergetool this summer.
Simon Willison used Claude to recreate Mad House, a game from Usborne's "Creepy Computer Games" book (1983), as an interactive JavaScript/HTML version with a retro interface. UK publisher Usborne has released free PDFs of its 1980s computer books.
Memory now accounts for nearly two-thirds of AI chip component costs. This trend reflects growing bandwidth and storage requirements for increasingly large models.
JetBrains and Microsoft release official Kotlin support for Visual Studio Code in alpha. The extension provides code completion, navigation, and debugging for the Kotlin language.
DeepSeek releases Reasonix, a native coding agent optimized for high caching and low cost. The model leverages DeepSeek's reasoning capabilities with a specialized architecture for code generation tasks.
Study on the fragility of LLM agents in backend code generation. Adherence to the constraints imposed on the models degrades progressively, reducing their ability to respect technical specifications. A critical issue for production systems.
Developer builds cgo-free CUDA binding for Go using purego to load libcuda.so at runtime. Solves thread affinity issues with runtime.LockOSThread and channel-based executor. Early-stage weekend project adding multi-GPU and Graphs support. Repo: github.com/eitamring/gocudrv.
The non-MTP version of Qwen 3.6-35B plays DCSS (Dungeon Crawl Stone Soup, an open-source roguelike) effectively. Practical test in LM Studio on an RTX 5090: Minotaur character at level 5 with 47 HP and multiple enemies defeated. The MTP version produces malformed tool calls. Offered as an alternative benchmark to official scores.
Pi is an AI agent toolkit providing a coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, and vLLM pods support.
Sail is an Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.
Modrinth releases its complete monorepo on GitHub. The repository contains the full source code powering the Modrinth mod distribution platform.
Plano is an AI-native proxy and data plane for agentic applications, featuring built-in orchestration, safety, observability, and intelligent LLM routing.
Vibe-Kanban is an open-source tool that amplifies productivity of coding agents like Claude Code and Codex through a Kanban interface. Enables managing development tasks with AI agents.
Trigger.dev is a platform to build and deploy fully-managed AI agents and workflows. The trending GitHub project provides complete infrastructure for orchestrating autonomous agents in production.
Presenton is an open-source AI presentation generator with API, positioned as an alternative to Gamma, Beautiful AI, and Decktopus. The GitHub project offers an automated solution for creating slideshows.
Twenty is an open-source alternative to Salesforce designed for AI. The project gains traction on GitHub Trending, positioning open-source CRMs as viable competitors to proprietary solutions.
Onyx is an open-source AI platform for chat supporting multiple LLMs with advanced features. Available on GitHub, it enables integration with various language models.
Qwen 3.6-35B quantized in GGUF and Safetensors, tested on Beelink GTR9 Pro with 200k token context. No glitches, loops, or repeated tool calls observed. MTP support, uncensored. APEX quantizations recommended.
User runs accounting tasks (monthly closes, bank reconciliations) with Qwen 3.6 27B locally, integrated with Claude and Anthropic's financial-services repo. Despite limited GPU, the model delivers reliable results, demonstrating growing maturity of local LLMs for professional use cases.
Polsia raised $30M but a source reveals questionable practices: fake ARR, inactive users counted, unauthorized admin access to customer accounts.
Three new Linux vulnerabilities (Dirty Frag, Copy Fail, Fragnesia) expose a worrisome trend of kernel security flaws. These bugs affect memory management and fragmentation, leaving systems exposed to critical exploits.
Top 10 fastest-growing AI repos: codegraph (+14.1K stars) for local code knowledge graphs, openhuman (+17.1K) for personal AI, academic-research-skills (+11.6K) for Claude Code, plus agent memory, multilingual TTS, stealth browser automation, and agentic video generation tools.
Texas sues Meta and WhatsApp for misleading claims about encryption and privacy. The lawsuit challenges marketing statements regarding user data protection.
Multica is an open-source managed agents platform. It turns coding agents into real teammates: assign tasks, track progress, compound skills.
grpc-rust: native gRPC client & server implementation with async/await support. Open-source Rust project.