May 2026

3149 articles

SAP taps Mistral AI to help customers migrate legacy software

SAP partners with Mistral AI to simplify customer migration to S/4HANA. Mistral AI models help streamline the legacy software migration process.

Mistral Business

SIG

HYP

Reddit r/LocalLLaMA·May 21

Heretic has been served a legal notice by Meta, Inc.

Heretic Free Software Project received a legal notice from Meta regarding Llama model derivatives. The project removed model weights from controlled repositories and is diversifying infrastructure with mirrors on Codeberg and other platforms to preserve access independently of service providers.

Llama Meta AI Open source

SIG

HYP

Reddit r/LocalLLaMA·May 21

Honesty in a small model drops from 35% to 0% by changing the tone of the prompt. Sharing the findings.

A paper published on arXiv shows honesty in small open-source models drops from 35% to 0% by changing prompt tone. When asked to solve mathematically impossible coding problems, models admit impossibility 33% of the time in neutral language but 0% under pressure. Internal analysis reveals each tone leaves a distinct signature in the network's deepest layers.

Papers Alignment AI safety

SIG

HYP

Reddit r/LocalLLaMA·May 21

LlamaStation v0.9 — llama.cpp GUI for Windows with multi-backend support, TurboQuant, MTP and more

LlamaStation v0.9 is a Windows GUI for llama.cpp with multi-backend support (TurboQuant, MTP, AtomicChat, BeeLlama). Runs llama-server directly without intermediate layer, provides full parameter control, real-time VRAM metering, per-model profiles, offline voice mode (XTTS v2 + faster-whisper), headless mode, and auto-updates.

Llama Tools Open source

SIG

HYP

Reddit r/LocalLLaMA·May 21

LLM planner - pick a rig for your use-case/model/budget, or pick models for your rig. 60+ builds, 50+ models, 130+ cited t/s sources, 150+ reviewer YouTube videos, idle+active watts, multi-region prices, regular updates.

LLM Planner is an interactive guide to match hardware or open-weights models. 60+ build configs, 50+ models, sourced tokens/sec, power draw, multi-region pricing, 150+ reviewer YouTube videos. Bidirectional modes: "which rig for this model/budget" or "what models run on my GPU". Data updated weekly, public GitHub repo.

Open source Tools Benchmarks

SIG

HYP

Le Big Data·May 21

Qwen3.7 Max : l’IA d’Alibaba écrase ses anciens scores sur les benchmarks IA

Alibaba's Qwen 3.7 Max improves performance by 4.8 points compared to Qwen 3.6 Max preview on AI benchmarks.

Qwen Benchmarks

SIG

HYP

Hacker News (AI)·May 21

Anthropic to open Milan office, expanding push into Europe

Anthropic opens Milan office to strengthen its European presence. The expansion marks the company's commitment to the European market.

Anthropic Business

SIG

HYP

Hacker News (AI)·May 21

Gemini randomly dumped its system prompt

Google Gemini accidentally exposed its system prompt during a user interaction. The incident reveals the model's internal instructions and raises questions about system prompt security.

Gemini AI safety Prompt engineering

SIG

HYP

Le Big Data·May 21

L’IA, la donnée et le piège de la vitesse : quand l’efficacité néglige la fiabilité

A dbt Labs study reveals that the race for speed in AI sacrifices data reliability. Organizations prioritize immediate efficiency over quality and trust in data pipelines.

RAG Infrastructure AI safety

SIG

HYP

Le Big Data·May 21

Jensen Huang identifie un nouveau marché IA à 200 milliards $ pour Nvidia

Jensen Huang identifies a $200 billion market for agentic AI. Nvidia launches Vera, a processor dedicated to AI agents, to address this segment.

AI Agents

SIG

HYP

Reddit r/LocalLLaMA·May 21

I did what Microsoft wouldn't - updated POML VS Code extension

A developer updated Microsoft's abandoned POML VS Code extension. POML is a markup language for creating modular prompt templates with local AI support. Microsoft dropped support after 2-3 months; a dependency update broke direct LLM sending. The developer used OpenCode to fix the bug and modernize dependencies.

Prompt engineering Tools Open source

SIG

HYP

Reddit r/LocalLLaMA·May 21

Tencent Hy 30B/7B/1.8B

Tencent releases Hy-MT2, a multilingual translation model family in three sizes (1.8B, 7B, 30B-MoE) supporting 33 languages. The 1.8B model compressed to 440 MB via 1.25-bit quantization outperforms commercial APIs from Microsoft and Doubao. The 7B and 30B variants exceed DeepSeek-V4-Pro and Kimi K2.6 performance. Includes IFMTBench benchmark and WMT26 partnership.

Code generation Benchmarks Open source

SIG

HYP

OpenAI Blog·May 21

AdventHealth advances whole-person care with OpenAI

AdventHealth deploys ChatGPT for Healthcare to streamline clinical workflows, reduce administrative burden, and free up time for patient care.

OpenAI Business

SIG

HYP

Hacker News (AI)·May 21

CPPL: A Circuit Prompt Programming Language

CPPL is a circuit-based prompt programming language enabling structured instruction composition through logical operators and control flow. It provides an alternative to traditional text-based prompting for complex AI interactions.

Prompt engineering Tools

SIG

HYP

Le Big Data·May 21

Nexos.ai : on a testé l’outil qui veut convaincre votre DSI que l’IA n’est pas une passoire

Nexos.ai offers an AI security tool for CISOs to mitigate risks from enterprise AI usage. The article tests the solution against governance and AI usage control challenges in 2026.

AI safety Business Tools

SIG

HYP

Le Big Data·May 21

Anthropic pourrait dépenser 1,25 milliard $ par mois sur l’infrastructure xAI

Anthropic could spend up to $1.25 billion per month with xAI for infrastructure through 2029. This contract represents a major commitment from Anthropic to Elon Musk's platform.

Anthropic Infrastructure

SIG

HYP

Reddit r/LocalLLaMA·May 21

110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp

ik_llama.cpp outperforms llama.cpp on RTX 4070 Super 12GB: 110 tok/s average vs 90.6 tok/s with Qwen3.6-35B-A3B-IQ4_XS. Better CPU offloading optimization and speculative decoding (MTP) after llama.cpp performance regression post-merge.

Qwen Open source Infrastructure

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> dotnet /</span> skills

GitHub repository providing skills to assist AI coding agents with .NET and C#. Resources for integrating .NET development capabilities into autonomous agent workflows.

AI Agents Code generation Open source

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> ryoppippi /</span> ccusage

ccusage is a CLI tool to analyze token usage and costs from coding agents using local data.

AI Agents Code generation Tools

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> kata-containers /</span> kata-containers

Kata Containers is an open source project building lightweight Virtual Machines that provide container-like performance with VM-level workload isolation and security.

Open source Infrastructure

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> DataDog /</span> pup

Datadog launches Pup, a CLI companion for AI agents with 200+ commands across 33+ Datadog products.

AI Agents Tools Infrastructure

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> google-gemini /</span> gemini-cli

Open-source tool integrating Gemini directly into the terminal. AI agent enabling interaction with Google's model via CLI.

Gemini AI Agents Tools

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> ChromeDevTools /</span> chrome-devtools-mcp

Chrome DevTools MCP integrates Chrome's developer tools into a Model Context Protocol interface for coding agents. Enables agents to inspect, debug, and interact with web pages in real-time.

AI Agents MCP Code generation

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> software-mansion /</span> argent

Argent is an agentic toolkit to control, debug, and profile iOS and Android apps. Built by Software Mansion.

AI Agents Tools Open source

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> google-labs-code /</span> stitch-skills

Stitch-Skills is a library of Agent Skills designed for the Stitch MCP server. Each skill follows the open Agent Skills standard, compatible with Claude Code, Gemini CLI, Cursor, and Antigravity.

AI Agents MCP Claude Code

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> google /</span> adk-samples

Google releases adk-samples, a collection of sample agents built with Agent Development Kit (ADK). Open-source repository to explore agent development capabilities.

AI Agents DeepMind Open source

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> antoinezambelli /</span> forge

Forge is a Python framework for self-hosted LLM tool-calling and multi-step agentic workflows. Available as open-source on GitHub.

AI Agents Multi-agent Open source

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> teng-lin /</span> notebooklm-py

Unofficial Python API for Google NotebookLM providing full programmatic access to features, including those not exposed in web UI. Supports CLI and integration with AI agents (Claude Code, Codex, OpenClaw).

DeepMind AI Agents Code generation

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> aiming-lab /</span> AutoResearchClaw

AutoResearchClaw automates end-to-end research: idea generation, experiments, writing, and paper publication without human intervention. Fully autonomous and self-evolving AI agent system.

AI Agents Multi-agent Papers

SIG

HYP

GitHub Trending·May 21

<svg aria-hidden="true" data-component="Octicon" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-repo mr-1 tmp-mr-1 color-fg-muted"> <path d="M2 2.5A2.5 2.5 0 0 1 4.5 0h8.75a.75.75 0 0 1 .75.75v12.5a.75.75 0 0 1-.75.75h-2.5a.75.75 0 0 1 0-1.5h1.75v-2h-8a1 1 0 0 0-.714 1.7.75.75 0 1 1-1.072 1.05A2.495 2.495 0 0 1 2 11.5Zm10.5-1h-8a1 1 0 0 0-1 1v6.708A2.486 2.486 0 0 1 4.5 9h8ZM5 12.25a.25.25 0 0 1 .25-.25h3.5a.25.25 0 0 1 .25.25v3.25a.25.25 0 0 1-.4.2l-1.45-1.087a.249.249 0 0 0-.3 0L5.4 15.7a.25.25 0 0 1-.4-.2Z"></path> </svg> <span data-view-component="true" class="text-normal"> openai /</span> whisper

OpenAI Whisper is a speech recognition model trained on 680,000 hours of multilingual weakly supervised data. The GitHub repository includes code, pre-trained models, and performance benchmarks across multiple languages and acoustic conditions.

OpenAI Voice Open source

SIG

HYP

Reddit r/LocalLLaMA·May 21

'Am I OpenAI compatible' - a tool and documentation for unified api signatures in open source AI.

Tool and documentation to check OpenAI compatibility of open-source projects (vLLM, llama.cpp, etc.). Documents official and unofficial signatures, with extensions for other model types. Useful for integrating LLM endpoints into applications or building proxies/middleware.

Open source Tools Infrastructure

SIG

HYP

Hacker News (AI)·May 21

Google officially announces that ads will be included in AI Mode search results

Google officially announces ads will be integrated into AI Mode search results. This monetization of generative search responses marks a strategic shift for the tech giant amid competition from chatbots.

DeepMind Business

SIG

HYP

Le Big Data·May 21

Free, Orange et EDF s’allient pour créer une AI Gigafactory en France

Free, Orange, EDF and major French digital players join forces to build an AI Gigafactory in France. Initiative aimed at developing computing capacity and domestic AI infrastructure.

Infrastructure Business

SIG

HYP

Vercel AI Blog·May 21

Pull anomaly alert details using the Vercel CLI

Vercel adds anomaly alert access via CLI with `vercel alerts` command. The `--ai` option displays AI investigation results for each alert. Available on Observability Plus.

Tools AI Agents Infrastructure

SIG

HYP

Hacker News (AI)·May 21

The famous O3 "GeoGuessr" prompt did not work

The viral O3 'GeoGuessr' prompt fails to deliver promised results. Testing reveals the widely-shared technique does not work as claimed on OpenAI's model.

OpenAI Prompt engineering Evals

SIG

HYP

Reddit r/LocalLLaMA·May 21

One Night Werewolf played by LLMs

A user built a custom UI to play One Night Werewolf with LLMs (Gemma 31B/26B, Qwen 3.6 36B, 27B model). Models initially struggled accepting identity swaps; goal-oriented prompting improved performance. A runner script compatible with OpenAI API now enables gameplay without tool-call requirements.

AI Agents Prompt engineering Open source

SIG

HYP

Reddit r/LocalLLaMA·May 21

AMD Powers Next-Generation Agent Computers with New Ryzen AI Halo Developer Platform and Ryzen AI Max PRO 400 Series Processors

AMD launches Ryzen AI Halo Developer Platform and Ryzen AI Max PRO 400 Series processors for next-generation agent computers. Official announcement detailing availability of Halo Box and AI 400 series.

AI Agents Infrastructure

SIG

HYP

Le Big Data·May 21

Mistral AI se renforce dans l’industrie européenne avec le rachat de Emmi AI

Mistral AI acquires Austrian startup Emmi AI to strengthen its presence in European industry. This acquisition accelerates the French group's expansion strategy in the continental market.

Mistral Business

SIG

HYP

Latent Space·May 21

[AINews] OpenAI GPT-next disproves 80 year old Erdős planar unit distance problem for under $1000

OpenAI GPT-next solved the 80-year-old Erdős planar unit distance problem for under $1000. Significant result at the intersection of AI and mathematics.

GPT OpenAI Reasoning

SIG

HYP

Le Big Data·May 21

Universal Cart : Comment Google compte enfin court-circuiter Amazon

Google launches Universal Cart, a shopping experience powered by Gemini, to compete with Amazon. The platform unifies shopping across Google's services.

Gemini Business

SIG

HYP

Vercel AI Blog·May 21

Qwen 3.7 Max now available on Vercel AI Gateway

Qwen 3.7 Max from Alibaba is now available on Vercel AI Gateway. The model, designed as an agent foundation, excels at frontend prototyping, multi-file engineering, and office workflow automation through multi-agent orchestration.

Qwen AI Agents Multi-agent

SIG

HYP

Reddit r/LocalLLaMA·May 21

Model Golf for some Runpod Credits!

CompactAI-O launches monthly 'Model Golf' competition for models under 100M parameters. Winner receives $50 RunPod credits monthly. Open competition for builders.

Open source Tools Benchmarks

SIG

HYP

Le Big Data·May 21

Ask YouTube et Ask Maps : La fin de la recherche par mots-clés est-elle actée ?

Google launches Ask YouTube and Ask Maps, conversational AI-powered search tools. These features gradually replace traditional keyword-based search with AI-generated answers.

DeepMind AI Agents

SIG

HYP

Reddit r/MachineLearning·May 21

High E2E latency on fine-tuned Gemma 4 26B despite low TTFT [R]

User reports high E2E latency (3-5s) on fine-tuned Gemma 4 26B despite low TTFT (100-300ms) on H100 with vLLM and FP8 quantization. Exploring optimizations: speculative decoding (EAGLE/Medusa), draft models, or bottleneck investigation.

Gemini Fine-tuning Infrastructure

SIG

HYP

Reddit r/LocalLLaMA·May 21

Qwen3.6 27B and llama.cpp appreciation post

User praises Qwen3.6 27B quantized Q5_K_XL on llama.cpp with dual RX 9070 XT GPUs. Model excels at debugging complex code (distributed backend services), achieving 398 tokens/s prompt eval and 46.9 tokens/s generation. Strong agentic capabilities despite low quantization.

Qwen Code generation AI Agents

SIG

HYP

Reddit r/LocalLLaMA·May 21

Same task in github-copilot, pi, claude-code, and opencode with Qwen3.6 27B

Empirical comparison of four coding agent harnesses (GitHub Copilot, Pi, Claude Code, OpenCode) with Qwen 3.6 27B on identical tasks. Qwen excels with Claude Code and OpenCode (4 requests to create pelican.svg) but struggles with GitHub Copilot (13 requests). OpenCode provides internet search and interactive widget generation.

Code generation AI Agents Qwen

SIG

HYP

Le Big Data·May 21

IA et performance : le verdict de l’indice mondial Fivetran

Fivetran releases a global index showing that despite massive budgets (tens of millions of euros), deploying agentic AI faces significant performance obstacles.

AI Agents Benchmarks Business

SIG

HYP

Le Big Data·May 21

LinkedIn : fin des posts qui puent l’IA, le grand ménage a commencé

LinkedIn fights AI-generated posts by detecting and reducing their visibility. The platform strengthens filters to limit auto-generated content and artificial motivational phrases.

AI safety Regulation

SIG

HYP

Reddit r/LocalLLaMA·May 21

Training a vision model from scratch on iPod touch 4 images

A user trains a DCGAN model from scratch on 350 images of a red Solo cup taken with an iPod touch 4 under varying lighting and backgrounds. Goal: capture sensor-specific artifacts from the device. Generated images resemble DALL-E 2022 output.

Image generation Open source

SIG

HYP

Reddit r/MachineLearning·May 21

Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL [R]

Masked diffusion language models (MDLMs) outperform autoregressive LLMs as world models for agentic RL. Fine-tuned SDAR-8B and WeDLM-8B achieve 4x gains on BLEU-1/ROUGE-L/MAUVE. GRPO training yields +15% absolute task-success on ScienceWorld, ALFWorld, AppWorld with Qwen3, Mistral, LFM2.5 in zero-shot transfer.

AI Agents Reinforcement learning Reasoning

SIG

HYP

arXiv cs.LG·May 21

Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs

CSA (Conformal Selective Acting) is a deployment wrapper for RLVR-fine-tuned LLMs guaranteeing per-round risk control without pooling across deployments. Tested on 480 specialist streams and 10,300 Expert-Iteration rounds with LoRA, CSA maintains a Ville e-process per threshold and achieves selective-risk bound R_T^act ≤ α+O(N_T^{-1/2}) with anytime pathwise validity.

Reinforcement learning AI safety Evals

SIG

HYP

May 2026

SAP taps Mistral AI to help customers migrate legacy software

Heretic has been served a legal notice by Meta, Inc.

Honesty in a small model drops from 35% to 0% by changing the tone of the prompt. Sharing the findings.

LlamaStation v0.9 — llama.cpp GUI for Windows with multi-backend support, TurboQuant, MTP and more

LLM planner - pick a rig for your use-case/model/budget, or pick models for your rig. 60+ builds, 50+ models, 130+ cited t/s sources, 150+ reviewer YouTube videos, idle+active watts, multi-region prices, regular updates.

Qwen3.7 Max : l’IA d’Alibaba écrase ses anciens scores sur les benchmarks IA

Anthropic to open Milan office, expanding push into Europe

Gemini randomly dumped its system prompt

L’IA, la donnée et le piège de la vitesse : quand l’efficacité néglige la fiabilité

Jensen Huang identifie un nouveau marché IA à 200 milliards $ pour Nvidia

I did what Microsoft wouldn't - updated POML VS Code extension

Tencent Hy 30B/7B/1.8B

AdventHealth advances whole-person care with OpenAI

CPPL: A Circuit Prompt Programming Language

Nexos.ai : on a testé l’outil qui veut convaincre votre DSI que l’IA n’est pas une passoire

Anthropic pourrait dépenser 1,25 milliard $ par mois sur l’infrastructure xAI

110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp

'Am I OpenAI compatible' - a tool and documentation for unified api signatures in open source AI.

Google officially announces that ads will be included in AI Mode search results

Free, Orange et EDF s’allient pour créer une AI Gigafactory en France

Pull anomaly alert details using the Vercel CLI

The famous O3 "GeoGuessr" prompt did not work

One Night Werewolf played by LLMs

AMD Powers Next-Generation Agent Computers with New Ryzen AI Halo Developer Platform and Ryzen AI Max PRO 400 Series Processors

Mistral AI se renforce dans l’industrie européenne avec le rachat de Emmi AI

[AINews] OpenAI GPT-next disproves 80 year old Erdős planar unit distance problem for under $1000

Universal Cart : Comment Google compte enfin court-circuiter Amazon

Qwen 3.7 Max now available on Vercel AI Gateway

Model Golf for some Runpod Credits!

Ask YouTube et Ask Maps : La fin de la recherche par mots-clés est-elle actée ?

High E2E latency on fine-tuned Gemma 4 26B despite low TTFT [R]

Qwen3.6 27B and llama.cpp appreciation post

Same task in github-copilot, pi, claude-code, and opencode with Qwen3.6 27B

IA et performance : le verdict de l’indice mondial Fivetran

LinkedIn : fin des posts qui puent l’IA, le grand ménage a commencé

Training a vision model from scratch on iPod touch 4 images

Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL [R]

Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs

Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning

Shiny Stories, Hidden Struggles: Investigating the Representation of Disability Through the Lens of LLMs

Pseudo-Siamese Network for Planning in Target-Oriented Proactive Dialogues

Refining and Reusing Annotation Guidelines for LLM Annotation

Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs

Do as I Say, Not as I Do: Instruction-Induction Conflict in LLMs

Puzzled By ChatGPT? No more! A Jigsaw Puzzle to Promote AI Literacy and Awareness

SCRIBE: Diagnostic Evaluation and Rich Transcription Models for Indic ASR

Mechanics of Bias and Reasoning: Interpreting the Impact of Chain-of-Thought Prompting on Gender Bias in LLMs

When Irregularity Helps: A Subclass Analysis of Inductive Bias in Neural Morphology

Direct Translation between Sign Languages

HRM-Text: Efficient Pretraining Beyond Scaling

Retrieval-Augmented Long-Context Translation for Cultural Image Captioning: Gators submission for AmericasNLP 2026 shared task

On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

DIVE: Embedding Compression via Self-Limiting Gradient Updates

Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning

The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study

Assessing socio-economic climate impacts from text data

Generative Recursive Reasoning

Neural Estimation of Pairwise Mutual Information in Masked Discrete Sequence Models

Geometry-Lite: Interpretable Safety Probing via Layer-Wise Margin Geometry

GROW: Aligning GRPO with State-Action Modeling for Open-World VLM Agents

Graph Transductive Sharpening: Leveraging Unlabeled Predictions in Node Classification

Physics-informed convolutional neural networks for fluid flow through porous media

Instance Discrimination for Link Prediction

Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding

Smaller Abstract State Spaces Enable Cross-Scale Generalization in Reinforcement Learning

OmniISR: A Unified Framework for Centralized and Federated Learning via Intermediate Supervision and Regularization

Plug-and-Play Spiking Operators: Breaking the Nonlinearity Bottleneck in Spiking Transformers

Closed-form predictive coding via hierarchical Gaussian filters

Less Data, Faster Training: repeating smaller datasets speeds up learning via sampling biases

Leveraging Large Language Models for Sentiment Analysis: Multi-Modal Analysis of Decentraland's MANA Token

Improving Quantized Model Performance in Qualitative Analysis with Multi-Pass Prompt Verification

Parallel LLM Reasoning for Bias-Resilient, Robust Conceptual Abstraction

Data Scaling as Progressive Coverage of a Predictive Contribution Spectrum

MedicalBench: Evaluating Large Language Models Toward Improved Medical Concept Extraction

FlowLM: Few-Step Language Modeling via Diffusion-to-Flow Adaptation

Synchronization and Turn-Taking in Full-Duplex Speech Dialogue Models

When Reasoning Supervision Hurts: TTCW-Based Long-Form Literary Review Generation

DEL: Digit Entropy Loss for Numerical Learning of Large Language Models

Stage-Audit: Auditable Source-Frontier Discovery for Cross-Wiki Tables