May 2026

Llama Code generation Benchmarks

llama.cpp MTP support landed - Qwen3.6 27B at 2.44× on a Strix Halo, 2.17× on an RTX 3090 rig

Support for MTP (multi-token prediction, used here for speculative decoding) was merged into llama.cpp (PR #22673, May 16). Qwen 3.6 27B benchmarks: 1.81×–2.44× speedup on Strix Halo (ROCm), 1.54×–2.17× on RTX 3090. The MoE 35B-A3B model shows smaller gains (1.24×–1.40×). Enable with `--spec-type draft-mtp --spec-draft-n-max N`.
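As a rough sketch of how the flags quoted above might be wired into a launch script, the Python below assembles an argument list. Only `--spec-type draft-mtp` and `--spec-draft-n-max` come from the summary. The `llama-server` binary name, the `-m` model flag, the model filename, and the helper name `build_mtp_args` are assumptions for illustration and should be checked against the llama.cpp build in use.

```python
from typing import List


def build_mtp_args(model_path: str, draft_n_max: int,
                   binary: str = "llama-server") -> List[str]:
    """Assemble a llama.cpp command line with MTP speculative decoding enabled.

    Only --spec-type and --spec-draft-n-max are taken from the summary;
    the binary name and the -m flag are assumptions about a typical setup.
    """
    if draft_n_max < 1:
        raise ValueError("draft_n_max must be a positive integer")
    return [
        binary,
        "-m", model_path,                        # assumed model flag
        "--spec-type", "draft-mtp",              # from the summary
        "--spec-draft-n-max", str(draft_n_max),  # from the summary
    ]


if __name__ == "__main__":
    # Hypothetical model filename, for illustration only.
    print(" ".join(build_mtp_args("qwen3.6-27b.gguf", draft_n_max=3)))
```

The draft length is left as a parameter because the summary reports a range of speedups rather than a single best value of N.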

Hacker News (AI)·May 18

Agora-1: The Multi-Agent World Model

Agora-1 is a multi-agent world model capable of simulating complex interactions between multiple agents. The system generates emergent behaviors and realistic dynamics in virtual environments.

Multi-agent Reasoning Papers

Google DeepMind·May 18

Fast-tracking genetic leads to reverse cellular aging

Google DeepMind uses Co-Scientist, an AI agent, to identify genetic factors that successfully rejuvenate human cells. Researchers discovered novel genes involved in cellular aging processes.

DeepMind AI Agents Papers

Hacker News (AI)·May 18

We let AIs run radio stations

Researchers let AI systems autonomously operate radio stations in real-time. The experiment tests models' ability to make independent decisions, manage content, and interact with listeners in a dynamic, uncontrolled environment.

AI Agents Reasoning

Hacker News (AI)·May 18

Elon Musk has lost his lawsuit against Sam Altman and OpenAI

Elon Musk lost his lawsuit against Sam Altman and OpenAI. The ruling dismisses Musk's claims regarding OpenAI's transition to a for-profit structure.

OpenAI

The Decoder·May 18

Cursor's Composer 2.5 matches Opus 4.7 and GPT-5.5 benchmarks at a fraction of the cost

Cursor releases Composer 2.5, a coding model built on Kimi K2.5 and trained on 25x more synthetic tasks than its predecessor. It matches Opus 4.7 and GPT-5.5 benchmark performance at a fraction of the cost.

Code generation Benchmarks Kimi

Hacker News (AI)·May 18

Anthropic acquires Stainless

Anthropic acquires Stainless, a startup focused on SDK generation and developer tools. The acquisition strengthens Anthropic's infrastructure and tooling capabilities for developers using Claude.

Anthropic Claude Tools

Hugging Face Blog·May 18

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

Hugging Face releases a guide for fine-tuning NVIDIA Cosmos Predict 2.5, a robot video generation model, using LoRA/DoRA. The method reduces GPU resource requirements while maintaining generation quality for specialized robotics use cases.

Fine-tuning Video generation Robotics

Hacker News (AI)·May 18

Show HN: InsForge – Open-source Heroku for coding agents

InsForge is an open-source Heroku-like platform for deploying and managing coding agents. It streamlines agent orchestration in production with built-in infrastructure and monitoring.

AI Agents Code generation Open source

Hacker News (AI)·May 18

We stopped AI bot spam in our GitHub repo using Git's --author flag

A team blocked AI bot spam in their GitHub repository by leveraging Git's --author flag to filter suspicious commits. Simple but effective technique against unwanted automated contributions.

Open source Tools

Hugging Face Blog·May 18

PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

PaddleOCR 3.5 integrates a Transformers backend for OCR and document parsing tasks. The new version improves accuracy and flexibility by leveraging Transformers models, enabling better text recognition and structured data extraction.

Open source Vision Tools

Hugging Face Blog·May 18

The Open Agent Leaderboard

Hugging Face launches a public leaderboard to evaluate open-source AI agents. The platform ranks models by their ability to complete complex tasks, with reproducible benchmarks and transparent results.

AI Agents Benchmarks Open source

Hacker News (AI)·May 18

When Fast Fourier Transform Meets Transformer for Image Restoration

Research combining Fast Fourier Transform (FFT) with Transformer architectures for image restoration. Hybrid approach leveraging frequency-domain processing and attention mechanisms to improve reconstruction quality.

Vision Papers

Latent Space·May 18

The Next War Is Already Here. The West Isn't Ready. — Yaroslav Azhnyuk, The Fourth Law & Guest Host Noah Smith, Noahpinion

Ukrainian drone founder Yaroslav Azhnyuk transitioned from pet cameras to AI-guided weapons. With Noah Smith, he argues the West is unprepared for the ongoing technological war.

AI Agents AI safety Regulation

Reddit r/MachineLearning·May 18

Sub-JEPA: a simple fix to LeCun group's LeWorldModel that consistently improves performance [P]

Sub-JEPA improves LeWorldModel (LeCun's group, NYU) by applying Gaussian regularization within frozen random orthogonal subspaces instead of globally. Gains up to +10.7 pp on Two-Room, straighter latent trajectories, better physical state decodability. Code and paper released.

Reasoning Papers Benchmarks

The Decoder·May 18

A Stanford student reflects on his ChatGPT class and a culture of "just a little bit of fraud"

A Stanford student describes how ChatGPT transformed an existing culture of academic dishonesty into the default norm in his graduating class. AI amplified fraud practices already present at the elite university.

GPT AI safety Regulation

Reddit r/MachineLearning·May 18

Reviving PapersWithCode (by Hugging Face) [P]

Hugging Face revives PapersWithCode using AI agents to automatically parse papers and generate leaderboards. Site features trending papers, domain categorization, eval results (Qwen 3.5, RF-DETR, DINOv3), leaderboards (MMTEB, COCO), citation counts, linked GitHub repos, and external paper support (DeepSeek v4). Live at paperswithcode.co.

AI Agents Benchmarks Open source

Reddit r/MachineLearning·May 18

Scaling LLMs horizontally: hidden-state coupling without weight modification [R]

Residual Coupling (RC) connects frozen language models in parallel via lightweight learned linear projections, without weight modification. Linear bridges read hidden states from one model and inject additive updates into another's residual stream. On medical data, RC reduces perplexity to 11.02 versus 56.80 for MoE (reported as an 80.7% improvement) and improves TruthfulQA by 9.1 percentage points.

Llama Multi-agent Fine-tuning
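The coupling step described in the Residual Coupling summary can be illustrated with a minimal sketch: a linear projection of one model's hidden state is added to another model's residual stream. This is not the authors' code. The function names, the toy dimensions, the zero-initialised bridge, and the hand-picked weights are all assumptions for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply a (d_out x d_in) matrix by a length-d_in vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def couple(h_target: Vector, h_source: Vector, bridge: Matrix) -> Vector:
    """Add a linear projection of the source hidden state to the target's
    residual stream. Both models stay frozen; only `bridge` would be trained."""
    update = matvec(bridge, h_source)
    return [t + u for t, u in zip(h_target, update)]


if __name__ == "__main__":
    h_a = [1.0, 2.0, 3.0]   # hidden state read from model A (toy size 3)
    h_b = [0.5, -0.5]       # residual stream of model B (toy size 2)

    # Zero-initialised bridge: coupling starts as a no-op (an assumption here).
    zero_bridge = [[0.0] * 3 for _ in range(2)]
    print(couple(h_b, h_a, zero_bridge))

    # Hand-picked weights standing in for a learned projection.
    bridge = [[0.1, 0.0, 0.0],
              [0.0, 0.0, 0.5]]
    print(couple(h_b, h_a, bridge))
```

Because the update is purely additive, removing the bridge recovers the original model's behaviour exactly, which is consistent with the "without weight modification" claim.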

Benchmarks AI safety Alignment

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

DystopiaBench tests 42 LLMs (open and closed-source) on their ability to refuse progressively normalized dangerous requests. 6 dystopia categories (autonomous weapons, surveillance, behavioral control, etc.) with 5 escalation levels. Finding: models detect obvious harmful requests but fail against requests hidden behind dual-use and normalization. Open-source benchmark available.

Hacker News (AI)·May 18

AI eats the world (Spring 26) [pdf]

Analysis report on AI penetration across economic and technology sectors in Spring 2026. PDF document synthesizing trends, adoption rates, and measurable impacts of generative and specialized AI.

Benchmarks Business

The Decoder·May 18

MAGA-aligned groups want government oversight of frontier AI models

A coalition of conservative organizations led by Humans First calls on President Trump to issue an executive order mandating safety testing for frontier AI models before deployment.

Regulation AI safety

The Decoder·May 18

Anthropic to brief global financial regulators on cyber flaws found by Claude Mythos

Anthropic will brief leading finance ministries and central banks on cyber vulnerabilities in the global financial system's defenses uncovered by its Claude Mythos Preview model.

Claude AI safety Regulation

Hacker News (AI)·May 18

Voice AI Systems Are Vulnerable to Hidden Audio Attacks

Researchers demonstrate that voice AI systems are vulnerable to hidden audio attacks (adversarial examples). These inaudible attacks can fool models and compromise the security of voice assistants.

AI safety Voice

Qwen Code generation Benchmarks

Qwen 3.6 27B on 24GB VRAM setup: backend comparisons, quant choice and settings (llama.cpp, ik_llama.cpp, BeeLlama, vllm)

Detailed benchmark of Qwen 3.6 27B on RTX 3090 24GB. ik_llama.cpp outperforms llama.cpp and BeeLlama with 1261 tok/s prefill and 72.9 tok/s decode on 156k context. Optimal setup: IQ4_KS quantization, multi-token prediction, flash attention.

OpenAI Blog·May 18

OpenAI and Dell partner to bring Codex to hybrid and on-premise enterprise environments

OpenAI and Dell partner to deploy Codex in hybrid and on-premise enterprise environments. The partnership enables organizations to securely launch AI coding agents across proprietary data and workflows.

OpenAI Claude Code AI Agents

The Decoder·May 18

AI startup revenue hits $80 billion, but Anthropic and OpenAI take almost all of it

AI startups generated $80 billion in revenue, but Anthropic and OpenAI capture 89% of it. Market concentration remains extreme among top players.

Anthropic OpenAI Business

AI Agents Code generation Open source

I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how

SmallCode, a local coding agent, achieves 87% on benchmarks with Gemma 4B using compound tools, iterative improvement loops, and optimized context management. Unlike existing agents (OpenCode, Cursor, Claude Code) requiring large models, SmallCode is designed for small local models with optional escalation to Claude/OpenAI.

Hacker News (AI)·May 17

Long-term editing of brain circuits using an engineered electrical synapse

Researchers developed an engineered electrical synapse enabling long-term editing of brain circuits. The approach uses modified gap junctions to durably control neuronal activity without repeated intervention.

AI safety

Hacker News (AI)·May 17

Autoregressive next token prediction and KV Cache in transformers

Technical article on autoregressive next token prediction and KV Cache mechanism in transformers. Explains fundamentals of language model inference.

Reasoning
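To make the KV cache idea in the summary above concrete, here is a small sketch of single-head attention during autoregressive decoding. It is not taken from the linked article. To stay short it assumes query, key, and value are all the raw token vector (no learned projections), and the class and function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Scaled dot-product attention of one query over all given positions."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class KVCache:
    """Keeps keys and values of past tokens so each step adds only one entry."""

    def __init__(self) -> None:
        self.keys: List[Vector] = []
        self.values: List[Vector] = []

    def step(self, query: Vector, key: Vector, value: Vector) -> Vector:
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)


def last_token_without_cache(tokens: List[Vector]) -> Vector:
    """Rebuild keys/values for the whole prefix, then attend from the last token."""
    keys = [list(t) for t in tokens]
    values = [list(t) for t in tokens]
    return attend(tokens[-1], keys, values)


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token vectors
    cache = KVCache()
    for tok in tokens:
        cached_out = cache.step(tok, tok, tok)
    full_out = last_token_without_cache(tokens)
    print(all(math.isclose(a, b) for a, b in zip(cached_out, full_out)))
```

The cached path does the same attention arithmetic for the newest token but never recomputes keys and values for earlier positions, which is where the inference savings come from.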

Google DeepMind·May 17

Simulate real-world places with Project Genie and Street View

Google DeepMind expands Project Genie access to Google AI Ultra subscribers globally and introduces a new Street View-powered capability for simulating real-world places.

DeepMind Video generation

Google DeepMind·May 17

Introducing Gemini Omni

Google DeepMind introduces Gemini Omni, a multimodal model processing text, audio, video, and images as native inputs and outputs. The model delivers ultra-low latency and improved performance on reasoning and vision benchmarks.

Gemini DeepMind Vision

Simon Willison·May 17

GDS weighs in on the NHS's decision to retreat from Open Source

The UK Government Digital Service (GDS) published guidance on May 14th recommending public sector organisations keep open source as default, implicitly criticising the NHS's decision to close repositories following vulnerabilities found via Project Glasswing. GDS argues closure increases costs and reduces reusability.

Open source AI safety

Google DeepMind·May 17

Gemini for Science: AI experiments and tools for a new era of discovery

Google DeepMind releases Gemini for Science, a suite of AI tools and experiments designed to accelerate scientific research by expanding the scale and precision of scientific exploration.

DeepMind Gemini Tools

Reddit r/MachineLearning·May 17

Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention [P]

Discussion of recent LLM architecture advances: KV sharing, mHC mechanisms, and compressed attention. Exploration of optimizations to reduce memory consumption and improve computational efficiency of language models.

Reasoning Infrastructure

Reddit r/MachineLearning·May 17

Program misleading high school students into paying to perform academic misconduct in ML Research [D]

A paid program (Algoverse AI Research) marketed to high school students mass-produces NeurIPS 2025 submissions (289 claimed acceptances) with obvious errors: duplicate results, abstracts contradicting findings, AI-generated citations, unreviewed datasets. Kevin Zhu, who is affiliated with the program, lists 158 publications and 468 coauthors on OpenReview.

Papers Evals Regulation

Interconnects (Nathan Lambert)·May 16

Latest open artifacts (#21): Open model bonanza! Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, GLM-5.1 & others. On CAISI's V4 assessment.

A busy month with multiple flagship releases: Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, GLM-5.1. Nathan Lambert also covers CAISI's V4 assessment.

Gemini DeepSeek Kimi

Google DeepMind·May 16

Strengthening Singapore’s AI Future: A New National Partnership

Google DeepMind partners with Singapore to deploy frontier AI addressing health, education, and sustainability challenges. National partnership to tackle complex problems.

DeepMind Business

Google DeepMind·May 16

Finding the molecular switches behind new infectious diseases

Clare Bryant uses Co-Scientist, a Google DeepMind AI tool, to identify genetic triggers in emerging infectious diseases. The approach combines computational analysis with biological expertise to accelerate discovery of molecular mechanisms.

DeepMind AI Agents Papers

Google DeepMind·May 16

Opening new paths in aging research

Calico Life Sciences uses Co-Scientist, a Google DeepMind AI tool, to connect scattered findings and generate new leads in aging research.

DeepMind AI Agents RAG

Google DeepMind·May 16

Accelerating discovery of liver disease mechanisms

Filippo Menolascina uses Google DeepMind's Co-Scientist to accelerate discovery of liver disease mechanisms and identify new treatments. The tool helps explain why existing drugs only work in certain patients.

DeepMind AI Agents Reasoning

Google DeepMind·May 16

Uniting biological toolkits for a new approach to ALS

Google DeepMind partners with Boston Children's Hospital and MIT to develop novel RNA-based treatments for ALS by combining biological toolkits and research approaches.

DeepMind AI safety

Google DeepMind·May 16

Uncovering repurposed medicines to fight liver fibrosis

A Stanford geneticist uses Google DeepMind's Co-Scientist to identify existing medicines that could treat liver fibrosis. The AI tool helps discover candidate molecules from approved treatments.

DeepMind AI Agents Tools

Latent Space·May 16

[AINews] Cerebras' $60B IPO: Slowly, then All at Once

Cerebras announces a $60B IPO. The AI chip specialist accelerates commercial expansion after years of technology development.

Infrastructure

Google DeepMind·May 16

How WeatherNext helped the National Hurricane Center better predict Hurricane Melissa’s historic landfall in Jamaica

Google DeepMind's WeatherNext AI model improved forecasts for Hurricane Melissa in Jamaica. The model gave National Hurricane Center forecasters additional lead time to warn communities ahead of the historic landfall.

DeepMind Benchmarks

OpenAI Blog·May 16

OpenAI and Malta partner to bring ChatGPT Plus to all citizens

OpenAI and Malta partner to provide ChatGPT Plus access to all citizens with training programs on practical AI skills and responsible use. No pricing, timeline, or implementation details disclosed.

Claude Tools Code generation

Simon Willison·May 15

inaturalist-clumper 0.1

Simon Willison releases inaturalist-clumper 0.1, an open-source tool to cluster and publish iNaturalist sightings on his blog. Running in production for several weeks.

Open source Tools

Simon Willison·May 15

QR code generator

Simon Willison built a QR code generator with Claude's help. The tool supports URLs, text, and WiFi connections (SSID, password, WPA/WPA2/WPA3 security). Style options include square shape, border, size, and custom color.

Simon Willison·May 15

datasette-llm-limits 0.1a0

Release of datasette-llm-limits 0.1a0, a plugin for Datasette enabling per-user or global spending limits for LLM usage. Supports daily limits with rolling windows and USD amounts.

Tools Open source

Simon Willison·May 15

datasette-agent 0.1a2

Release of datasette-agent 0.1a2 with permission system. Background agent tools now require the new `datasette-agent-background` permission. Tool availability can be attached to required permissions.

AI Agents Tools Open source

Vercel AI Blog·May 15

Sort providers by cost, latency, or throughput on AI Gateway

Vercel AI Gateway now enables sorting providers by cost, time to first token (TTFT), or throughput (TPS). Sorting happens at request time, automatically reflecting price changes and performance shifts without code changes. Compatible with Zero Data Retention and existing routing options.

Tools Infrastructure Business

OpenAI Blog·May 15

Databricks brings GPT-5.5 to enterprise agent workflows

Databricks integrates GPT-5.5 into enterprise agent workflows following the model's performance on OfficeQA Pro benchmark. No specific metrics or improvement figures disclosed.

GPT AI Agents OpenAI

Vercel AI Blog·May 15

Use native curl syntax with Vercel CLI

Vercel CLI now supports native curl syntax. The command accepts full URLs, bare hostnames, and the --url flag, using Vercel auth to bypass Deployment Protection.

OpenAI Blog·May 15

How data science teams use Codex

OpenAI showcases Codex use cases for data science teams: automated generation of root-cause analysis briefs, impact reports, KPI memos, scoped analyses, and dashboard specifications from real work inputs.

OpenAI Tools

OpenAI Blog·May 15

How business operations teams use Codex

OpenAI showcases Codex use cases for business operations teams: automated generation of initiative briefs, strategy updates, leadership decision packets, and progress reports from real work inputs.

OpenAI Tools Business

OpenAI Blog·May 15

A new personal finance experience in ChatGPT

OpenAI launches personal finance feature in ChatGPT Pro (US market). Users can securely connect bank accounts to receive AI-powered insights and guidance tailored to their financial context, goals, and priorities.

OpenAI Blog·May 15

How sales teams use Codex

OpenAI showcases Codex use cases for sales teams: automated generation of pipeline briefs, meeting prep packets, forecast reviews, account plans, and stalled-deal diagnostics from real business data inputs.

Vercel AI Blog·May 15

Trace any Vercel request from the CLI

Vercel adds OpenTelemetry trace generation via CLI. Commands `vercel curl --trace` and `vercel traces get` enable trace generation and retrieval by request ID. Available on all plans.

AI Agents Code generation

Simon Willison·May 14

Not so locked in any more

Coding agents lower the maintenance cost of legacy apps, enabling companies to migrate to technologies like React Native without fear of lock-in. Mitchell Hashimoto notes that programming languages are no longer lock-in: a wrong technology decision can be corrected by an AI-assisted rewrite.

Latent Space·May 14

AI-Native Healthcare: 100M Doctor Visits, 10–20 Hours Saved, Prior Auth in Minutes — Janie Lee & Chai Asawa, Abridge

Abridge converts patient-clinician conversations into healthcare's operating system. Platform processes 100M doctor visits, saves 10-20 hours per clinician, and reduces prior authorization to minutes using AI.

AI Agents Voice Business

Simon Willison·May 14

datasette-agent 0.1a1

Release of datasette-agent 0.1a1. Version now uses `execute-sql` permission when deciding which tables to list to the user.

AI Agents Tools Open source

OpenAI Blog·May 14

Sea's View on the Future of Agentic Software Development with Codex

Sea Limited is deploying Codex across engineering teams to accelerate AI-native software development in Asia, according to the company's CPO. The strategy addresses regional productivity and scalability challenges. No impact metrics or timeline disclosed in the excerpt.

Claude Code OpenAI AI Agents

Hugging Face Blog·May 14

Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

IBM and Hugging Face release Granite Embedding Multilingual R2, an open-source embedding model under Apache 2.0 license. The model supports 32K token context and delivers best-in-class retrieval quality for sub-100M parameter models across multiple languages.

Embeddings Open source RAG

OpenAI Blog·May 14

Work with Codex from anywhere

OpenAI integrates Codex into ChatGPT mobile app, enabling real-time monitoring, steering, and approval of coding tasks across devices and remote environments.

OpenAI Claude Code Tools

Simon Willison·May 14

datasette-ip-rate-limit 0.1a0

Release of datasette-ip-rate-limit 0.1a0, a configurable rate-limiting plugin for Datasette. Built with Codex (GPT-5.5 xhigh) to block aggressive crawlers. Production config on datasette.io with per-path rules (60 requests/60s, 20s block).

Tools Open source Code generation
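The rule quoted in the datasette-ip-rate-limit summary (60 requests per 60 seconds, then a 20-second block) can be sketched as a sliding-window limiter. This is not the plugin's implementation: the class name, the in-memory storage, and the reading of "20s block" as a fixed rejection period are assumptions, and per-path rules are left out.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class IpRateLimiter:
    """Allow at most `max_requests` per `window` seconds per IP; an IP that
    exceeds the limit is rejected for `block_seconds`."""

    def __init__(self, max_requests: int = 60, window: float = 60.0,
                 block_seconds: float = 20.0) -> None:
        self.max_requests = max_requests
        self.window = window
        self.block_seconds = block_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)
        self._blocked_until: Dict[str, float] = {}

    def allow(self, ip: str, now: float) -> bool:
        """Return True if the request at time `now` (seconds) should be served."""
        if now < self._blocked_until.get(ip, 0.0):
            return False
        hits = self._hits[ip]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            self._blocked_until[ip] = now + self.block_seconds
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = IpRateLimiter()
    served = sum(limiter.allow("203.0.113.7", 0.0) for _ in range(61))
    print(served)                              # requests served in the burst
    print(limiter.allow("203.0.113.7", 10.0))  # still inside the block period
    print(limiter.allow("203.0.113.7", 60.0))  # window has emptied again
```

Passing `now` explicitly keeps the sketch deterministic. A real deployment would read a monotonic clock and would need shared state if several workers serve the same site.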

Latent Space·May 14

[AINews] Codex Rises, Claude Meters Programmatic Usage

Long-term trend of major coding agents. Codex rises back into focus. Claude meters programmatic usage with detailed metrics.

Claude Code generation AI Agents

Hugging Face Blog·May 14

Unlocking asynchronicity in continuous batching

Hugging Face introduces an asynchronicity technique for optimizing continuous batching in inference servers. The method improves throughput by handling requests non-blockingly, reducing latency and increasing GPU resource utilization.

Infrastructure Tools Open source

OpenAI Blog·May 14

Helping ChatGPT better recognize context in sensitive conversations

OpenAI improves ChatGPT's context awareness in sensitive conversations through safety updates. The system now better detects risks over time and responds more safely. No technical details or impact metrics disclosed.

AI safety

Vercel AI Blog·May 14

Protected Source Maps: Ship browser source maps securely

Vercel introduces Protected Source Maps, restricting access to production .map files via Vercel Authentication. Authorized teams can fetch them for debugging minified code; others receive a 404. Enabled by default for new projects.

OpenAI Code generation Tools

Simon Willison·May 13

Welcome to the Datasette blog

Datasette launches an official blog built with OpenAI Codex desktop. Simon Willison used the Markdown session transcript export feature to document the build process.

OpenAI Blog·May 13

Building a safe, effective sandbox to enable Codex on Windows

OpenAI built a secure sandbox for Codex on Windows, enabling safe coding agents with controlled file access and network restrictions. No technical details or benchmarks provided in the excerpt.

Claude Code OpenAI AI safety

Simon Willison·May 13

CSP Allow-list Experiment

Simon Willison presents an experimental tool that loads an app in a CSP-protected sandboxed iframe with a custom fetch() intercepting CSP errors and passing them to the parent window to dynamically add domains to the allow-list. Built with GPT-5.5 xhigh in Codex.

Tools Code generation

OpenAI Blog·May 13

Our response to the TanStack npm supply chain attack

OpenAI details its response to the TanStack "Mini Shai-Hulud" supply chain attack, outlines protections for systems and signing certificates, and mandates macOS app updates by June 12, 2026. The post also describes strengthened defenses against evolving supply chain threats.

OpenAI AI safety

Vercel AI Blog·May 13

Trusted Sources for Deployment Protection

Vercel introduces Trusted Sources, a security mechanism using short-lived OIDC tokens to authorize protected deployments without sharing long-lived secrets. Vercel projects and external services (GitHub Actions, etc.) can be authorized via customizable from/to rules per environment.

Infrastructure AI safety Tools

Simon Willison·May 12

datasette 1.0a29

Datasette 1.0a29 adds TokenRestrictions.abbreviated() utility method, improves table header visibility for empty tables, fixes Mobile Safari column actions dialog bug, and resolves a race condition between Datasette.close() and Database.close() causing segfaults.

Open source Tools Infrastructure

Vercel AI Blog·May 12

Create Vercel Firewall rules with natural language

Vercel Firewall now enables creating custom WAF rules using natural language. Users describe desired behavior and the dashboard generates the rule. Available via dashboard or Vercel CLI.

Simon Willison·May 12

llm 0.32a2

llm 0.32a2 adds support for OpenAI's `/v1/responses` endpoint for reasoning-capable models (GPT-5 class). Displays summarized reasoning tokens in distinct color. Use `-R` or `--hide-reasoning` flags to hide them.

OpenAI Reasoning Tools

Interconnects (Nathan Lambert)·May 12

How open model ecosystems compound

Analysis of China's high-participation, open-first AI ecosystem. Reflections on compounding effects and innovation dynamics within this decentralized model.

Open source Business

OpenAI Blog·May 12

How finance teams use Codex

OpenAI showcases Codex use cases for finance teams: building MBRs, reporting packs, variance bridges, model checks, and planning scenarios from real work inputs. No quantified results or benchmarks provided.

Gemini Multi-agent AI Agents

Google DeepMind·May 12

Co-Scientist: A multi-agent AI partner to accelerate research

Google DeepMind introduces Co-Scientist, a multi-agent AI partner built with Gemini to accelerate scientific breakthroughs as a collaborative research assistant.

Vercel AI Blog·May 12

Fast mode for Opus 4.7 available on AI Gateway

Vercel AI Gateway launches Fast Mode for Claude Opus 4.7 in research preview. Output token generation is ~2.5x faster while maintaining full Opus 4.7 intelligence. Pricing: 6x standard rates (input $30/1M, output $150/1M tokens).

Claude Claude Code Infrastructure
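The Fast Mode pricing in the summary above lends itself to a quick cost comparison. The sketch below takes the fast rates and the 6x multiplier from the summary; the implied standard rates are derived by division rather than stated in the source, and the function name and token counts are hypothetical. It ignores any caching or batch discounts.

```python
FAST_INPUT_USD_PER_M = 30.0    # from the summary
FAST_OUTPUT_USD_PER_M = 150.0  # from the summary
FAST_MULTIPLIER = 6.0          # "6x standard rates", from the summary


def cost_usd(input_tokens: int, output_tokens: int, fast: bool) -> float:
    """Estimate request cost; standard rates are derived as fast rate / 6."""
    divisor = 1.0 if fast else FAST_MULTIPLIER
    input_rate = FAST_INPUT_USD_PER_M / divisor
    output_rate = FAST_OUTPUT_USD_PER_M / divisor
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens, 20k output tokens.
    print(cost_usd(200_000, 20_000, fast=True))
    print(cost_usd(200_000, 20_000, fast=False))
```

Since the reported speedup (~2.5x) is smaller than the price multiplier (6x), the trade only pays off where latency matters more than cost per token.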

Latent Space·May 12

[AINews] Thinking Machines' Native Interaction Models - TML-Interaction-Small 276B-A12B - advances SOTA Realtime Voice and kills standard VAD

Thinking Machines releases TML-Interaction-Small, a 276B parameter model with 12B active parameters, advancing SOTA in realtime voice and eliminating the need for standard Voice Activity Detection.

Voice AI Agents

Vercel AI Blog·May 12

Manage Vercel Firewall in the CLI

Vercel adds Firewall management via CLI. New `vercel firewall` command enables configuration of custom rules, IP blocks, system bypasses, and attack modes from the terminal. An agent skill documents safe rollout best practices.

Tools AI Agents Infrastructure

Vercel AI Blog·May 12

AI Gateway production index

Vercel releases production index from 7 months of AI Gateway traffic (200K+ teams). April 2026 spend: Anthropic 61%, Google 21%, OpenAI 12%; token volume: Google 38%, Anthropic 26%, OpenAI 13%, xAI 10%. Premium models (Claude Opus) dominate high-stakes workloads, cheap fast models (Gemini Flash) drive volume.

Benchmarks Claude Gemini

Vercel AI Blog·May 12

Node.js 26.x now available on Vercel Sandboxes

Vercel Sandbox now supports Node.js 26. Users must upgrade @vercel/sandbox to 1.10.22 or 0.0-beta.19 (v2) and set runtime property to node26.

Infrastructure Tools

OpenAI Blog·May 12

AutoScout24 scales engineering with AI-powered workflows

AutoScout24 Group uses Codex and ChatGPT to speed development cycles and improve code quality. The article describes AI tool adoption for software engineering without disclosing specific metrics or measurable impact.

Claude Code OpenAI Business

OpenAI Blog·May 12

What Parameter Golf taught us about AI-assisted research

OpenAI's Parameter Golf competition gathered 1,000+ participants and 2,000+ submissions to explore AI-assisted ML research, coding agents, quantization, and novel model design under strict constraints. The initiative demonstrates how AI tools accelerate research innovation.

OpenAI AI Agents Benchmarks

OpenAI Blog·May 12

How NVIDIA engineers and researchers build with Codex

OpenAI showcases how NVIDIA engineers use Codex with GPT-5.5 to deploy production systems and convert research ideas into runnable experiments.

Claude Code OpenAI GPT

Simon Willison·May 11

Thoughts on GitLab's "workforce reduction" and "structural and strategic decisions"

GitLab announces a workforce reduction and restructuring for the agentic era: cutting 30% of countries with small teams (out of ~60 countries), flattening the organization (removing 3 management layers), and reorganizing R&D into ~60 smaller autonomous teams with end-to-end ownership.

AI Agents Business

Hugging Face Blog·May 11

Building Blocks for Foundation Model Training and Inference on AWS

Hugging Face and AWS collaborate to provide optimized building blocks for foundation model training and inference on AWS infrastructure, including SageMaker integrations and open-source tools.

Infrastructure Open source Tools

Simon Willison·May 11

Quoting James Shore

James Shore argues that AI coding agents must reduce maintenance costs, not just accelerate output. Doubling output without halving the per-unit maintenance cost creates permanent technical debt: 2× output at 1× per-unit maintenance cost means 2× total maintenance cost.

AI Agents Code generation
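The arithmetic behind the James Shore argument above can be checked in a few lines. This sketch assumes total maintenance load scales linearly with the amount of code produced, which is a simplification introduced here rather than something the quoted piece specifies. The function name is hypothetical.

```python
def maintenance_load(output_multiplier: float,
                     per_unit_cost_multiplier: float) -> float:
    """Total maintenance load relative to a baseline of 1.0, assuming load is
    the amount of code times the per-unit cost of maintaining it."""
    return output_multiplier * per_unit_cost_multiplier


if __name__ == "__main__":
    # Twice the output, same per-unit maintenance cost.
    print(maintenance_load(2.0, 1.0))
    # Twice the output, per-unit maintenance cost halved.
    print(maintenance_load(2.0, 0.5))
```

Under this model, halving per-unit maintenance cost only brings a team that doubled its output back to where it started, which is why the argument treats it as a minimum requirement.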

Simon Willison·May 11

Your AI Use Is Breaking My Brain

Jason Koebler criticizes the proliferation of AI-generated content online and its cognitive toll. He introduces the term "Zombie Internet": a mix of bots, humans using AI, and automated agents generating spam content for monetization (YouTube, blogs, social media). Filtering this pollution becomes mentally exhausting and distorts human writing styles.

AI safety Regulation

Simon Willison·May 11

Using LLM in the shebang line of a script

Simon Willison documents using LLM in script shebang lines. The LLM CLI supports fragments (-f), tool calls (-T), and YAML templates defining Python functions. Examples: generate SVG, write haiku with current time, or run calculations with gpt-5.4-mini.

Tools Code generation Prompt engineering

Simon Willison·May 11

Learning on the Shop floor

Tobias Lütke describes River, Shopify's internal coding agent tool, operating entirely in public on Slack. River declines direct messages and enforces public channel conversations, creating an osmosis learning environment where all employees observe work and learn from each other without formal curriculum.

AI Agents Code generation Tools

OpenAI Blog·May 11

How ChatGPT adoption broadened in early 2026

ChatGPT saw rapid adoption growth in Q1 2026, with the fastest growth among users over 35 and a more balanced gender distribution. The data signals mainstream AI adoption beyond early adopters.

OpenAI Blog·May 11

OpenAI Campus Network: Student club interest form

OpenAI launches a global student club network that lets clubs access AI tools, host events, and build campus communities. The interest form is now open to student clubs worldwide.

OpenAI Blog·May 11

How enterprises are scaling AI

OpenAI publishes guidance on scaling AI in enterprises, covering governance, workflow design, and quality assurance. The article emphasizes trust and structured processes to move from early experiments to sustained impact. No specific models or metrics disclosed in the excerpt.

Vercel AI Blog·May 11

Automate progressive rollouts with Vercel Flags

Vercel Flags now supports progressive rollouts, enabling feature deployment to a growing percentage of users on a predefined schedule. Unlike fixed weighted splits, each stage has a target percentage and duration, catching regressions on a small user slice before full rollout. Available via dashboard and CLI.
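A staged rollout of the kind described above can be sketched as a schedule of (target percentage, duration) stages plus a stable per-user bucket. This is not Vercel's implementation or API: the stage values, the SHA-256 bucketing scheme, and all function names are assumptions for illustration.

```python
import hashlib
from typing import List, Tuple

# (target percentage, duration in hours) per stage; values are illustrative.
Stage = Tuple[float, float]


def current_percentage(stages: List[Stage], hours_elapsed: float) -> float:
    """Return the target percentage of the stage active at `hours_elapsed`.
    Once every stage's duration has passed, the final target stays in effect."""
    elapsed = 0.0
    for target, duration in stages:
        elapsed += duration
        if hours_elapsed < elapsed:
            return target
    return stages[-1][0]


def bucket(user_id: str, flag_key: str) -> float:
    """Map a user to a stable value in [0, 100) with 0.01 granularity."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).digest()
    return (int.from_bytes(digest[:8], "big") % 10_000) / 100


def is_enabled(user_id: str, flag_key: str, stages: List[Stage],
               hours_elapsed: float) -> bool:
    """A user sees the feature once the rollout percentage passes their bucket."""
    return bucket(user_id, flag_key) < current_percentage(stages, hours_elapsed)


if __name__ == "__main__":
    schedule = [(1.0, 24.0), (10.0, 24.0), (100.0, 0.0)]
    for hours in (0, 30, 100):
        print(hours, current_percentage(schedule, hours))
```

Because each user's bucket is fixed, anyone enabled at an early stage stays enabled as the percentage grows, so a regression caught on the first small slice affects the same users throughout.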