OpenAI Blog

Introducing GPT-5.4

Signal: 85, Hype: 35
In three lines: OpenAI releases GPT-5.4, its most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and a 1M-token context window.

**Context**

OpenAI is releasing GPT-5.4, positioned as its "most capable and efficient frontier model for professional work." The announcement lands inside an accelerated release cadence: GPT-4o, GPT-4.5, and GPT-5 shipped in quick succession, and the .4 versioning signals continuous iteration rather than annual cycles. The timing looks deliberate: Anthropic recently shipped Claude 3.7 Sonnet with extended thinking, Google established Gemini 2.5 Pro at the top of coding benchmarks, and Meta released Llama 4 as an open-weight model. GPT-5.4 responds on all of these fronts at once, targeting four explicit areas: code generation, computer use, tool search, and long context.

The "efficient" framing matters as much as "capable." Since GPT-4, OpenAI has faced sustained criticism over inference costs that slow large-scale production adoption. GPT-5.4 appears to address that pressure by pairing SOTA performance with an improved cost per token, a balance that Claude 3.5 Haiku and Gemini 2.0 Flash have already begun to establish as the market baseline.

**Key Facts**

- **1M-token context window**: roughly 8x the 128k-token maximum of GPT-4o, reaching parity with Gemini 1.5 Pro/2.5 Pro, which had held this figure as a commercial differentiator since early 2024.
- **Coding SOTA**: state-of-the-art coding performance is claimed. No specific benchmark is cited in the available extract, but the implicit target is SWE-bench Verified, where GPT-4o plateaued around 33% and Claude 3.7 Sonnet recently reached ~62%.
- **Computer Use**: GUI interaction capability, a feature Anthropic introduced with Claude 3.5 Sonnet in October 2024 and an area where OpenAI had a visible gap.
- **Tool Search**: dynamic selection and retrieval across large tool catalogs, which is critical for multi-step agentic architectures where the model must choose among dozens of available functions.
- **"Professional" positioning**: explicit enterprise/pro segment targeting, distinct from consumer use cases, likely anticipating differentiated API and ChatGPT Team/Enterprise pricing tiers.
- **Efficiency**: the "efficient" label implies an improved performance-to-cost ratio; no API pricing figures are available in the source extract.
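To make the tool search item concrete: the available extract does not say how GPT-5.4 selects tools, so the following is only an illustrative sketch of the general idea, using plain keyword overlap between a request and each tool description as a stand-in for whatever the model does internally. The `Tool` class, `search_tools` function, and the catalog entries are all hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """A hypothetical catalog entry: a tool name plus a short description."""
    name: str
    description: str


# Hypothetical catalog; a real agent might expose dozens or hundreds of tools.
CATALOG = [
    Tool("open_pull_request", "Create a pull request on GitHub"),
    Tool("send_email", "Send an email message to a recipient"),
    Tool("query_database", "Run a SQL query against the analytics database"),
    Tool("get_weather", "Look up the weather forecast for a city"),
]


def _tokens(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_tools(query: str, catalog: list[Tool], top_k: int = 3) -> list[Tool]:
    """Return up to top_k tools ranked by word overlap with the query.

    Assumption: keyword overlap is a deliberately naive scoring rule,
    chosen only to keep the sketch self-contained.
    """
    query_words = _tokens(query)
    scored = []
    for tool in catalog:
        overlap = len(query_words & _tokens(tool.name + " " + tool.description))
        if overlap > 0:
            scored.append((overlap, tool))
    # Stable sort: tools with equal scores keep their catalog order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:top_k]]
```

Only the shortlisted tools would then be placed in the prompt, which is the practical benefit of searching a catalog instead of sending every function definition on every call. A production system would more plausibly rank with embeddings than with word overlap.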

**Why It Matters**

GPT-5.4 closes two gaps OpenAI had visibly ceded: long context (dominated by Google since Gemini 1.5) and computer use (introduced by Anthropic). Patching both in a single release weakens the main argument that pushed engineering teams toward multi-vendor integrations. The most immediate loser is Anthropic: Claude 3.7 Sonnet had become the de facto reference for agentic coding, and GPT-5.4 attacks that positioning directly. Google loses its 1M-token differentiator, which it had held exclusively for over a year. For wrapper and orchestration vendors (LangChain, LlamaIndex, Cursor, Cognition), a model with native tool search compresses the value of their abstraction layers, a signal worth tracking on their product roadmaps.

**Who This Actually Affects**

Developers building code agents (Devin-style, PR automation, large-scale refactoring) now have a competitive OpenAI option against Claude 3.7 Sonnet without having to trade ecosystem fit for performance. The 1M-token context unlocks concrete use cases: ingesting entire codebases (an average startup repo fits in ~500k tokens), processing long transcripts, or running RAG pipelines without chunking. For enterprise teams already on Azure OpenAI Service, native integration without infrastructure migration is a direct operational advantage. On the other hand, founders who built product differentiation solely around access to Claude or Gemini need to reassess their moat: if GPT-5.4 genuinely reaches parity on coding and computer use, "we use the best model" becomes a weaker standalone competitive argument.
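Whether a particular codebase fits in a 1M-token window can be estimated roughly before sending anything to a model. The sketch below assumes about 4 characters per token, a common rule of thumb for English text and code that is not derived from GPT-5.4's actual tokenizer. The function names, the default file suffixes, and the 50,000-token output reserve are all hypothetical choices made for illustration.

```python
from pathlib import Path

CONTEXT_WINDOW_TOKENS = 1_000_000  # window size stated in the announcement
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def estimate_repo_tokens(root: Path, suffixes: tuple[str, ...] = (".py", ".md")) -> int:
    """Sum the estimated tokens of every file under root with a matching suffix."""
    total = 0
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += estimate_tokens(text)
    return total


def fits_in_context(token_count: int, reserve_for_output: int = 50_000) -> bool:
    """Check that the input plus a reserved output budget stays within the window."""
    return token_count + reserve_for_output <= CONTEXT_WINDOW_TOKENS
```

For example, `fits_in_context(estimate_repo_tokens(Path(".")))` gives a quick yes/no for the current directory. For a real decision, the estimate should be replaced with counts from the provider's own tokenizer, since the characters-per-token ratio varies with language and file type.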

Tags: GPT, OpenAI, Code generation, AI Agents

Summary generated by Claude — human-verified