August 7, 2025 · News

Introducing GPT-5

Reed Vogt

CEO and Head Engineer

12 min read

Our smartest, fastest, most useful model yet, with built-in thinking that puts expert-level intelligence in everyone's hands.

GPT-5 Overview

We are introducing GPT‑5, our best AI system yet. GPT‑5 is a significant leap in intelligence over all our previous models, featuring state-of-the-art performance across coding, math, writing, health, visual perception, and more. It is a unified system that knows when to respond quickly and when to think longer to provide expert-level responses. GPT‑5 is available to all users, with Plus subscribers getting more usage, and Pro subscribers getting access to GPT‑5 pro, a version with extended reasoning for even more comprehensive and accurate answers.

One unified system

GPT‑5 is a unified system with a smart, efficient model that answers most questions, a deeper reasoning model (GPT‑5 thinking) for harder problems, and a real‑time router that quickly decides which to use based on conversation type, complexity, tool needs, and your explicit intent (for example, if you say "think hard about this" in the prompt). The router is continuously trained on real signals, including when users switch models, preference rates for responses, and measured correctness, improving over time. Once usage limits are reached, a mini version of each model handles remaining queries. In the near future, we plan to integrate these capabilities into a single model.
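The routing behavior described above can be pictured with a toy sketch. It is only an illustration under stated assumptions, not OpenAI's router: the real router is described as continuously trained on usage signals, whereas this sketch uses fixed rules. The `Request` fields, the `complexity` score, the 0.6 threshold, the hint phrases, the choice to send tool-using requests to the reasoning model, and the model labels are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical phrases that signal explicit intent to reason longer.
REASONING_HINTS = ("think hard", "think carefully")


@dataclass
class Request:
    prompt: str
    needs_tools: bool = False     # assumed flag: the request requires tool use
    complexity: float = 0.0       # assumed score in [0, 1] from some upstream estimator
    limit_reached: bool = False   # True once the user's usage limit is exhausted


def route(req: Request, complexity_threshold: float = 0.6) -> str:
    """Return a hypothetical model label for the request."""
    explicit_intent = any(hint in req.prompt.lower() for hint in REASONING_HINTS)
    wants_reasoning = (
        explicit_intent
        or req.needs_tools
        or req.complexity >= complexity_threshold
    )
    base = "gpt-5-thinking" if wants_reasoning else "gpt-5-main"
    # After usage limits are reached, a mini version of each model takes over.
    return base + "-mini" if req.limit_reached else base


if __name__ == "__main__":
    print(route(Request("What is the capital of France?")))
    print(route(Request("Think hard about this proof.")))
    print(route(Request("Summarize this email.", limit_reached=True)))
```

A learned router would replace the hand-written conditions with a model trained on signals such as model switches, preference rates, and measured correctness; the fixed rules here only make the decision inputs concrete.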

GPT-5 Unified System Architecture

A smarter, more widely useful model

GPT‑5 not only outperforms previous models on benchmarks and answers questions more quickly, but—most importantly—is more useful for real-world queries. We've made significant advances in reducing hallucinations, improving instruction following, and minimizing sycophancy, while leveling up GPT‑5's performance in three of ChatGPT's most common uses: writing, coding, and health.

Coding

GPT‑5 is our strongest coding model to date. It shows particular improvements in complex front‑end generation and debugging larger repositories. It can often create beautiful and responsive websites, apps, and games with an eye for aesthetic sensibility in just one prompt, intuitively and tastefully turning ideas into reality. Early testers also noted its design choices, with a much better understanding of things like spacing, typography, and white space.

Here are some examples of what GPT‑5 has created with just one prompt:

Jumping Ball Runner Game Example

Prompt: Create a single-page app in a single HTML file with the following requirements:

- Name: Jumping Ball Runner
- Goal: Jump over obstacles to survive as long as possible.
- Features: Increasing speed, high score tracking, retry button, and funny sounds for actions and events.
- The UI should be colorful, with parallax scrolling backgrounds.
- The characters should look cartoonish and be fun to watch.
- The game should be enjoyable for everyone.

Creative expression and writing

GPT‑5 is our most capable writing collaborator yet, able to help you steer and translate rough ideas into compelling, resonant writing with literary depth and rhythm. It more reliably handles writing that involves structural ambiguity, such as sustaining unrhymed iambic pentameter or free verse that flows naturally, combining respect for form with expressive clarity. These improved writing capabilities mean that ChatGPT is better at helping you with everyday tasks like drafting and editing reports, emails, memos, and more.

Writing comparison between GPT-4o and GPT-5

Health

GPT‑5 is our best model yet for health-related questions, empowering users to be informed about and advocate for their health. The model scores significantly higher than any previous model on HealthBench, an evaluation we published earlier this year based on realistic scenarios and physician-defined criteria. Compared to previous models, it acts more like an active thought partner, proactively flagging potential concerns and asking questions to give more helpful answers.

GPT-5 Health Capabilities Comparison

Evaluations

GPT‑5 is much smarter across the board, as reflected by its performance on academic and human-evaluated benchmarks, particularly in math, coding, visual perception, and health. It sets a new state of the art across math (94.6% on AIME 2025 without tools), real-world coding (74.9% on SWE-bench Verified, 88% on Aider Polyglot), multimodal understanding (84.2% on MMMU), and health (46.2% on HealthBench Hard)—and those gains show up in everyday use.

GPT-5 Performance Benchmarks

Coding Performance

Coding Performance Benchmarks

Instruction following and agentic tool use

GPT‑5 shows significant gains in benchmarks that test instruction following and agentic tool use, the kinds of capabilities that let it reliably carry out multi-step requests, coordinate across different tools, and adapt to changes in context. In practice, this means it's better at handling complex, evolving tasks; GPT‑5 can follow your instructions more faithfully and get more of the work done end-to-end using the tools at its disposal.

Instruction Following and Tool Use Benchmarks

Multimodal

The model excels across a range of multimodal benchmarks, spanning visual, video-based, spatial, and scientific reasoning. Stronger multimodal performance means ChatGPT can reason more accurately over images and other non-text inputs—whether that's interpreting a chart, summarizing a photo of a presentation, or answering questions about a diagram.

Multimodal Performance Benchmarks

Health Performance

Health Performance Benchmarks

Economically important tasks

GPT‑5 is also our best performing model on an internal benchmark measuring model performance on complex, economically valuable knowledge work. When using reasoning, GPT‑5 is comparable to or better than experts in roughly half the cases, while outperforming o3 and ChatGPT Agent across tasks spanning over 40 occupations including law, logistics, sales, and engineering.

Economic Value Benchmarks

Faster, more efficient thinking

GPT‑5 gets more value out of less thinking time. In our evaluations, GPT‑5 (with thinking) performs better than OpenAI o3 while using 50-80% fewer output tokens across capabilities, including visual reasoning, agentic coding, and graduate-level scientific problem solving.
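To make the 50-80% range concrete, here is a small arithmetic sketch. The 10,000-token baseline is a made-up figure chosen for illustration and is not taken from any evaluation; the helper name `tokens_after_reduction` is likewise hypothetical.

```python
def tokens_after_reduction(baseline_tokens: int, reduction: float) -> int:
    """Output tokens remaining after a fractional reduction (0.5 means 50% fewer)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


if __name__ == "__main__":
    baseline = 10_000  # hypothetical output-token count for the older model
    for reduction in (0.50, 0.80):
        remaining = tokens_after_reduction(baseline, reduction)
        print(f"{reduction:.0%} fewer tokens: {remaining} tokens")
```

Under that assumed baseline, the same task would need between 2,000 and 5,000 output tokens.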

Thinking Efficiency Comparison

GPT‑5 was trained on Microsoft Azure AI supercomputers.

Building a more robust, reliable, and helpful model

More accurate answers to real-world queries

GPT‑5 is significantly less likely to hallucinate than our previous models. With web search enabled on anonymized prompts representative of ChatGPT production traffic, GPT‑5's responses are ~45% less likely to contain a factual error than GPT‑4o, and when thinking, GPT‑5's responses are ~80% less likely to contain a factual error than OpenAI o3.
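These figures are relative reductions, not percentage-point drops, and the distinction is easy to miss. The sketch below works through the arithmetic with a hypothetical baseline error rate of 10%; that baseline is an assumption for illustration only and does not come from the reported evaluations.

```python
def error_rate_after(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction (0.45 means 45% less likely) to a baseline rate."""
    return baseline_rate * (1.0 - relative_reduction)


if __name__ == "__main__":
    baseline = 0.10  # hypothetical: 10% of baseline responses contain a factual error
    new_rate = error_rate_after(baseline, 0.45)
    drop_in_points = (baseline - new_rate) * 100
    print(f"new error rate: {new_rate:.1%}")
    print(f"absolute drop: {drop_in_points:.1f} percentage points")
```

With that assumed baseline, a 45% relative reduction lowers the error rate to 5.5%, a drop of 4.5 percentage points rather than 45.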

Hallucination Reduction Metrics

More honest responses

Alongside improved factuality, GPT‑5 (with thinking) more honestly communicates its actions and capabilities to the user, especially for tasks that are impossible, underspecified, or missing key tools. To achieve a high reward during training, reasoning models may learn to lie about successfully completing a task or to be overly confident about an uncertain answer.

Honesty and Deception Metrics

Safer, more helpful responses

GPT‑5 advances the frontier on safety. In the past, ChatGPT relied primarily on refusal-based safety training: based on the user's prompt, the model should either comply or refuse. While this type of training works well for explicitly malicious prompts, it can struggle to handle situations where the user's intent is unclear, or information could be used in benign or malicious ways.

Safety and Helpfulness Metrics

Reducing sycophancy and refining style

Overall, GPT‑5 is less effusively agreeable, uses fewer unnecessary emojis, and is more subtle and thoughtful in follow‑ups compared to GPT‑4o. It should feel less like "talking to AI" and more like chatting with a helpful friend with PhD‑level intelligence.

Sycophancy Reduction Metrics

More ways to customize ChatGPT

GPT‑5 is significantly better at instruction following, and we see a corresponding improvement in its ability to follow custom instructions. We're also launching a research preview of four new preset personalities for all ChatGPT users, made possible by the improvements on steerability.

Custom Personalities Preview

Comprehensive safeguards for biological risk

We decided to treat the "GPT‑5 thinking" model as High capability in the Biological and Chemical domain, and have implemented strong safeguards to sufficiently minimize the associated risks. We rigorously tested the model with our safety evaluations under our Preparedness Framework, completing 5,000 hours of red-teaming with partners like the CAISI and UK AISI.

Biological Risk Safeguards

GPT‑5 pro

For the most challenging, complex tasks, we are also releasing GPT‑5 pro, a variant of GPT‑5 that thinks for even longer, using scaled but efficient parallel test-time compute, to provide the highest-quality and most comprehensive answers. It replaces OpenAI o3‑pro. GPT‑5 pro achieves the highest performance in the GPT‑5 family on several challenging intelligence benchmarks, including state-of-the-art performance on GPQA, which contains extremely difficult science questions.

GPT-5 Pro Performance Comparison

In evaluations on over 1,000 economically valuable, real-world reasoning prompts, external experts preferred GPT‑5 pro over "GPT‑5 thinking" 67.8% of the time. GPT‑5 pro made 22% fewer major errors and excelled in health, science, mathematics, and coding.

How to use GPT‑5

GPT‑5 is the new default in ChatGPT, replacing GPT‑4o, OpenAI o3, OpenAI o4-mini, GPT‑4.1, and GPT‑4.5 for signed-in users. Just open ChatGPT and type your question; GPT‑5 handles the rest, applying reasoning automatically when the response would benefit from it. Paid users can still select "GPT‑5 thinking" from the model picker, or type something like "think hard about this" in the prompt to ensure reasoning is used when generating a response.

GPT-5 Usage Interface

Availability and access

GPT‑5 is starting to roll out today to all Plus, Pro, Team, and Free users, with access for Enterprise and Edu coming in one week. Pro, Plus, and Team users can also start coding with GPT‑5 in the Codex CLI by signing in with ChatGPT.

As with GPT‑4o, the difference between free and paid access to GPT‑5 is usage volume. Pro subscribers get unlimited access to GPT‑5, plus access to GPT‑5 pro. Plus users can use GPT‑5 comfortably as their default model for everyday questions, with significantly higher usage limits than free users. Team, Enterprise, and Edu customers can also use GPT‑5 comfortably as their default model for everyday work, with generous limits that make it easy for entire organizations to rely on it.

GPT-5 Access Tiers

For ChatGPT free-tier users, full reasoning capabilities may take a few days to fully roll out. Once free users reach their GPT‑5 usage limits, they will transition to GPT‑5 mini, a smaller, faster, and highly capable model.

The Future of AI

GPT‑5 represents a monumental leap forward in AI capabilities, bringing us closer to the vision of artificial general intelligence. With its unified architecture, advanced reasoning, and comprehensive safety measures, GPT‑5 is designed to be the most helpful, honest, and harmless AI assistant yet created.