Intermediate

Models in ChatGPT

ChatGPT gives you access to a family of models spanning instant responses, deep reasoning, and long-context document analysis. Understanding the differences between them — and which plan unlocks each — lets you choose the right tool for each task rather than defaulting to the most expensive option.

Quick Reference — Model Comparison

Model	Context	API Input ($/1M)	API Output ($/1M)	Access	Best For
GPT-5 Instant	128K	—	—	Free / Go / Plus / Pro	Everyday tasks, drafting, fast responses
GPT-5 Thinking	196K	—	—	Plus / Pro / Team / Enterprise	Complex reasoning, analysis, detailed tasks
GPT-5 Pro	1M	—	—	Pro only ($200/mo)	Maximum quality, research-grade tasks
o3 (API only)	200K	$10.00	$40.00	API only	Highest reasoning quality, agentic pipelines
o4-mini (API only)	100K	$1.10	$4.40	API only	Cost-efficient reasoning at scale

ChatGPT plans are billed by subscription, with no per-token charges, so the GPT-5 tiers show no API price here. API prices are as of early 2026; verify them, along with the current list of API models, at openai.com/api/pricing.
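To make the per-token figures concrete, here is a minimal sketch of a per-request cost estimate. The prices are copied from the table above and may be out of date; the function name and the token counts are illustrative assumptions. It also assumes that any reasoning tokens are counted as output tokens for billing, which you should confirm on the pricing page.

```python
# Prices in USD per 1M tokens, copied from the comparison table in this guide.
# They are a snapshot and may be out of date.
PRICES_PER_MILLION = {
    "o3": {"input": 10.00, "output": 40.00},
    "o4-mini": {"input": 1.10, "output": 4.40},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one API request (hypothetical helper)."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 20,000 input tokens and 2,000 output tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, input_tokens=20_000, output_tokens=2_000)
        print(f"{model}: ${cost:.4f}")
```

For the same token counts, the o3 estimate comes out about nine times the o4-mini estimate, because both the input and the output price differ by the same factor (10.00 / 1.10 and 40.00 / 4.40).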

A Note on Parameters

OpenAI does not publicly disclose parameter counts for any model in the GPT-4 or GPT-5 generation. Reported figures in the press are estimates or leaks and should not be treated as authoritative. What OpenAI does characterise is relative capability, speed, and cost — and those are the dimensions that matter for practical use.

The GPT-5 Family

GPT-5 is the current foundation model generation powering ChatGPT. OpenAI offers it in three tiers, each representing a different compute budget:

GPT-5 Instant

The fastest and most cost-efficient variant, with a 128K-token context window. Available on the Free, Go, Plus, and Pro plans. Handles everyday tasks (drafting, summarisation, Q&A, light coding) with low latency. Not suitable for tasks requiring deep multi-step reasoning or exhaustive analysis.

GPT-5 Thinking

The extended reasoning variant. Targets a 196K token context window. Available on Plus and above. Before producing its response, the model performs an internal chain-of-thought process — breaking down problems, exploring approaches, and self-checking conclusions. This produces noticeably better results on complex reasoning, maths, legal analysis, and technical writing, at the cost of higher latency and token consumption.

GPT-5 Pro

The maximum compute variant, exclusive to the Pro plan. Used for the hardest research-grade tasks where accuracy and depth matter more than speed. Powers the most demanding Deep Research sessions and the highest-quality outputs across all task types. Not available on any other plan.

o-Series Reasoning Models

Status update (Feb 2026): o3 and o4-mini were retired as selectable model options in the ChatGPT interface (web, iOS, Android, Mac, Windows) in February 2026. Reasoning capability is now surfaced as GPT-5 Thinking in the ChatGPT interface. Both o3 and o4-mini remain fully available via the OpenAI API — if you are building applications, they are still the recommended choice for reasoning-intensive tasks.

The o-series models are architecturally distinct from GPT-5. Rather than generating a response immediately, they allocate additional "thinking time" — producing a long internal chain-of-thought before arriving at an answer. This makes them significantly stronger on formal logic, mathematics, code correctness, and structured analysis, but slower and more expensive per query.

o3

The full reasoning model, with a 200K context window, available via the API. Excels at olympiad-level mathematics, complex code debugging, scientific reasoning, and multi-step logical proofs. Supports native tool use during reasoning: it can call web search, code execution, and file analysis from within its thinking trace.

o4-mini

A faster, cheaper reasoning model, with a 100K context window, available via the API. Still significantly stronger than GPT-5 Instant on reasoning tasks. At $1.10/$4.40 per 1M input/output tokens, it is roughly 9× cheaper than o3 ($10.00/$40.00) while staying close to o3 on many benchmarks. The default choice for reasoning tasks in API-based applications when cost matters.

GPT-4o and GPT-4.1

GPT-4o remains available as OpenAI's foundational multimodal model (text, image, audio, vision) with a 128K context window. In ChatGPT it has largely been superseded by GPT-5 Instant for most tasks, but it is still widely used via the API, where its input price ($2.50 per 1M tokens) is well below o3's $10.00.

GPT-4.1 is an API-only model not surfaced in the ChatGPT interface. Its defining characteristic is a 1M token context window, making it the specialist for long-context tasks: entire codebases, lengthy legal documents, book-length manuscripts. Standard pricing, no reasoning overhead, but exceptional at tasks requiring broad document comprehension.
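One way to act on these context-window figures is to check which API models can hold a given prompt. The sketch below is a rough illustration: the window sizes are the figures quoted in this guide (treating 1K as 1,000 tokens), the four-characters-per-token rule is only a crude assumption for English text, and the helper names and the 4,000-token output reserve are hypothetical.

```python
# Context windows in tokens, using the figures quoted in this guide
# (1K is treated as 1,000 tokens, which is an approximation).
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "o4-mini": 100_000,
    "o3": 200_000,
    "GPT-4.1": 1_000_000,
}


def rough_token_count(text: str) -> int:
    """Crude estimate: about 4 characters per token (assumption, English text)."""
    return len(text) // 4 + 1


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 4_000) -> list[str]:
    """Return models whose window holds prompt plus reserved output, smallest first."""
    needed = prompt_tokens + reserved_output_tokens
    fitting = [name for name, window in CONTEXT_WINDOWS.items() if window >= needed]
    return sorted(fitting, key=CONTEXT_WINDOWS.get)


if __name__ == "__main__":
    # A 150,000-token prompt no longer fits GPT-4o or o4-mini.
    print(models_that_fit(150_000))
```

For real workloads, a proper tokenizer gives a far more reliable count than the character heuristic.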

Plan Availability Summary

Model	Context	Available On	Primary Use
GPT-5 Instant	128K	Free, Go, Plus, Pro	Everyday tasks, fast responses
GPT-5 Thinking	196K	Plus, Pro, Team, Enterprise	Complex reasoning, detailed analysis
GPT-5 Pro	Large	Pro only	Maximum quality, research-grade output
o3	200K	API only	Formal logic, maths, code correctness
o4-mini	100K	API only	Fast, cost-efficient reasoning
GPT-4o	128K	API (all); ChatGPT (limited)	Multimodal tasks, API cost efficiency
GPT-4.1	1M	API only	Long-context document analysis
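The ChatGPT rows of this table can be expressed as a small lookup. This is a sketch only: it mirrors the GPT-5 rows as printed here, the names are hypothetical, and plan entitlements change, so treat the data as a snapshot.

```python
# Plan access for the GPT-5 tiers, mirroring the table in this guide.
CHATGPT_ACCESS = {
    "GPT-5 Instant": {"Free", "Go", "Plus", "Pro"},
    "GPT-5 Thinking": {"Plus", "Pro", "Team", "Enterprise"},
    "GPT-5 Pro": {"Pro"},
}


def models_for_plan(plan: str) -> list[str]:
    """Return the GPT-5 tiers listed for a plan, in table order."""
    return [model for model, plans in CHATGPT_ACCESS.items() if plan in plans]


if __name__ == "__main__":
    for plan in ("Free", "Plus", "Pro"):
        print(plan, "->", models_for_plan(plan))
```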

How Reasoning Models Differ

Standard GPT-5 models such as GPT-5 Instant respond by predicting the most likely next token given the conversation history, which is a fast, fluent process. Reasoning models (o3, o4-mini) insert a deliberate thinking phase before generating output. This internal monologue is hidden from users; in the API, only its size is reported, as a reasoning_tokens count in the response's usage data. The model may explore multiple solution paths, catch errors in its own reasoning, and revise its approach before producing the final answer. This is why o3 can solve problems that GPT-5 Instant fails on, even though both are "large language models."
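To see how much of a response went to hidden thinking, you can inspect the usage data. The sketch below assumes the usage object has already been parsed into a dictionary and follows the Chat Completions layout, where reasoning_tokens sits under completion_tokens_details; other endpoints may name these fields differently. The helper name is hypothetical and the sample numbers are made up for illustration.

```python
def reasoning_token_share(usage: dict) -> tuple[int, float]:
    """Return (reasoning_tokens, fraction of completion tokens spent on reasoning)."""
    completion = usage.get("completion_tokens", 0)
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    share = reasoning / completion if completion else 0.0
    return reasoning, share


# Illustrative usage payload with invented numbers, not real API output.
sample_usage = {
    "prompt_tokens": 120,
    "completion_tokens": 1_000,
    "total_tokens": 1_120,
    "completion_tokens_details": {"reasoning_tokens": 750},
}

if __name__ == "__main__":
    tokens, share = reasoning_token_share(sample_usage)
    print(f"{tokens} reasoning tokens ({share:.0%} of completion tokens)")
```

A high share means most of the output budget was spent on thinking rather than on the visible answer, which is worth tracking when estimating cost and latency.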

The practical implication: use GPT-5 Instant for speed-sensitive tasks (chat, drafting, summarisation), switch to o3 or GPT-5 Thinking when accuracy matters more than latency (debugging, proofs, complex analysis), and reserve GPT-5 Pro for the most demanding research-grade outputs.
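The rule of thumb above can be written as a tiny routing function. This is only a sketch of the guidance in this section, not an official selection policy; the function name and its two flags are hypothetical.

```python
def choose_chatgpt_model(accuracy_over_latency: bool, research_grade: bool = False) -> str:
    """Pick a ChatGPT model following the rule of thumb in this guide."""
    if research_grade:
        # Most demanding research-grade work; requires the Pro plan.
        return "GPT-5 Pro"
    if accuracy_over_latency:
        # Debugging, proofs, complex analysis. Via the API, o3 fills this role.
        return "GPT-5 Thinking"
    # Chat, drafting, summarisation, and other speed-sensitive tasks.
    return "GPT-5 Instant"


if __name__ == "__main__":
    print(choose_chatgpt_model(accuracy_over_latency=False))
    print(choose_chatgpt_model(accuracy_over_latency=True))
    print(choose_chatgpt_model(accuracy_over_latency=True, research_grade=True))
```

In practice the decision also depends on which plan you are on, so a real router would combine this with a plan check.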

Checklist

What distinguishes GPT-5 Thinking from GPT-5 Instant beyond the context window difference?
How does the o-series reasoning process differ architecturally from GPT-5?
Which model is available exclusively on the Pro plan and why?
What makes GPT-4.1 useful despite being superseded by GPT-5 in most tasks?
When would you choose o4-mini over full o3 for a reasoning task?