Beginner

Comparing Claude Models

The three Claude tiers differ on three axes: speed, cost, and quality. Choosing correctly matters — defaulting to Opus for everything wastes money; defaulting to Haiku for everything sacrifices quality. This page gives you the decision matrix.

The Three Axes

Speed (Latency)

Haiku > Sonnet > Opus. Haiku typically responds 3–5× faster than Opus for the same prompt. In real-time chat or voice pipelines, latency is often the binding constraint.

Cost

Haiku is the cheapest by a significant margin. Running Opus costs roughly 10–20× more per million tokens than Haiku. At scale, this difference is substantial.

Quality

Opus > Sonnet > Haiku. But the gap between Sonnet and Haiku is larger than the gap between Opus and Sonnet for most common tasks. Sonnet gives most of the quality at a fraction of the cost.

Decision Matrix

Task Type	Recommended Tier	Reason
Classification, labelling, routing	Haiku	Simple output (a label or short decision), runs millions of times — cost dominates
Short extractions (dates, names, entities)	Haiku	Structured, well-defined output; quality ceiling is low — Haiku is sufficient
Summarisation (short documents)	Haiku	Straightforward compression task; Haiku handles it well at 1/10th the cost
Coding assistance, bug fixing	Sonnet	Needs genuine code understanding; Sonnet is excellent on SWE-bench; Haiku misses edge cases
Writing, editing, drafting	Sonnet	Quality of output matters; Sonnet produces noticeably better prose than Haiku
Analysis, research synthesis	Sonnet	Multi-step reasoning required; Sonnet handles most analytical tasks reliably
Agentic workflows (tool use, multi-step)	Sonnet	Claude Sonnet is the model used by Claude Code; best-in-class agentic reliability
Long-document analysis (100K+ tokens)	Sonnet	All tiers share the 200K context window; Sonnet provides better long-context accuracy than Haiku
Complex maths, formal proofs	Opus + extended thinking	Requires deep reasoning steps; extended thinking budget helps significantly
Deep research synthesis	Opus	Where Opus earns its cost: synthesising complex information across many sources
Creative writing, nuanced voice	Opus	Opus produces the most stylistically aware and contextually sensitive long-form output

Cost Comparison at Scale

Model	Context	Input ($/1M)	Output ($/1M)	Speed	Best For
Haiku 4.5	200K	$1.00	$5.00	Fastest	Classification, routing, extraction, customer-facing chat
Sonnet 4.6	1M	$3.00	$15.00	Fast	Coding, writing, analysis, agents — default production choice
Opus 4.6	1M	$5.00	$25.00	Slower	Hardest reasoning, research synthesis, maximum quality

Anthropic publishes per-token pricing at anthropic.com/pricing. The relative cost ratios across the current generation are roughly:

1×

Haiku 4.5

Baseline cost

~5–8×

Sonnet 4.6

vs Haiku

~15–20×

Opus 4.6

vs Haiku

For a pipeline processing 10 million tokens per day: running on Haiku vs Sonnet saves roughly $100–150/day at scale. Check current rates before budgeting — pricing has consistently fallen across model generations.

Quality on Key Benchmarks

Anthropic publishes benchmark results for each model. Key dimensions where tier selection matters most:

SWE-bench (coding): Sonnet is Anthropic's benchmark leader here; even Haiku performs reasonably on simple coding tasks. Opus adds headroom on the hardest bugs.
MMLU (knowledge): All current models perform well. Sonnet and Opus are close on straightforward knowledge retrieval; gaps emerge on multi-step reasoning chains.
GSM8K / MATH (maths): Extended thinking on Opus closes the gap to frontier reasoning models; standard Sonnet is solid for most applied maths.
Long-context recall: Claude consistently outperforms same-tier competitors on "needle in a haystack" retrieval from large contexts — this is a cross-tier strength.

A Practical Approach

When building a new feature or pipeline, use this sequence:

Prototype with Sonnet. It's fast enough to iterate quickly and capable enough to validate your approach without hitting Haiku's quality ceiling.
If quality is fine, try Haiku. Run a batch of representative prompts through Haiku and compare outputs. If quality holds, switch and save the cost.
Only escalate to Opus if Sonnet consistently fails on your hardest test cases and the task is genuinely worth the 5–20× cost premium.
Consider task decomposition before escalating. Often a complex task that "needs Opus" can be broken into subtasks where Sonnet handles most of the work, Opus handles only the hard step, and Haiku handles the simple steps — at an overall cost closer to Sonnet.

Checklist: Do You Understand This?

The three axes for model selection are speed, cost, and quality — they trade off against each other
Haiku for classification, routing, extraction, and high-volume simple tasks
Sonnet for coding, writing, analysis, agentic workflows — the default for most production workloads
Opus for the hardest reasoning tasks and research synthesis where quality matters most
Opus costs roughly 15–20× more than Haiku; Sonnet is 5–8× Haiku
Prototype with Sonnet, validate with Haiku, escalate to Opus only when needed