Intermediate
Mistral Models
Mistral has built a family of models spanning general reasoning, coding, vision, and embedding, from ultra-cheap (Nemo) to frontier quality (Large 3, Medium 3.5). This page maps each model to its best use case.
Model Families
General Purpose
Mistral Large 3
API only · mistral-large-latest · 128K context · $2/$6 per 1M tokens (input/output)
Mistral's flagship general model. Best reasoning, instruction following, and multilingual performance. Use for complex analysis, long documents, and agentic tasks where cost is secondary.
Mistral Medium 3.5
API only · mistral-medium-latest · 128K context · $1.50/$7.50 per 1M tokens (input/output)
Frontier-class model positioned between Large and Small. Its output price is higher than Large's, so it is designed for tasks where generation quality matters more than input volume. Strong agentic capabilities.
Mistral Small 4
Apache 2.0 open-weight · mistral-small-latest · 128K context · $0.15/$0.60 per 1M tokens (input/output)
The cost-efficient workhorse. Roughly 90% cheaper than Large 3 for tasks that don't need frontier quality: summarization, classification, drafting, Q&A. Available for download and local use.
Mistral Nemo 12B
Apache 2.0 open-weight · open-mistral-nemo · 128K context · $0.02/~$0.06 per 1M tokens (input/output)
Ultra-cheap. Joint release with NVIDIA. Use for high-volume, simple tasks where cost is the primary constraint, e.g., bulk classification or simple reformatting.
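The price gaps between the general-purpose models are easier to judge with a quick calculation. The sketch below copies the per-1M-token prices from the listings above and assumes the first figure is the input price and the second the output price; the function names `request_cost` and `savings_vs` are illustrative, not part of any Mistral SDK.

```python
# USD per 1M tokens as (input, output), copied from the listings above.
PRICES = {
    "mistral-large-latest": (2.00, 6.00),
    "mistral-medium-latest": (1.50, 7.50),
    "mistral-small-latest": (0.15, 0.60),
    "open-mistral-nemo": (0.02, 0.06),  # output price is listed as approximate
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def savings_vs(model: str, baseline: str, input_tokens: int, output_tokens: int) -> float:
    """Return the fraction saved by using `model` instead of `baseline`."""
    base = request_cost(baseline, input_tokens, output_tokens)
    return 1 - request_cost(model, input_tokens, output_tokens) / base


if __name__ == "__main__":
    # Balanced workload: 1M input tokens and 1M output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 1_000_000, 1_000_000):.2f}")
    saved = savings_vs("mistral-small-latest", "mistral-large-latest", 1_000_000, 1_000_000)
    print(f"Small 4 vs Large 3: {saved:.0%} cheaper")
```

Under these prices, Medium 3.5 is cheaper than Large 3 only when a workload has more than three input tokens per output token (the $0.50 input saving must offset the $1.50 output premium), which is worth checking before choosing it for generation-heavy jobs.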
Coding Specialists
Codestral
Apache 2.0 open-weight · codestral-latest · 256K context · $0.30/$0.90 per 1M tokens (input/output)
Mistral's dedicated code model. Its 256K context is the largest of any Mistral model, designed for large codebases and long code generation. Fill-in-the-middle (FIM) support for code completion. 80+ programming languages.
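Fill-in-the-middle means the model receives the code before and after a gap and generates what belongs in between. The following is a minimal sketch of what such a request body might look like; the endpoint URL and the `prompt`/`suffix`/`max_tokens` field names are assumptions that should be checked against the current Mistral API reference, and `build_fim_request` is a hypothetical helper. The code only builds and prints the payload and does not send anything.

```python
import json

# Assumed endpoint for FIM completions; verify against the API reference.
FIM_URL = "https://api.mistral.ai/v1/fim/completions"


def build_fim_request(prefix: str, suffix: str, max_tokens: int = 64) -> dict:
    """Build a FIM request body: the model fills the gap between prefix and suffix."""
    return {
        "model": "codestral-latest",
        "prompt": prefix,   # code before the cursor (assumed field name)
        "suffix": suffix,   # code after the cursor (assumed field name)
        "max_tokens": max_tokens,
    }


payload = build_fim_request(
    prefix="def fibonacci(n: int) -> int:\n    ",
    suffix="\n\nprint(fibonacci(10))",
)

if __name__ == "__main__":
    print(f"POST {FIM_URL}")
    print(json.dumps(payload, indent=2))
```

In an editor integration, the prefix and suffix would typically be the file contents on either side of the cursor, which is where a large context window helps.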
Devstral Small
API only · devstral-small · 128K context · $0.07/$0.28 per 1M tokens (input/output)
Agentic coding specialist. Built for multi-file edits, tool use, and coding agent tasks. Very low cost: the cheapest model on the market optimized specifically for agentic coding workflows. Outperforms Codestral Small on SWE-bench.
Multimodal & Embedding
Pixtral Large
API only · pixtral-large-latest · 128K context · $2.00/$6.00 per 1M tokens (input/output)
Mistral's vision model. Accepts images + text. Strong at document understanding, chart reading, and screenshot analysis. Same price as Large 3; it is essentially Large 3 with vision added.
Mistral Embed
API only · mistral-embed · 8K context · $0.10 per 1M input tokens (no output pricing)
Text embedding model for RAG pipelines. 1024-dimensional embeddings. Its 8K context is on par with specialized embedding models (e.g. nomic-embed-text at 8K), so it suits short-to-medium documents; split longer documents into chunks before embedding.
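Because the embedding context is limited, longer documents are usually split into overlapping chunks before embedding, and retrieval then compares vectors by cosine similarity. The sketch below illustrates both steps in plain Python. It counts words as a rough stand-in for tokens, and the `max_words` and `overlap` defaults are illustrative assumptions rather than recommended values; a real pipeline would count tokens with the model's tokenizer.

```python
import math


def chunk_words(text: str, max_words: int = 512, overlap: int = 64) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (e.g. 1024-dim embeddings)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    document = " ".join(f"w{i}" for i in range(1000))
    pieces = chunk_words(document)
    print(len(pieces), "chunks")
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```

Each chunk would then be sent to the embedding model separately, and the query embedding compared against every chunk vector.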
Quick Selection Guide
| Task | Recommended | Why |
|---|---|---|
| Complex reasoning / analysis | Mistral Large 3 | Best quality across all tasks |
| Agentic workflows | Mistral Medium 3.5 or Large 3 | Strongest tool use + planning |
| Code generation | Codestral | 256K context, FIM support, 80+ languages |
| Agentic coding (cheap) | Devstral Small | Cheapest capable agentic coder at $0.07/$0.28 per 1M tokens |
| High-volume, simple tasks | Mistral Small 4 or Nemo | At least 90% cheaper than Large 3 |
| Image understanding | Pixtral Large | Only Mistral model with vision |
| RAG embeddings | Mistral Embed | Native Mistral embeddings; alternatively, use nomic-embed-text locally |
| Local self-hosted | Mistral Small 4 / Codestral / Nemo | All three are Apache 2.0 open weights |
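The selection table can be encoded as a small lookup so that application code picks a model by task rather than hard-coding IDs. This is one possible sketch: the task keys and the `pick_model` helper are hypothetical names, while the model IDs are the ones listed on this page.

```python
# Task -> recommended model IDs, in order of preference, mirroring the table above.
RECOMMENDED = {
    "complex_reasoning": ["mistral-large-latest"],
    "agentic_workflows": ["mistral-medium-latest", "mistral-large-latest"],
    "code_generation": ["codestral-latest"],
    "agentic_coding": ["devstral-small"],
    "high_volume_simple": ["mistral-small-latest", "open-mistral-nemo"],
    "image_understanding": ["pixtral-large-latest"],
    "rag_embeddings": ["mistral-embed"],
}

# Models the page lists as Apache 2.0 open-weight (usable for self-hosting).
OPEN_WEIGHT = {"mistral-small-latest", "codestral-latest", "open-mistral-nemo"}


def pick_model(task: str, self_hosted: bool = False) -> str:
    """Return the first recommended model for a task, optionally open-weight only."""
    if task not in RECOMMENDED:
        raise KeyError(f"unknown task: {task!r}")
    for model in RECOMMENDED[task]:
        if not self_hosted or model in OPEN_WEIGHT:
            return model
    raise LookupError(f"no open-weight model recommended for task {task!r}")


if __name__ == "__main__":
    print(pick_model("code_generation"))
    print(pick_model("high_volume_simple", self_hosted=True))
```

Keeping the mapping in one place makes it easy to swap a recommendation when pricing or model versions change.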
Checklist: Do You Understand This?
- Can you name Mistral's flagship model and its price per million tokens?
- Do you know which Mistral model to use for: bulk classification, agentic coding, image analysis?
- Can you name three Mistral models that are available as open-weight downloads?
- Do you understand why Codestral has 256K context specifically?