Intermediate

Mistral Models

Mistral has built a family of models spanning general reasoning, coding, vision, and embedding — from ultra-cheap (Nemo) to frontier quality (Large 3, Medium 3.5). This page maps each model to its best use case.

Model Families

General Purpose

Mistral Large 3

API only

mistral-large-latest · 128K context · $2.00/$6.00 per 1M tokens (input/output)

Mistral's flagship general model. Best reasoning, instruction following, and multilingual performance. Use for complex analysis, long documents, and agentic tasks where cost is secondary.

Mistral Medium 3.5

API only

mistral-medium-latest · 128K context · $1.50/$7.50 per 1M

Frontier-class model positioned between Large and Small. Higher output cost than Large — designed for tasks where generation quality matters more than input volume. Strong agentic capabilities.
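The listed prices imply a break-even point between the two models: Medium 3.5 saves $0.50 per 1M input tokens but costs $1.50 more per 1M output tokens, so it only comes out cheaper than Large 3 when a request has more than three input tokens per output token. A minimal sketch of that arithmetic, using only the prices quoted on this page (the function names are illustrative, not part of any Mistral SDK):

```python
# Prices in USD per 1M tokens, as listed on this page: (input, output).
LARGE_3 = (2.00, 6.00)
MEDIUM_3_5 = (1.50, 7.50)


def request_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one request at the given per-1M-token prices."""
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheaper_model(input_tokens, output_tokens):
    """Name the cheaper of the two models for this token mix, or 'tie'."""
    large = request_cost(LARGE_3, input_tokens, output_tokens)
    medium = request_cost(MEDIUM_3_5, input_tokens, output_tokens)
    if medium < large:
        return "Mistral Medium 3.5"
    if large < medium:
        return "Mistral Large 3"
    return "tie"


# Input-heavy request (10:1): Medium 3.5 is cheaper.
print(cheaper_model(10_000, 1_000))
# Balanced request (1:1): Large 3 is cheaper.
print(cheaper_model(1_000, 1_000))
```

Cost is only one axis here; the comparison says nothing about output quality.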

Mistral Small 4

Apache 2.0 open-weight

mistral-small-latest · 128K context · $0.15/$0.60 per 1M

The cost-efficient workhorse. At least 90% cheaper than Large 3 at the listed prices (92.5% on input, 90% on output) for tasks that don't need frontier quality: summarization, classification, drafting, Q&A. Available for download and local use.
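The savings figure can be checked directly from the two price lines. A short sketch, assuming the first number in each pair is the input price and the second the output price (the helper name is illustrative):

```python
# Prices in USD per 1M tokens, as listed on this page.
LARGE_3 = {"input": 2.00, "output": 6.00}
SMALL_4 = {"input": 0.15, "output": 0.60}


def percent_cheaper(baseline, candidate):
    """Return how much cheaper `candidate` is than `baseline`, in percent."""
    return round(100 * (1 - candidate / baseline), 1)


savings = {kind: percent_cheaper(LARGE_3[kind], SMALL_4[kind]) for kind in LARGE_3}
print(savings)  # -> {'input': 92.5, 'output': 90.0}
```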

Mistral Nemo 12B

Apache 2.0 open-weight

open-mistral-nemo · 128K context · $0.02/~$0.06 per 1M

Ultra-cheap. Joint release with NVIDIA. Use for high-volume, simple tasks where cost is the primary constraint — e.g., bulk classification or simple reformatting.

Coding Specialists

Codestral

Apache 2.0 open-weight

codestral-latest · 256K context · $0.30/$0.90 per 1M

Mistral's dedicated code model. 256K context — the largest context of any Mistral model, designed for large codebases and long code generation. Fill-in-the-middle (FIM) support for code completion. 80+ programming languages.
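Fill-in-the-middle means the model receives the code before and after a gap and generates what belongs in between. The sketch below only builds a request body and sends nothing; the endpoint path and the `prompt`/`suffix`/`max_tokens` field names are assumptions that should be checked against Mistral's API reference, and `build_fim_request` is a hypothetical helper.

```python
import json

# Assumed endpoint path; verify against Mistral's API reference before use.
FIM_ENDPOINT = "https://api.mistral.ai/v1/fim/completions"


def build_fim_request(prefix, suffix, max_tokens=64):
    """Build a JSON-serializable FIM request body (field names are assumed)."""
    return {
        "model": "codestral-latest",
        "prompt": prefix,   # code before the gap
        "suffix": suffix,   # code after the gap
        "max_tokens": max_tokens,
    }


body = build_fim_request(
    prefix="def fibonacci(n):\n    ",
    suffix="\n\nprint(fibonacci(10))",
)
print(json.dumps(body, indent=2))
```

The model's job is to produce the function body that connects the prefix to the suffix, which is what an editor's inline completion needs.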

Devstral Small

API only

devstral-small · 128K context · $0.07/$0.28 per 1M

Agentic coding specialist. Built for multi-file edits, tool use, and coding agent tasks. Very low cost: the cheapest model on the market optimized specifically for agentic coding workflows. Outperforms Codestral Small on SWE-bench.

Multimodal & Embedding

Pixtral Large

API only

pixtral-large-latest · 128K context · $2.00/$6.00 per 1M

Mistral's vision model. Accepts images + text. Strong at document understanding, chart reading, screenshot analysis. Same price as Large 3 — essentially Large 3 with vision added.

Mistral Embed

API only

mistral-embed · 8K context · $0.10/— per 1M

Text embedding model for RAG pipelines. 1024-dimensional embeddings. The 8K context is on par with other embedding models (e.g. nomic-embed-text, also 8K), so it suits short-to-medium documents; longer documents need to be split into chunks first.
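Documents longer than the 8K context have to be split before they are embedded. A minimal chunking sketch, with one loud assumption: it counts whitespace-separated words as a stand-in for tokens, since the real tokenizer is not described here, and the default of 4000 words is a rough guess meant to stay under the 8K-token limit. `chunk_words` is a hypothetical helper, not a Mistral API.

```python
def chunk_words(text, max_words=4000, overlap=200):
    """Split text into overlapping chunks of at most `max_words` words.

    Words approximate tokens here (an assumption); use the model's own
    tokenizer for exact limits.
    """
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, max_words=4, overlap=1))
```

The overlap keeps a little shared context between neighboring chunks so that a sentence cut at a boundary is still retrievable from one of them.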

Quick Selection Guide

Task	Recommended	Why
Complex reasoning / analysis	Mistral Large 3	Best quality across all tasks
Agentic workflows	Mistral Medium 3.5 or Large 3	Strongest tool use + planning
Code generation	Codestral	256K context, FIM support, 80+ languages
Agentic coding (cheap)	Devstral Small	Cheapest capable agentic coder — $0.07/$0.28
High-volume, simple tasks	Mistral Small 4 or Nemo	At least 90% cheaper than Large 3
Image understanding	Pixtral Large	Only Mistral model with vision
RAG embeddings	Mistral Embed	Native Mistral embeddings — or use nomic-embed-text locally
Local self-hosted	Mistral Small 4 / Codestral / Nemo	All three are Apache 2.0 open weights
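The guide above can also be kept as a small lookup in application code. A sketch, assuming the task labels are free to choose (the keys and the `recommend` helper are illustrative) and using the model identifiers listed on this page:

```python
# Task label -> model identifiers from this page, in order of preference.
RECOMMENDATIONS = {
    "complex reasoning": ["mistral-large-latest"],
    "agentic workflows": ["mistral-medium-latest", "mistral-large-latest"],
    "code generation": ["codestral-latest"],
    "agentic coding": ["devstral-small"],
    "high-volume simple tasks": ["mistral-small-latest", "open-mistral-nemo"],
    "image understanding": ["pixtral-large-latest"],
    "rag embeddings": ["mistral-embed"],
}


def recommend(task):
    """Return the recommended model identifiers for a known task label."""
    key = task.strip().lower()
    if key not in RECOMMENDATIONS:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}")
    return RECOMMENDATIONS[key]


print(recommend("Code generation"))
print(recommend("agentic workflows"))
```

Keeping the mapping in one place makes it easy to swap a model identifier when prices or releases change.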

Checklist: Do You Understand This?

Can you name Mistral's flagship model and its price per million tokens?
Do you know which Mistral model to use for: bulk classification, agentic coding, image analysis?
Can you name three Mistral models that are available as open-weight downloads?
Do you understand why Codestral has 256K context specifically?