Intermediate

Mistral Models

Mistral has built a family of models spanning general reasoning, coding, vision, and embedding โ€” from ultra-cheap (Nemo) to frontier quality (Large 3, Medium 3.5). This page maps each model to its best use case.

Model Families

General Purpose

Mistral Large 3
API only
mistral-large-latest ยท 128K context ยท $2/$6 per 1M
Mistral's flagship general model. Best reasoning, instruction following, and multilingual performance. Use for complex analysis, long documents, and agentic tasks where cost is secondary.
Mistral Medium 3.5
API only
mistral-medium-latest ยท 128K context ยท $1.50/$7.50 per 1M
Frontier-class model positioned between Large and Small. Higher output cost than Large โ€” designed for tasks where generation quality matters more than input volume. Strong agentic capabilities.
Mistral Small 4
Apache 2.0 open-weight
mistral-small-latest ยท 128K context ยท $0.15/$0.60 per 1M
The cost-efficient workhorse. 90% cheaper than Large 3 for tasks that don't need frontier quality. Summarization, classification, drafting, Q&A. Available for download and local use.
Mistral Nemo 12B
Apache 2.0 open-weight
open-mistral-nemo ยท 128K context ยท $0.02/~$0.06 per 1M
Ultra-cheap. Joint release with NVIDIA. Use for high-volume, simple tasks where cost is the primary constraint โ€” e.g., bulk classification or simple reformatting.

Coding Specialists

Codestral
Apache 2.0 open-weight
codestral-latest ยท 256K context ยท $0.30/$0.90 per 1M
Mistral's dedicated code model. 256K context โ€” the largest context of any Mistral model, designed for large codebases and long code generation. Fill-in-the-middle (FIM) support for code completion. 80+ programming languages.
Devstral Small
API only
devstral-small ยท 128K context ยท $0.07/$0.28 per 1M
Agentic coding specialist. Built for multi-file edits, tool use, and coding agent tasks. Very low cost โ€” the cheapest model in the market optimized specifically for agentic coding workflows. Outperforms Codestral Small on SWE-bench.

Multimodal & Embedding

Pixtral Large
API only
pixtral-large-latest ยท 128K context ยท $2.00/$6.00 per 1M
Mistral's vision model. Accepts images + text. Strong at document understanding, chart reading, screenshot analysis. Same price as Large 3 โ€” essentially Large 3 with vision added.
Mistral Embed
API only
mistral-embed ยท 8K context ยท $0.10/โ€” per 1M
Text embedding model for RAG pipelines. 1024-dimensional embeddings. Lower context than specialized embedding models (e.g. nomic-embed-text at 8K) โ€” use for short-to-medium documents.

Quick Selection Guide

TaskRecommendedWhy
Complex reasoning / analysisMistral Large 3Best quality across all tasks
Agentic workflowsMistral Medium 3.5 or Large 3Strongest tool use + planning
Code generationCodestral256K context, FIM support, 80+ languages
Agentic coding (cheap)Devstral SmallCheapest capable agentic coder โ€” $0.07/$0.28
High-volume, simple tasksMistral Small 4 or Nemo90% cheaper than Large 3
Image understandingPixtral LargeOnly Mistral model with vision
RAG embeddingsMistral EmbedNative Mistral embeddings โ€” or use nomic-embed-text locally
Local self-hostedMistral Small 4 / Codestral / NemoAll three are Apache 2.0 open weights

Checklist: Do You Understand This?

  • Can you name Mistral's flagship model and its price per million tokens?
  • Do you know which Mistral model to use for: bulk classification, agentic coding, image analysis?
  • Can you name three Mistral models that are available as open-weight downloads?
  • Do you understand why Codestral has 256K context specifically?

Page built: 01 Jun 2026