Intermediate

Model Routing & Fallback

OpenRouter's routing layer sits between your request and the provider. It chooses which provider to use, handles failures automatically, and can optimize for price, speed, or availability.

How Routing Works

Request (model: 'anthropic/claude-sonnet-4-6') → OpenRouter selects a provider → Provider A (Anthropic direct); if A fails, fallback to Provider B → Response

OpenRouter routes to the best available provider — and falls back automatically on failure
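The flow above can be approximated locally with a small simulation. This is only a sketch of the idea, not OpenRouter's actual implementation: `route_with_fallback`, `ProviderError`, `provider_a`, and `provider_b` are hypothetical names, and the "providers" are plain Python functions standing in for real backends.

```python
from __future__ import annotations

from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a simulated provider that cannot serve the request."""


def route_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, response).

    Raises ProviderError if every provider in the list fails.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this provider failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def provider_a(prompt: str) -> str:
    # Simulated outage at the first-choice provider.
    raise ProviderError("503 service unavailable")


def provider_b(prompt: str) -> str:
    # Simulated healthy provider that simply echoes the prompt.
    return f"echo: {prompt}"


served_by, reply = route_with_fallback(
    "Hello!", [("Provider A", provider_a), ("Provider B", provider_b)]
)
print(served_by, reply)
```

The caller sees a single successful response and only learns about the failed attempt if it inspects which provider answered, which is the sense in which the fallback is automatic.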

Provider Routing

Many models are served by multiple providers — for example, Llama models are available via Groq, Together AI, Fireworks, and others. When you request a model via OpenRouter, it routes to the best available provider based on:

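  • Availability: is the provider currently up?
  • Latency: which provider is responding fastest right now?
  • Price: which provider is cheapest for this model?
  • Your preferences: you can set an explicit provider order with provider.order

import os
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint, so the OpenAI SDK works as the client.
# The API key is assumed to be in the OPENROUTER_API_KEY environment variable.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Control provider routing via extra_body
response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "provider": {
            "order": ["Groq", "Together", "Fireworks"],  # prefer Groq first
            "allow_fallbacks": True                       # fall back if Groq fails
        }
    }
)
print(response.choices[0].message.content)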

Automatic Failover

Automatic failover is OpenRouter's most valuable operational feature. If the provider serving your request goes down, OpenRouter switches to another provider without you writing any retry logic:

  • If Anthropic's API is down, requests for Claude automatically route to a fallback provider
  • If a specific model is at capacity, OpenRouter queues or reroutes
  • Rate limit errors from one provider trigger routing to an alternative

This is particularly valuable for production applications, where a single provider outage would otherwise cause downtime. Provider failover is transparent to your code. Falling back to a different model is a separate step: you list the alternative models in the request. A request can still fail when every eligible provider is unavailable, so keep basic error handling around the call.
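Falling back to a different model (rather than a different provider of the same model) can be requested explicitly. The sketch below assumes OpenRouter's `models` request field, passed through `extra_body`, which lists models to try after the primary one. `build_fallback_kwargs` and `ask_with_fallback` are hypothetical helper names, and `client` is assumed to be an OpenAI SDK client pointed at OpenRouter.

```python
from __future__ import annotations


def build_fallback_kwargs(primary: str, fallbacks: list[str], prompt: str) -> dict:
    """Build keyword arguments for chat.completions.create with model fallbacks.

    Assumption: OpenRouter reads a `models` list from the request body and
    tries those models, in order, if the primary model cannot be served.
    """
    return {
        "model": primary,
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": {"models": list(fallbacks)},
    }


def ask_with_fallback(client, primary: str, fallbacks: list[str], prompt: str):
    """Send the request and report which model actually answered."""
    response = client.chat.completions.create(
        **build_fallback_kwargs(primary, fallbacks, prompt)
    )
    # response.model names the model that produced the reply, which may be
    # one of the fallbacks rather than the primary.
    return response.model, response.choices[0].message.content


kwargs = build_fallback_kwargs(
    "anthropic/claude-sonnet-4-6",
    ["meta-llama/llama-4-scout"],
    "Hello!",
)
print(kwargs["extra_body"])
```

Logging the model name returned by `ask_with_fallback` is a cheap way to notice when traffic has quietly shifted to a fallback.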

Cost Optimization Patterns

Free tier for dev/test
Route development and testing requests to :free model variants so experimentation costs nothing, then switch to paid models in production.

Cheapest provider routing
Set sort: 'price' in provider preferences to always route to the cheapest provider serving a model. This is useful for batch workloads.

Task-based model switching
Use a router in your own code to send simple tasks to a cheap model (Mistral Nemo, Gemma 3) and complex tasks to a frontier model (GPT-4o, Claude Sonnet). OpenRouter bills all of them through one account.

Batch via free models
For non-real-time jobs (classification, tagging, summarization at scale), batch through free models during low-traffic periods. The free tier allows 50 requests per day; check whether that cap applies per account or per model before planning throughput around it.
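The patterns above can be combined in a small request builder. This is a minimal sketch under stated assumptions: `is_complex` is a deliberately naive placeholder heuristic, the model IDs reuse the ones from the earlier examples, and the `:free` variant named in `DEV_MODEL` is hypothetical, so confirm it exists before relying on it.

```python
CHEAP_MODEL = "meta-llama/llama-4-scout"          # stand-in for a low-cost model
FRONTIER_MODEL = "anthropic/claude-sonnet-4-6"    # stand-in for a frontier model
DEV_MODEL = "meta-llama/llama-4-scout:free"       # hypothetical :free variant

COMPLEX_HINTS = ("prove", "refactor", "analyze")


def is_complex(prompt: str) -> bool:
    """Naive placeholder: long prompts or certain keywords count as complex."""
    text = prompt.lower()
    return len(text.split()) > 50 or any(hint in text for hint in COMPLEX_HINTS)


def build_request(prompt: str, env: str = "production") -> dict:
    """Pick a model by environment and task, and prefer the cheapest provider."""
    if env != "production":
        model = DEV_MODEL  # free tier for dev/test
    else:
        model = FRONTIER_MODEL if is_complex(prompt) else CHEAP_MODEL
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # sort: 'price' asks OpenRouter for the cheapest provider of this model.
        "extra_body": {"provider": {"sort": "price"}},
    }


simple = build_request("Tag this review as positive or negative")
hard = build_request("Refactor this module to remove the global state")
dev = build_request("Refactor this module to remove the global state", env="dev")
print(simple["model"], hard["model"], dev["model"])
```

The resulting dictionary can be unpacked into `client.chat.completions.create(**request)`. In practice the complexity check is the part worth investing in; a keyword list is only enough to show where the decision sits.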

Best Use Cases for OpenRouter

Great fit
  • Experimenting with many models before committing to one
  • Production apps needing high availability across providers
  • Teams using 3+ different models who want unified billing
  • Builders who want free model access for dev/test without separate accounts
  • Startups who want to defer the "which provider" decision
Less ideal
  • Very high volume, where the 5.5% markup becomes a significant cost
  • Use cases requiring provider-specific features not exposed through the OpenAI-compatible API
  • Applications requiring strict data residency with specific providers
  • Teams who already have dedicated deals with one provider (volume pricing)

LiteLLM — Similar Tool Worth Knowing

LiteLLM is an open-source alternative to OpenRouter that you self-host. It provides a similar unified API abstraction but runs on your own infrastructure: no 5.5% markup, and no third-party router between you and the providers (requests still go to whichever model providers you call). The tradeoff is that you manage the proxy server, and there's no free model tier. Use LiteLLM when data sovereignty or cost at scale matters more than convenience. Use OpenRouter when you want a managed service with zero ops.
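To make "cost at scale" concrete, a rough break-even calculation helps. The sketch below takes the 5.5% figure from this page; the monthly self-hosting cost is a made-up input you would replace with your own estimate of server and maintenance time, and both function names are hypothetical.

```python
def openrouter_fee(monthly_spend_usd: float, fee_rate: float = 0.055) -> float:
    """Markup paid on top of model spend at the given fee rate."""
    return monthly_spend_usd * fee_rate


def break_even_spend(self_host_cost_usd: float, fee_rate: float = 0.055) -> float:
    """Monthly model spend at which the markup equals the self-hosting cost."""
    return self_host_cost_usd / fee_rate


# Hypothetical example: a self-hosted proxy costing $500/month to run.
threshold = break_even_spend(500.0)
print(f"Markup on $10,000/month: ${openrouter_fee(10_000):,.2f}")
print(f"Break-even model spend: ${threshold:,.0f}/month")
```

Below the break-even spend the managed service is likely the cheaper option once operations time is counted; above it, self-hosting starts to pay for itself on cost alone.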

For model pricing context when deciding which models to route to, see the Model Comparison table and Model Selection Cheat Sheet.

Checklist: Do You Understand This?

  • Can you explain what automatic failover means in the context of OpenRouter?
  • Do you know how to specify provider preferences in an OpenRouter API call?
  • Can you describe a cost optimization strategy using OpenRouter's free models?
  • Do you understand the difference between OpenRouter (managed) and LiteLLM (self-hosted)?

Page built: 01 Jun 2026