Model Routing & Fallback
OpenRouter's routing layer sits between your request and the provider. It chooses which provider to use, handles failures automatically, and can optimize for price, speed, or availability.
Figure: How Routing Works. OpenRouter routes to the best available provider and falls back automatically on failure.
Provider Routing
Many models are served by multiple providers — for example, Llama models are available via Groq, Together AI, Fireworks, and others. When you request a model via OpenRouter, it routes to the best available provider based on:
- Availability — is the provider currently up?
- Latency — which provider is responding fastest right now?
- Price — which provider is cheapest for this model?
- Your preferences, which you can express as an explicit ordering via provider.order:
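```python
# Control provider routing via extra_body
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumes the key is set in this env var
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "provider": {
            "order": ["Groq", "Together", "Fireworks"],  # prefer Groq first
            "allow_fallbacks": True,  # fall back if Groq fails
        }
    },
)
```
Automatic Failover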
OpenRouter's most valuable operational feature is automatic failover. If your primary model or provider goes down, OpenRouter switches to a fallback without you writing any retry logic:
- If Anthropic's API is down, requests for Claude automatically route to a fallback provider
- If a specific model is at capacity, OpenRouter queues or reroutes
- Rate limit errors from one provider trigger routing to an alternative
This matters most for production applications, where a single provider outage would otherwise cause downtime. Failover happens on OpenRouter's side, so your code needs no provider-specific retry logic; it only has to handle the case where every candidate provider fails.
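Provider-level failover needs no configuration, but model-level fallback can also be requested explicitly. The following is a minimal sketch, assuming OpenRouter accepts a models array of fallback slugs alongside the primary model. The helper name build_fallback_request and the model slugs are hypothetical, chosen for illustration.

```python
from typing import Any


def build_fallback_request(prompt: str, models: list[str]) -> dict[str, Any]:
    """Return kwargs for client.chat.completions.create with model fallbacks.

    The first entry is the primary model; the rest are passed as fallbacks.
    """
    if not models:
        raise ValueError("at least one model slug is required")
    request: dict[str, Any] = {
        "model": models[0],
        "messages": [{"role": "user", "content": prompt}],
    }
    if len(models) > 1:
        # Assumption: OpenRouter reads fallback models from a "models" array.
        request["extra_body"] = {"models": models[1:]}
    return request


# Illustrative slugs only; substitute models you actually use.
request = build_fallback_request(
    "Hello!", ["anthropic/claude-3.5-sonnet", "openai/gpt-4o"]
)
```

The resulting dict can be unpacked into client.chat.completions.create(**request). If a fallback was used, the model field on the response should indicate which model actually answered, which is worth logging when costs differ between models.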
Cost Optimization Patterns
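One pattern that fits under this heading, sketched under assumptions rather than taken from the text: send dev and test traffic to a free model variant, and ask the router to prefer the cheapest provider for production traffic. This assumes OpenRouter's provider.sort option accepts "price" and that free variants carry a :free suffix. The helper name, the env values, and the specific model slugs are hypothetical.

```python
from typing import Any

# Hypothetical slugs for illustration; check the OpenRouter model list for current names.
DEV_MODEL = "meta-llama/llama-4-scout:free"
PROD_MODEL = "meta-llama/llama-4-scout"


def build_cost_aware_request(prompt: str, env: str) -> dict[str, Any]:
    """Return kwargs for client.chat.completions.create, tuned for cost."""
    messages = [{"role": "user", "content": prompt}]
    if env == "prod":
        # Paid model, but ask the router to try the cheapest provider first.
        return {
            "model": PROD_MODEL,
            "messages": messages,
            "extra_body": {"provider": {"sort": "price"}},
        }
    # Dev and test traffic goes to a free variant, so no provider sort is needed.
    return {"model": DEV_MODEL, "messages": messages}
```

Keeping the choice in one function means the rest of the application never hard-codes a model slug, so switching the production model or the free variant is a one-line change.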
Best Use Cases for OpenRouter
- Experimenting with many models before committing to one
- Production apps needing high availability across providers
- Teams using 3+ different models who want unified billing
- Builders who want free model access for dev/test without separate accounts
- Startups who want to defer the "which provider" decision
OpenRouter is a weaker fit for:
- Very high volume, where the 5.5% markup adds up to a significant cost
- Use cases requiring provider-specific features not exposed through the OpenAI-compatible API
- Applications requiring strict data residency with specific providers
- Teams that already have a dedicated deal (volume pricing) with one provider
LiteLLM — Similar Tool Worth Knowing
LiteLLM is an open-source alternative to OpenRouter that you self-host. It provides the same kind of unified API abstraction but runs on your own infrastructure, so there is no 5.5% markup and no third-party intermediary between you and the model providers. The tradeoff: you manage the proxy server and your own provider accounts, and there's no free model tier. Use LiteLLM when data sovereignty or cost at scale matters more than convenience. Use OpenRouter when you want a managed service with zero ops.
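Because both tools expose an OpenAI-compatible endpoint, switching between them can come down to changing connection settings. The sketch below assumes a LiteLLM proxy running locally on port 4000 and hypothetical environment variable names; the Gateway class is illustrative and not part of either tool.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Gateway:
    """Connection settings for an OpenAI-compatible gateway."""

    base_url: str
    api_key_env: str  # name of the environment variable holding the key

    def client_kwargs(self) -> dict[str, str]:
        # These kwargs can be passed to openai.OpenAI(**kwargs).
        return {
            "base_url": self.base_url,
            "api_key": os.environ.get(self.api_key_env, ""),
        }


OPENROUTER = Gateway("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY")
# Assumed: a LiteLLM proxy running locally on port 4000; adjust to your deployment.
LITELLM = Gateway("http://localhost:4000", "LITELLM_API_KEY")
```

Model names are not interchangeable between the two: OpenRouter uses its own slugs, while a LiteLLM proxy serves whatever model names you define in its configuration.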
For model pricing context when deciding which models to route to, see the Model Comparison table and Model Selection Cheat Sheet.
Checklist: Do You Understand This?
- Can you explain what automatic failover means in the context of OpenRouter?
- Do you know how to specify provider preferences in an OpenRouter API call?
- Can you describe a cost optimization strategy using OpenRouter's free models?
- Do you understand the difference between OpenRouter (managed) and LiteLLM (self-hosted)?