OpenAI Platform Overview
The OpenAI Platform at platform.openai.com is the developer-facing layer of OpenAI — an HTTP API that provides programmatic access to models, embeddings, speech, image generation, file storage, vector search, and batch processing. It is entirely separate from ChatGPT: the platform is for building applications, not for end-user chat.
Key API Endpoints
All endpoints are prefixed with https://api.openai.com:
| Endpoint | Purpose | Notes |
|---|---|---|
| /v1/responses | Primary completions (Responses API) | Recommended for new work; replaces the Assistants API |
| /v1/chat/completions | Legacy Chat Completions | Stateless; still fully supported |
| /v1/embeddings | Text embeddings | Use text-embedding-3-small or text-embedding-3-large |
| /v1/audio/transcriptions | Speech-to-text | Whisper and gpt-4o-transcribe models |
| /v1/audio/speech | Text-to-speech | tts-1, tts-1-hd, gpt-4o-mini-tts |
| /v1/images/generations | Image generation | GPT-4o native image; DALL-E 3 deprecated May 2026 |
| /v1/files | File upload and storage | Used by the Responses API, fine-tuning, and the Batch API |
| /v1/vector_stores | Vector store management | Hosted RAG; pair with the file_search tool |
| /v1/batches | Async bulk inference | 50% discount; results within 24 hours |
Authentication
All API calls require authentication via a secret API key, passed as an HTTP header:
Authorization: Bearer sk-proj-...

API keys are created at platform.openai.com/api-keys. Keys can be scoped to an organisation or further restricted to a specific Project within that organisation. Project-scoped keys are recommended for production because they limit access to only the resources a given application needs.
Store your API key in an environment variable (OPENAI_API_KEY) and never hardcode it in source files or commit it to version control.
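As a minimal sketch of this pattern using only the Python standard library (no official SDK): the key is read from OPENAI_API_KEY and passed in the Authorization header. The helper name build_responses_request, the model string "gpt-5", and the prompt are illustrative placeholders, not part of the API itself.

```python
import json
import os
import urllib.request

API_BASE = "https://api.openai.com"

def build_responses_request(api_key, model, user_input):
    """Construct (but do not send) an authenticated POST to /v1/responses."""
    payload = {"model": model, "input": user_input}
    return urllib.request.Request(
        url=f"{API_BASE}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # the required auth header
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY")  # never hardcode the key
    if key:  # only attempt a real call when a key is configured
        req = build_responses_request(key, "gpt-5", "Say hello.")
        with urllib.request.urlopen(req) as resp:
            print(json.loads(resp.read()))
```

Separating request construction from sending, as above, also makes the auth logic easy to unit-test without touching the network.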
Rate Limits
OpenAI enforces rate limits across five dimensions simultaneously. Hitting any one of them triggers a 429 Too Many Requests error:
Rate Limit Dimensions
- RPM — Requests Per Minute
- RPD — Requests Per Day
- TPM — Tokens Per Minute
- TPD — Tokens Per Day
- IPM — Images Per Minute (image endpoints)
Handling 429 Errors
- Implement exponential backoff with jitter
- Check and respect the Retry-After header
- Use the Batch API for non-time-sensitive workloads
- Monitor TPM usage, which is often exhausted before RPM
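The retry rules above can be sketched as follows. The helper names backoff_delay and call_with_retries are hypothetical, and the send callable stands in for a real HTTP request that reports the status code and any Retry-After value.

```python
import random
import time
from typing import Optional

def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-indexed).

    Honors a server-supplied Retry-After value when present; otherwise
    uses capped exponential backoff with full jitter.
    """
    if retry_after is not None:
        return retry_after
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retries(send, max_attempts: int = 5):
    """Retry `send()` while it returns HTTP 429.

    `send` must return a (status, retry_after, body) tuple.
    """
    for attempt in range(max_attempts):
        status, retry_after, body = send()
        if status != 429:
            return body
        time.sleep(backoff_delay(attempt, retry_after))
    raise RuntimeError("rate limited: retry attempts exhausted")
```

Full jitter (a uniform draw between zero and the exponential cap) spreads retries from many clients apart in time, which avoids the synchronized retry bursts that plain exponential backoff can produce.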
Rate limits are applied per organisation and per endpoint. They are not shared across different model families — GPT-5 limits are separate from o3 limits, which are separate from embedding limits.
Usage Tiers
OpenAI uses an automatic tier upgrade system. As your payment history and usage grow, your organisation is automatically promoted to higher tiers with higher rate limits:
| Tier | Eligibility | Typical RPM (GPT-5) |
|---|---|---|
| Free | New account, no payment added | 3 |
| Tier 1 | $5+ paid | 500 |
| Tier 2 | $50+ paid, 7+ days since first payment | 5,000 |
| Tier 3 | $100+ paid, 7+ days | 5,000 |
| Tier 4 | $250+ paid, 14+ days | 10,000 |
| Tier 5 | $1,000+ paid, 30+ days | 30,000 |
Batch API quota is tracked separately from real-time quotas and does not count against your synchronous rate limits.
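As a sketch of how a batch job is prepared: the Batch API consumes an uploaded JSONL file in which each line is one request, following the documented custom_id/method/url/body shape. The helper names and the "gpt-5-mini" model string below are illustrative placeholders.

```python
import json

def batch_line(custom_id: str, model: str, prompt: str) -> str:
    """One Chat Completions request, serialized as a single JSONL line."""
    return json.dumps({
        "custom_id": custom_id,          # echoed back in the results file
        "method": "POST",
        "url": "/v1/chat/completions",   # endpoint each request targets
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

def write_batch_file(path: str, prompts, model: str = "gpt-5-mini") -> None:
    """Write one request per line, ready to upload to /v1/files."""
    with open(path, "w", encoding="utf-8") as f:
        for i, prompt in enumerate(prompts):
            f.write(batch_line(f"req-{i}", model, prompt) + "\n")
```

After writing the file, the typical flow is to upload it via /v1/files and then create the batch job referencing the returned file id; results arrive within the 24-hour completion window noted in the endpoint table.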
Checklist
- What HTTP header carries the API key in OpenAI API requests?
- What is the difference between the Responses API endpoint and the Chat Completions endpoint?
- Name the five rate limit dimensions OpenAI enforces.
- How does tier promotion work — is it manual or automatic?
- Why should you use project-scoped API keys in production?