🧠 All Things AI
Intermediate

OpenAI Platform Overview

The OpenAI Platform at platform.openai.com is the developer-facing layer of OpenAI — an HTTP API that provides programmatic access to models, embeddings, speech, image generation, file storage, vector search, and batch processing. It is entirely separate from ChatGPT: the platform is for building applications, not for end-user chat.

Key API Endpoints

All endpoints are prefixed with https://api.openai.com:

| Endpoint | Purpose | Notes |
| --- | --- | --- |
| /v1/responses | Primary completions (Responses API) | Recommended for new work; replaces Assistants API |
| /v1/chat/completions | Legacy Chat Completions | Stateless; still fully supported |
| /v1/embeddings | Text embeddings | Use text-embedding-3-small or text-embedding-3-large |
| /v1/audio/transcriptions | Speech-to-text | Whisper and gpt-4o-transcribe models |
| /v1/audio/speech | Text-to-speech | tts-1, tts-1-hd, gpt-4o-mini-tts |
| /v1/images/generations | Image generation | GPT-4o native image; DALL-E 3 deprecated May 2026 |
| /v1/files | File upload and storage | Used by Responses API, fine-tuning, Batch API |
| /v1/vector_stores | Vector store management | Hosted RAG; pair with file_search tool |
| /v1/batch | Async bulk inference | 50% off; results within 24 hours |
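To make the shape of a call concrete, here is a minimal sketch of constructing (but not sending) a POST request to the Responses API using only the standard library. The helper name and the payload fields are illustrative; consult the API reference for the full request schema.

```python
import json
import urllib.request


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to an OpenAI endpoint."""
    return urllib.request.Request(
        url=f"https://api.openai.com{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example: a Responses API request (payload fields shown for illustration).
req = build_request("/v1/responses",
                    {"model": "gpt-5", "input": "Hello"},
                    "sk-proj-example")
```

Sending it is then a single `urllib.request.urlopen(req)` call, though in practice most applications use the official `openai` Python SDK instead of raw HTTP.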

Authentication

All API calls require authentication via a secret API key, passed as an HTTP header:

Authorization: Bearer sk-proj-...

API keys are created at platform.openai.com/api-keys. Keys can be scoped to an organisation or further restricted to a specific Project within that organisation. Project-scoped keys are recommended for production — they limit access to only the resources a given application needs.

Store your API key in an environment variable (OPENAI_API_KEY) and never hardcode it in source files or commit it to version control.

Rate Limits

OpenAI enforces rate limits across five dimensions simultaneously. Hitting any one of them triggers a 429 Too Many Requests error:

Rate Limit Dimensions

  • RPM — Requests Per Minute
  • RPD — Requests Per Day
  • TPM — Tokens Per Minute
  • TPD — Tokens Per Day
  • IPM — Images Per Minute (image endpoints)

Handling 429 Errors

  • Implement exponential backoff with jitter
  • Check and respect the Retry-After header
  • Use the Batch API for non-time-sensitive workloads
  • Monitor TPM usage — often hit before RPM

Rate limits are applied per organisation and per endpoint. They are not shared across different model families — GPT-5 limits are separate from o3 limits, which are separate from embedding limits.

Usage Tiers

OpenAI uses an automatic tier upgrade system. As your payment history and usage grow, your organisation is automatically promoted to higher tiers with higher rate limits:

| Tier | Eligibility | Typical RPM (GPT-5) |
| --- | --- | --- |
| Free | New account, no payment added | 3 |
| Tier 1 | $5+ paid | 500 |
| Tier 2 | $50+ paid, 7+ days since first payment | 5,000 |
| Tier 3 | $100+ paid, 7+ days | 5,000 |
| Tier 4 | $250+ paid, 14+ days | 10,000 |
| Tier 5 | $1,000+ paid, 30+ days | 30,000 |

Batch API quota is tracked separately from real-time quotas and does not count against your synchronous rate limits.
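A batch job takes a .jsonl input file in which each line is one self-describing request. A sketch of producing such a line, assuming the documented fields (custom_id, method, url, body); the helper name is my own:

```python
import json


def batch_line(custom_id: str, model: str, prompt: str) -> str:
    """Serialise one Batch API input line targeting /v1/chat/completions."""
    return json.dumps({
        "custom_id": custom_id,          # your own ID, echoed back in the results
        "method": "POST",
        "url": "/v1/chat/completions",   # endpoint each request in the batch hits
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })
```

Writing one such line per request, uploading the file via /v1/files, and then creating the job via /v1/batch is the whole submission flow; results arrive as a matching output file keyed by custom_id.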

Checklist

  • What HTTP header carries the API key in OpenAI API requests?
  • What is the difference between the Responses API endpoint and the Chat Completions endpoint?
  • Name the five rate limit dimensions OpenAI enforces.
  • How does tier promotion work — is it manual or automatic?
  • Why should you use project-scoped API keys in production?