🧠 All Things AI
Intermediate

OpenAI Platform Overview

The OpenAI Platform at platform.openai.com is the developer-facing layer of OpenAI — an HTTP API that provides programmatic access to models, embeddings, speech, image generation, file storage, vector search, and batch processing. It is entirely separate from ChatGPT: the platform is for building applications, not for end-user chat.

Key API Endpoints

All endpoints are prefixed with https://api.openai.com:

| Endpoint | Purpose | Notes |
| --- | --- | --- |
| /v1/responses | Primary completions (Responses API) | Recommended for new work; replaces Assistants API |
| /v1/chat/completions | Legacy Chat Completions | Stateless; still fully supported |
| /v1/embeddings | Text embeddings | Use text-embedding-3-small or text-embedding-3-large |
| /v1/audio/transcriptions | Speech-to-text | Whisper and gpt-4o-transcribe models |
| /v1/audio/speech | Text-to-speech | tts-1, tts-1-hd, gpt-4o-mini-tts |
| /v1/images/generations | Image generation | GPT-4o native image; DALL-E 3 deprecated May 2026 |
| /v1/files | File upload and storage | Used by Responses API, fine-tuning, Batch API |
| /v1/vector_stores | Vector store management | Hosted RAG; pair with file_search tool |
| /v1/batch | Async bulk inference | 50% off; results within 24 hours |
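To make the shape of a call concrete, here is a minimal sketch of constructing (but not sending) a POST request to the Responses API using only the standard library. The helper name and the payload fields are illustrative; consult the API reference for the full request schema.

```python
import json
import urllib.request


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to an OpenAI endpoint."""
    return urllib.request.Request(
        url=f"https://api.openai.com{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example: a Responses API request (payload fields shown for illustration).
req = build_request("/v1/responses",
                    {"model": "gpt-5", "input": "Hello"},
                    "sk-proj-example")
```

Sending it is then a single `urllib.request.urlopen(req)` call, though in practice most applications use the official `openai` Python SDK instead of raw HTTP.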

Authentication

All API calls require authentication via a secret API key, passed as an HTTP header:

Authorization: Bearer sk-proj-...

API keys are created at platform.openai.com/api-keys. Keys can be scoped to an organisation or further restricted to a specific Project within that organisation. Project-scoped keys are recommended for production — they limit access to only the resources a given application needs.

Store your API key in an environment variable (OPENAI_API_KEY) and never hardcode it in source files or commit it to version control.

Rate Limits

OpenAI enforces rate limits across five dimensions simultaneously. Hitting any one of them triggers a 429 Too Many Requests error:

Rate Limit Dimensions

  • RPM — Requests Per Minute
  • RPD — Requests Per Day
  • TPM — Tokens Per Minute
  • TPD — Tokens Per Day
  • IPM — Images Per Minute (image endpoints)

Handling 429 Errors

  • Implement exponential backoff with jitter
  • Check and respect the Retry-After header
  • Use the Batch API for non-time-sensitive workloads
  • Monitor TPM usage — often hit before RPM

Rate limits are applied per organisation and per endpoint. They are not shared across different model families — GPT-5 limits are separate from o3 limits, which are separate from embedding limits.

Usage Tiers

OpenAI uses an automatic tier upgrade system. As your payment history and usage grow, your organisation is automatically promoted to higher tiers with higher rate limits:

| Tier | Eligibility | Typical RPM (GPT-5) |
| --- | --- | --- |
| Free | New account, no payment added | 3 |
| Tier 1 | $5+ paid | 500 |
| Tier 2 | $50+ paid, 7+ days since first payment | 5,000 |
| Tier 3 | $100+ paid, 7+ days | 5,000 |
| Tier 4 | $250+ paid, 14+ days | 10,000 |
| Tier 5 | $1,000+ paid, 30+ days | 30,000 |

Batch API quota is tracked separately from real-time quotas and does not count against your synchronous rate limits.
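A batch job takes a .jsonl input file in which each line is one self-describing request. A sketch of producing such a line, assuming the documented fields (custom_id, method, url, body); the helper name is my own:

```python
import json


def batch_line(custom_id: str, model: str, prompt: str) -> str:
    """Serialise one Batch API input line targeting /v1/chat/completions."""
    return json.dumps({
        "custom_id": custom_id,          # your own ID, echoed back in the results
        "method": "POST",
        "url": "/v1/chat/completions",   # endpoint each request in the batch hits
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })
```

Writing one such line per request, uploading the file via /v1/files, and then creating the job via /v1/batch is the whole submission flow; results arrive as a matching output file keyed by custom_id.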

Checklist

  • What HTTP header carries the API key in OpenAI API requests?
  • What is the difference between the Responses API endpoint and the Chat Completions endpoint?
  • Name the five rate limit dimensions OpenAI enforces.
  • How does tier promotion work — is it manual or automatic?
  • Why should you use project-scoped API keys in production?