🧠 All Things AI
Advanced

Storage & Vector Stores

OpenAI's storage layer consists of two systems: the Files API for uploading and referencing documents, and Vector Stores for hosted RAG (retrieval-augmented generation). Together they let you give your AI application a persistent, searchable knowledge base without managing your own vector database, embedding pipeline, or chunking logic.

Files API

The Files API handles document upload and storage for use across OpenAI's platform. Uploaded files are referenced by a file_id that persists until you delete the file — you do not need to re-upload the same document for every API call.

Uploading a File

from openai import OpenAI

client = OpenAI()

with open("annual_report.pdf", "rb") as f:
    file = client.files.create(
        file=f,
        purpose="assistants"  # or "batch", "fine-tune"
    )

print(file.id)  # file-abc123...
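Because a file_id persists until you delete it, a common pattern is to cache IDs locally and upload each document only once. A minimal sketch, assuming a local JSON cache (the file_ids.json path and the helper names are illustrative, not part of the SDK):

```python
import json
from pathlib import Path

CACHE = Path("file_ids.json")  # illustrative cache location

def load_cache() -> dict:
    """Map of local paths to previously uploaded file IDs."""
    return json.loads(CACHE.read_text()) if CACHE.exists() else {}

def upload_once(client, path: str) -> str:
    """Upload `path` only if it has not been uploaded before."""
    cache = load_cache()
    if path not in cache:
        with open(path, "rb") as f:
            cache[path] = client.files.create(file=f, purpose="assistants").id
        CACHE.write_text(json.dumps(cache))
    return cache[path]
```

After the first call, upload_once(client, "annual_report.pdf") returns the cached ID without touching the API.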

File Specifications

  • Maximum size per file: 512 MB
  • Maximum tokens per file (for knowledge retrieval): 2,000,000
  • Supported types: PDF, Word (.docx), plain text (.txt), code files, spreadsheets, JSONL (for batch/fine-tuning), and images
  • Purpose field: Set to assistants for Responses API use, batch for Batch API, or fine-tune for fine-tuning jobs

File IDs can be referenced in Responses API requests for direct document analysis, in vector stores for chunked retrieval, in Batch API jobs, and in fine-tuning runs.

Vector Stores (Hosted RAG)

A Vector Store is a hosted, managed retrieval index. You create a store, add files to it, and OpenAI handles the entire RAG pipeline automatically: chunking the documents, generating embeddings, and building the retrieval index. Your application then queries the store through the file_search built-in tool in the Responses API.

Creating and Populating a Vector Store

# Create a vector store
store = client.vector_stores.create(name="product-documentation")

# Add a single file
client.vector_stores.files.create(
    vector_store_id=store.id,
    file_id="file-abc123"
)

# Add multiple files in batch
client.vector_stores.file_batches.create(
    vector_store_id=store.id,
    file_ids=["file-abc123", "file-def456", "file-ghi789"]
)
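File batches are ingested asynchronously: the batch reports in_progress while chunking and embedding run, so production code should wait for a terminal status before querying the store. A polling sketch (the retrieve signature shown is an assumption about the Python SDK, which also offers a create_and_poll convenience for the same purpose):

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

def is_done(status: str) -> bool:
    # A batch reports `in_progress` while chunking and embedding
    # run; any other status is terminal.
    return status in TERMINAL_STATUSES

def wait_for_batch(client, vector_store_id: str, batch_id: str,
                   interval: float = 2.0):
    """Poll a file batch until ingestion finishes, then return it."""
    while True:
        batch = client.vector_stores.file_batches.retrieve(
            batch_id, vector_store_id=vector_store_id
        )
        if is_done(batch.status):
            return batch
        time.sleep(interval)
```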

Using a Vector Store in Responses API

response = client.responses.create(
    model="gpt-5",
    input="What is the refund policy for premium subscriptions?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": [store.id]
    }]
)

The model will automatically search the vector store when it needs information, retrieve relevant chunks, and incorporate them into its response — full RAG without any retrieval code on your side.
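Retrieved chunks are cited back in the response as annotations, which lets you surface sources to the user. A sketch that collects cited file IDs from a response serialized with response.model_dump(), assuming annotations of type file_citation carry a file_id field:

```python
def cited_files(output: list) -> set:
    """Collect the file IDs cited in a Responses API output
    (a list of dicts, e.g. from response.model_dump()["output"])."""
    ids = set()
    for item in output:
        for part in item.get("content", []):
            for ann in part.get("annotations", []):
                if ann.get("type") == "file_citation":
                    ids.add(ann.get("file_id"))
    return ids
```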

Pricing

  • Vector store storage: $0.10/GB/day (the first 1 GB per organisation is free)
  • File search calls: $2.50 per 1,000 calls (each invocation of the file_search tool counts as one call)
  • File storage (Files API): included (files themselves incur no per-byte charge; only vector store storage is billed)
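The two billed line items combine straightforwardly. A back-of-the-envelope estimator using the rates above (the 1 GB free allowance is applied to storage only):

```python
GB_DAY_RATE = 0.10            # vector store storage, per GB per day
FREE_GB = 1.0                 # free storage allowance per organisation
SEARCH_RATE_PER_1000 = 2.50   # file_search tool calls

def monthly_cost(store_gb: float, searches: int, days: int = 30) -> float:
    """Estimate a month of vector store storage plus search charges."""
    storage = max(store_gb - FREE_GB, 0) * GB_DAY_RATE * days
    search = searches / 1000 * SEARCH_RATE_PER_1000
    return round(storage + search, 2)

# 10 GB store, 20,000 searches: 9 billable GB x $0.10 x 30 days + $50
print(monthly_cost(10, 20_000))  # 77.0
```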

Use Cases

Strong Use Cases

  • Customer support bot over product documentation
  • Internal knowledge assistant over company policy docs
  • Legal research tool over case law or contract archives
  • Developer docs assistant for a software product
  • Research companion over uploaded academic papers

When to Use a Dedicated Vector DB Instead

  • You need metadata filtering beyond text similarity
  • You need hybrid search (vector + keyword BM25)
  • You need to embed data across multiple AI providers
  • Volume and cost make self-hosted Qdrant/Weaviate more efficient

Checklist

  • What is the maximum file size the Files API accepts, and what is the token limit for knowledge retrieval?
  • What does OpenAI handle automatically when you add files to a Vector Store?
  • How do you connect a Vector Store to the Responses API during inference?
  • What does a file search call cost, and how is vector store storage priced?
  • Name two scenarios where you would prefer a dedicated vector database over OpenAI's hosted Vector Stores.