🧠 All Things AI
Advanced

Responses API

The Responses API, launched in 2025, is OpenAI's recommended API surface for building agentic and multi-turn applications. It replaces the Assistants API with a dramatically simpler object model while adding first-class support for built-in tools, structured outputs, and MCP server connections. The Assistants API shuts down on August 26, 2026, so migration is mandatory for any existing Assistants-based application.

Object Model

The Assistants API required managing five distinct object types: Assistants, Threads, Messages, Runs, and Run Steps. Tracking state across these objects added significant complexity to application code. The Responses API replaces this with a flat, intuitive model:

Assistants API (deprecated)

  • Assistant object (configuration)
  • Thread object (conversation container)
  • Message objects (individual turns)
  • Run object (execution instance)
  • Run Step objects (tool call tracking)
  • Complex polling loop required

Responses API (current)

  • Single Response object
  • Items array (messages + tool calls + tool outputs)
  • Simple create call with optional previous_response_id
  • Streaming supported natively
  • No polling required
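The flat model means a single create call returns everything: assistant messages and tool activity sit side by side in one items array. A sketch of consuming such an array (item shapes are simplified assumptions; the real SDK returns typed objects, not plain dicts):

```python
# Simplified sketch of a Response's flat items array (assumed shapes;
# the real SDK returns typed objects rather than plain dicts).
items = [
    {"type": "web_search_call", "status": "completed"},
    {"type": "message", "role": "assistant", "content": "Here is a summary."},
]

# No Run/Run Step objects and no polling: when create() returns,
# the items are final. Filter by type to get what you need.
messages = [item for item in items if item["type"] == "message"]
tool_calls = [item for item in items if item["type"].endswith("_call")]
```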

Basic Usage

Creating a response in Python with the Responses API:

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="Summarise the key trends in large language models in 2025."
)

print(response.output_text)

For multi-turn conversations, pass the previous response ID to maintain context:

follow_up = client.responses.create(
    model="gpt-5",
    input="What are the implications for enterprise adoption?",
    previous_response_id=response.id
)
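Streaming is native rather than bolted on: a streaming create call yields typed events, including incremental text deltas. A minimal accumulator, assuming the `response.output_text.delta` event type from current docs (the fake event list stands in for a live stream):

```python
def collect_text(events):
    """Concatenate incremental text deltas from a Responses stream.

    `events` stands in for the iterable a streaming create call returns;
    the event type name is an assumption based on current docs.
    """
    chunks = []
    for event in events:
        if event["type"] == "response.output_text.delta":
            chunks.append(event["delta"])
    return "".join(chunks)

# Fake event sequence standing in for a live stream.
fake_events = [
    {"type": "response.output_text.delta", "delta": "Hello, "},
    {"type": "response.output_text.delta", "delta": "world."},
    {"type": "response.completed"},
]
print(collect_text(fake_events))  # Hello, world.
```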

Built-in Tools

The Responses API ships with four built-in tools that can be enabled without any external integration:

  • web_search: Live web search integrated directly into the model's reasoning
  • file_search: Retrieval from OpenAI-hosted vector stores (see Storage page)
  • code_interpreter: Python execution in a sandbox for calculations and data manipulation
  • computer_use: Computer vision and interaction for GUI automation tasks

Enable a tool by including it in the tools array of the request. The model decides when to invoke each tool based on the query and the tools available; you do not need to write routing logic.
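A sketch of a request payload enabling two of these tools (the tool type strings and the code_interpreter container setting are assumptions based on current docs; verify them against the API reference):

```python
# Request payload enabling two built-in tools (tool type strings and the
# container setting are assumptions; check the API reference).
request = {
    "model": "gpt-5",
    "input": "Plot the trend in weekly sign-ups and summarise it.",
    "tools": [
        {"type": "web_search"},
        {"type": "code_interpreter", "container": {"type": "auto"}},
    ],
}

# This would be passed as client.responses.create(**request);
# the model chooses which tools, if any, to invoke.
tool_types = [tool["type"] for tool in request["tools"]]
```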

MCP Server Support

From February 2026, the Responses API supports connecting to MCP (Model Context Protocol) servers. Your application can point the model at any MCP-compatible server (databases, internal APIs, file systems, SaaS tools), and the model can call the tools those servers expose during inference. MCP dramatically expands what the model can reach without custom function calling code for every integration.
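An MCP server appears as one more entry in the tools array. A sketch of such an entry (the server label and URL are hypothetical placeholders, and the field names are assumptions based on the published MCP tool shape):

```python
# Hypothetical MCP server connection: server_label and server_url are
# placeholders, and field names follow the published MCP tool shape.
mcp_tool = {
    "type": "mcp",
    "server_label": "internal_crm",
    "server_url": "https://mcp.example.com/sse",
    "require_approval": "never",  # or "always" to gate every tool call
}

request = {
    "model": "gpt-5",
    "input": "Look up the account status for Acme Corp.",
    "tools": [mcp_tool],
}
```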

Pricing Notes

Model pricing applies at standard rates. Two additional charges apply when using hosted tools:

  • File search calls: $2.50 per 1,000 calls
  • Vector store storage: $0.10/GB/day (first 1 GB/day free)

Web search and code interpreter do not carry per-call charges beyond the model token costs.
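A worked example of the hosted-tool charges above, for a hypothetical app making 10,000 file search calls in a month while holding a 5 GB vector store for 30 days:

```python
# Hosted-tool charges at the rates above (hypothetical usage figures).
file_search_calls = 10_000
call_cost = file_search_calls / 1_000 * 2.50   # $2.50 per 1,000 calls

store_gb, days = 5, 30
billable_gb = max(0, store_gb - 1)             # first 1 GB/day is free
storage_cost = billable_gb * 0.10 * days       # $0.10/GB/day

total = call_cost + storage_cost
print(f"${total:.2f}")  # $25.00 in calls + $12.00 in storage = $37.00
```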

Structured Outputs

The Responses API supports structured outputs — constrained generation that guarantees the model returns a valid JSON object matching a schema you define. This eliminates the need for fragile output parsing:

response = client.responses.create(
    model="gpt-5",
    input="Extract the company name and founding year from: ...",
    text={
        "format": {
            "type": "json_schema",
            "name": "company_info",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "company_name": {"type": "string"},
                    "founding_year": {"type": "integer"}
                },
                "required": ["company_name", "founding_year"],
                "additionalProperties": False
            }
        }
    }
)
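Because generation is constrained to the schema, response.output_text can be parsed directly. A sketch using a sample payload of the shape the schema above describes (illustrative values; a real payload would come from response.output_text):

```python
import json

# Sample payload of the shape the schema above guarantees (illustrative
# values; a real payload would come from response.output_text).
output_text = '{"company_name": "Acme Corp", "founding_year": 1998}'

info = json.loads(output_text)  # safe: conforming output is valid JSON
```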

Migrating from Assistants API

The Assistants API shuts down on August 26, 2026. Migration steps:

  1. Replace Assistant objects with the instructions parameter (or a system message in the input) on each Responses API call
  2. Replace Thread + Run management with previous_response_id chaining
  3. Replace tool polling loops with streaming or synchronous responses.create calls
  4. Re-register vector stores — existing Assistants vector stores need migration
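Step 2, replacing Thread and Run bookkeeping, can be as small as tracking one ID per conversation. A minimal sketch (the helper class is ours, not part of the SDK):

```python
class ConversationState:
    """Tracks the only state the Responses API needs between turns:
    the previous response ID (replaces Thread + Run management)."""

    def __init__(self):
        self.previous_response_id = None

    def request_kwargs(self, user_input, model="gpt-5"):
        """Build kwargs for client.responses.create for the next turn."""
        kwargs = {"model": model, "input": user_input}
        if self.previous_response_id is not None:
            kwargs["previous_response_id"] = self.previous_response_id
        return kwargs

    def record(self, response_id):
        """Call with response.id after each turn to chain the next one."""
        self.previous_response_id = response_id

# First turn carries no previous_response_id; later turns do.
state = ConversationState()
first = state.request_kwargs("Hello")
state.record("resp_abc123")
second = state.request_kwargs("Tell me more")
```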

Full migration guide: platform.openai.com/docs/guides/migrate-to-responses

Checklist

  • What is the key simplification the Responses API makes over the Assistants API object model?
  • How do you maintain conversation context across multiple Responses API calls?
  • What are the four built-in tools the Responses API provides?
  • What does MCP server support in the Responses API enable?
  • When does the Assistants API shut down, and where is the migration guide?