Multi-Agent Systems
A single agent with many tools hits a practical ceiling: a large tool set degrades decision quality, a single context window limits parallelism, and a monolithic agent is hard to test and maintain. Multi-agent systems address this by distributing work across specialised agents that each do one thing well. This page covers when multiple agents are justified, the five fundamental coordination patterns, how agents communicate, and the failure modes that multi-agent architectures introduce.
When Multiple Agents Are Justified
Split into multiple agents when a single agent's limits are the actual bottleneck: the tool count has grown large enough to degrade decision quality, the work exceeds one context window or would benefit from parallelism, or the agent has become too monolithic to test and maintain. Before splitting, verify the benefit empirically: measure wall-clock latency and total cost, because coordination overhead can make a multi-agent system slower and more expensive than the single agent it replaces.
Five Coordination Patterns
1. Orchestrator-Worker (Hierarchical)
A central orchestrator agent receives the user's goal, breaks it into subtasks, delegates each to a specialist worker agent, collects results, and synthesises the final output. Workers do not communicate with each other; all coordination flows through the orchestrator.
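The orchestrator-worker flow can be sketched in plain Python. `call_llm`, the worker roles, and the hard-coded plan below are illustrative stand-ins, not part of any real SDK; a real orchestrator would ask the model to produce the plan.

```python
def call_llm(system_prompt: str, task: str) -> str:
    # Placeholder for a real model call.
    return f"[{system_prompt}] result for: {task}"

# Each worker is a specialist with its own system prompt.
WORKERS = {
    "research": "You are a research specialist.",
    "writing": "You are a writing specialist.",
}

def orchestrator(goal: str) -> str:
    # 1. Break the goal into (worker, subtask) pairs. Hard-coded here for
    #    clarity; a real system would have the LLM produce this plan.
    plan = [("research", f"gather facts for: {goal}"),
            ("writing", f"draft a summary of: {goal}")]
    # 2. Delegate each subtask. Workers never talk to each other;
    #    results flow back through the orchestrator only.
    results = [call_llm(WORKERS[worker], task) for worker, task in plan]
    # 3. Synthesise the collected results into one output.
    return "\n".join(results)
```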
2. Sequential Pipeline
Agents are arranged in a fixed chain: Agent A processes input, passes output to Agent B, which passes to Agent C. Each agent has a specialised prompt and tool set for its stage. The pipeline is deterministic: the sequence does not change based on intermediate results.
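A minimal pipeline sketch, with `call_llm` as a placeholder for a real model call and invented stage prompts:

```python
def call_llm(system_prompt: str, text: str) -> str:
    # Placeholder for a real model call.
    return f"{system_prompt}: {text}"

# One specialised prompt per stage, in a fixed order.
STAGES = [
    "Extract the key claims",   # Agent A
    "Verify each claim",        # Agent B
    "Write the final answer",   # Agent C
]

def run_pipeline(user_input: str) -> str:
    out = user_input
    for prompt in STAGES:   # deterministic order; never re-routed
        out = call_llm(prompt, out)
    return out
```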
3. Parallel Fan-Out (Scatter-Gather)
A coordinator dispatches the same task (or related tasks) to multiple agents simultaneously. All agents run concurrently. A gather step collects and consolidates all results before producing a final output.
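A scatter-gather sketch using Python's `asyncio`; the `agent` coroutine and the agent names are hypothetical stand-ins for real async LLM calls:

```python
import asyncio

async def agent(name: str, task: str) -> str:
    # Stand-in for an async LLM call.
    await asyncio.sleep(0)  # simulate concurrent work
    return f"{name}: {task}"

async def scatter_gather(task: str) -> str:
    names = ["optimist", "skeptic", "fact_checker"]
    # Scatter: dispatch the same task to every agent concurrently.
    results = await asyncio.gather(*(agent(n, task) for n in names))
    # Gather: consolidate all results before producing a final output.
    return " | ".join(results)

print(asyncio.run(scatter_gather("review this plan")))
```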
4. Handoff (Dynamic Routing)
An agent decides at runtime to transfer control to a different specialist agent based on what it has learned so far. Unlike sequential pipeline (fixed order), handoffs are dynamic: the routing decision is made by the LLM, not pre-coded. The original agent passes its accumulated context to the receiving agent.
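A handoff sketch in plain Python; the `call_llm` router and the two specialist agents are invented for illustration. The point is that the route comes from model output at runtime, not from pre-coded control flow:

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real model call that returns a route name.
    return "billing" if "invoice" in prompt else "support"

# Specialist agents the triage agent can hand off to.
AGENTS = {
    "support": lambda ctx: f"support handled: {ctx}",
    "billing": lambda ctx: f"billing handled: {ctx}",
}

def triage(user_message: str) -> str:
    # The model decides, at runtime, which specialist takes over.
    route = call_llm(f"Route this request: {user_message}")
    # The original agent passes its accumulated context along.
    context = f"original request: {user_message}"
    return AGENTS[route](context)
```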
5. Critic-Revise (Evaluator Loop)
A generator agent produces output; a critic agent evaluates it against defined criteria; if the output fails, it is returned to the generator with the critique for revision. The loop continues until quality criteria are met or a max iteration limit is reached.
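The loop with its iteration cap might look like this; `generate` and `critic` are trivial stand-ins for real model calls:

```python
from typing import Optional

MAX_ITERATIONS = 3  # hard limit prevents infinite loops

def generate(task: str, critique: Optional[str]) -> str:
    # Stand-in generator: incorporates the critique when one is given.
    return f"draft for {task}" + (" (revised)" if critique else "")

def critic(draft: str) -> Optional[str]:
    # Stand-in critic: return None if the draft passes, else a critique.
    return None if "revised" in draft else "needs more detail"

def critic_revise(task: str) -> str:
    critique: Optional[str] = None
    draft = ""
    for _ in range(MAX_ITERATIONS):
        draft = generate(task, critique)
        critique = critic(draft)
        if critique is None:
            return draft   # quality criteria met
    return draft           # fallback: best attempt after N iterations
```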
Inter-Agent Communication
How agents pass information to each other is a design decision with significant cost and reliability implications.
| Communication method | How it works | Best for | Downside |
|---|---|---|---|
| Direct message passing | Agent A's output is directly passed as Agent B's input message | Sequential pipelines; simple orchestrator-worker | Tightly coupled; changing Agent A's output format breaks Agent B |
| Shared state object | All agents read/write to a centralised state object (LangGraph StateGraph) | Complex DAG workflows; agents that need access to multiple prior results | State schema must be defined upfront; concurrent writes need locking |
| Message queue / async | Agents publish/subscribe to a message bus (Redis, SQS); decoupled async execution | High-volume, long-running workflows; agents with variable latency | More infrastructure; harder to trace and debug |
| External memory / DB | Agents write results to a shared database; downstream agents query what they need | Parallel fan-out where downstream agents need selective access to upstream results | Requires schema discipline; agents must know what keys to read |
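A minimal sketch of the shared-state approach from the table above, with the schema defined upfront as a `TypedDict`; the state keys and agent functions are invented for illustration:

```python
from typing import TypedDict

class WorkflowState(TypedDict, total=False):
    """Schema defined upfront; every agent reads/writes only these keys."""
    user_goal: str
    research_notes: str
    draft: str

def research_agent(state: WorkflowState) -> WorkflowState:
    state["research_notes"] = f"notes on {state['user_goal']}"
    return state

def writer_agent(state: WorkflowState) -> WorkflowState:
    # Reads a prior result from the shared state, not from a direct message.
    state["draft"] = f"draft using {state['research_notes']}"
    return state

state: WorkflowState = {"user_goal": "summarise A2A"}
for agent_step in (research_agent, writer_agent):
    state = agent_step(state)
```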
Context Propagation
The most common multi-agent mistake is losing context at agent boundaries. When Agent A hands off to Agent B, Agent B needs to know:
`original_goal`, `steps_completed`, `key_findings`, `reason_for_handoff`, `constraints`. The receiving agent's system prompt is initialised from this object.
Framework Implementation
LangGraph – graph-based state machine
Define each agent as a node. Edges between nodes are either fixed (sequential pipeline) or conditional (routing based on state values). A centralised StateGraph object holds all shared state and is checkpointed to storage between steps, enabling resumability and human-in-the-loop pauses.
OpenAI Agents SDK – agents + handoffs
Defines agents as objects with a system prompt, tool list, and handoff list. Handoffs are first-class: specifying a handoff target causes the SDK to automatically transfer context and switch the active agent. Built-in tracing. Released March 2025 as the production replacement for the experimental Swarm.
AutoGen / Microsoft Agent Framework – conversational multi-agent
Agents communicate by sending messages to each other in a conversation. Structured conversation patterns (two-agent, group chat, nested chat) handle common coordination scenarios. Microsoft merged AutoGen with Semantic Kernel in October 2025 for enterprise deployments.
Failure Modes
Context loss at agent boundaries
The receiving agent does not have enough context to continue the work sensibly: it re-does steps already completed, contradicts prior decisions, or asks the user to re-explain. Fix: standardised handoff objects; test each agent boundary as an independent unit with realistic context inputs.
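One way to implement the standardised handoff object, using the five fields listed under Context Propagation; this is a sketch, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class HandoffContext:
    """Standardised object passed at every agent boundary."""
    original_goal: str
    steps_completed: list[str]
    key_findings: list[str]
    reason_for_handoff: str
    constraints: list[str] = field(default_factory=list)

def receiving_system_prompt(ctx: HandoffContext) -> str:
    # The receiving agent's system prompt is initialised from the handoff
    # object, so it never starts from a blank slate.
    return (
        f"Goal: {ctx.original_goal}\n"
        f"Already done: {'; '.join(ctx.steps_completed)}\n"
        f"Findings so far: {'; '.join(ctx.key_findings)}\n"
        f"You were handed this because: {ctx.reason_for_handoff}\n"
        f"Constraints: {'; '.join(ctx.constraints) or 'none'}"
    )
```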
Coordination overhead exceeds benefit
The latency of orchestrating multiple agents (extra LLM calls for planning, waiting for parallel results, synthesising outputs) makes the multi-agent system slower and more expensive than a single agent would have been. Measure wall-clock latency and total cost before and after splitting into multiple agents.
Infinite critic-revise loops
A Critic-Revise pattern without a hard iteration limit will loop indefinitely if the critic never approves (e.g. because the criteria are impossible to satisfy). Always implement a maximum loop count and a fallback: after N iterations, either accept the best attempt or escalate to human review.
Partial failure in scatter-gather
In parallel fan-out, some agents succeed and some fail. If the gather step requires all results, one slow or failed agent blocks the whole system. Implement timeouts per parallel agent and a partial-results policy: decide in advance whether the system can produce output with 3/4 results or must require all 4.
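A sketch of a gather step with a timeout and a pre-decided partial-results policy, using Python's `concurrent.futures`; the agent stub and the threshold are illustrative:

```python
import concurrent.futures

def agent(name: str) -> str:
    # Stand-in for a potentially slow or failing LLM call.
    return f"result from {name}"

def gather_with_policy(names, timeout_s=5.0, min_results=3):
    results = []
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {pool.submit(agent, n): n for n in names}
        # Per-agent timeout: stop waiting after timeout_s.
        done, not_done = concurrent.futures.wait(futures, timeout=timeout_s)
        for f in not_done:
            f.cancel()   # drop stragglers instead of blocking the whole run
        for f in done:
            try:
                results.append(f.result())
            except Exception:
                pass     # one failed agent must not block the rest
    # Partial-results policy, decided in advance: proceed with >= min_results.
    if len(results) < min_results:
        raise RuntimeError(f"only {len(results)}/{len(names)} agents succeeded")
    return results
```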
Observability gap
Multi-agent systems are substantially harder to debug than single agents because failures can be in any agent or at any transition. Without distributed tracing that spans all agents in a run, failures are nearly impossible to diagnose from final output alone. Instrument every agent with a shared trace ID from the start.
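A minimal illustration of threading one trace ID through every agent in a run, using only the standard library; a production system would emit spans to a distributed-tracing backend instead, and the agent names here are invented:

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")

def run_agent(name: str, task: str, trace_id: str) -> str:
    # Every log line carries the run-wide trace ID, so a failure in any
    # agent or transition can be correlated across the whole run.
    logging.info("trace=%s agent=%s task=%s", trace_id, name, task)
    return f"{name} done"

def run_workflow(task: str) -> str:
    trace_id = uuid.uuid4().hex   # one ID for the entire multi-agent run
    run_agent("planner", task, trace_id)
    return run_agent("executor", task, trace_id)
```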
2025–2026 Developments
OpenAI Agents SDK with first-class handoffs – March 2025
OpenAI released the production Agents SDK in March 2025 as a direct replacement for Swarm. It introduced handoffs as a first-class primitive for defining which agents can receive control transfers, with automatic context propagation and built-in tracing through the OpenAI dashboard.
Google's Agent-to-Agent (A2A) protocol – 2025
Google published the Agent-to-Agent (A2A) protocol, an open standard for inter-agent communication that enables agents built on different frameworks or by different vendors to interoperate. This addresses the "walled garden" problem where agents built on LangGraph cannot directly communicate with agents built on AutoGen. Adoption was still nascent by end of 2025 but accelerating.
Microsoft merges AutoGen + Semantic Kernel β October 2025
Microsoft merged its AutoGen framework with Semantic Kernel into a unified Microsoft Agent Framework, targeting enterprise deployments on Azure AI. This created a single, supported path for enterprise teams wanting multi-agent capabilities on the Microsoft stack, replacing the fragmented AutoGen vs Semantic Kernel choice.
Checklist: Do You Understand This?
- Can you name the five multi-agent coordination patterns and give an example of each?
- When should you use multiple agents, and when should you stick with a single agent?
- What five pieces of context must a receiving agent have when control is handed off to it?
- Can you describe the standardised handoff object pattern and what fields it contains?
- What is the difference between shared state (LangGraph) and direct message passing for inter-agent communication?
- Why must a Critic-Revise loop always have a maximum iteration limit?
- Can you name five failure modes specific to multi-agent architectures?
- What is Google's A2A protocol and what problem does it solve?