
Hierarchical, Swarm, Sequential: The Three Multi-Agent Patterns Explained

AI Agents · Mar 24, 2026 · 6 min read · Doreid Haddad

The Google AI Overview for "multi-agent architecture" names three patterns explicitly: hierarchical (supervisor-worker), swarm/network, and sequential (assembly line). LangChain's recent architecture guide describes four patterns (subagents, skills, handoffs, routers) which map to the same three shapes with different vocabulary. Microsoft's Azure pattern guide and AWS's multi-agent orchestration guidance describe production implementations of the same three.

This article covers each pattern in production form: what it looks like in real code, what it costs, what kind of system you'd build with it, and how it fails. The goal isn't to pick a winner — each pattern has its place. The goal is to give you the right shape for the workload in front of you.

Pattern 1: Hierarchical (Orchestrator-Worker)

What it looks like. A lead agent receives the task, decomposes it into subtasks, and delegates each subtask to a specialized subagent. The subagents work in parallel, each with its own prompt, tools, and (often) its own context window. The lead agent collects the results and synthesizes a final answer.
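The control flow above can be sketched in a few lines of plain Python. This is a minimal illustration of the shape, not any framework's API: `call_model` is a stub standing in for a real LLM call, and `decompose` hard-codes a plan that a real lead agent's model would produce.

```python
# Orchestrator-worker sketch: lead agent plans, workers run in parallel,
# lead agent synthesizes. All names here are illustrative stubs.
from concurrent.futures import ThreadPoolExecutor

def call_model(role: str, prompt: str) -> str:
    # Placeholder for an actual model API call.
    return f"[{role}] {prompt}"

def decompose(task: str) -> list[str]:
    # In a real system the lead agent's model produces this plan.
    return [f"{task}: sub-question {i}" for i in range(1, 4)]

def run_subagent(subtask: str) -> str:
    # Each worker would get its own prompt, tools, and context window.
    return call_model("worker", subtask)

def orchestrate(task: str) -> str:
    subtasks = decompose(task)  # lead agent plans
    with ThreadPoolExecutor() as pool:
        # Workers execute their subtasks concurrently.
        findings = list(pool.map(run_subagent, subtasks))
    # Lead agent merges possibly overlapping findings into one answer.
    return call_model("lead", " | ".join(findings))
```

The two places where real systems earn their keep are exactly the two stubs: the quality of `decompose` decides whether you spawn three subagents or fifty, and the final synthesis call decides whether conflicting findings get reconciled or papered over.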

Anthropic's Research feature is the canonical published implementation. Claude Opus 4 runs as the LeadResearcher, plans the research approach, persists the plan to memory (because the 200K context window can be exceeded on complex queries), and spawns specialized subagents. Each subagent runs Claude Sonnet 4 or similar mid-tier models, performs web searches with interleaved thinking, and returns findings. A separate CitationAgent processes the documents to identify citation locations. The user sees the synthesized result.

Where it wins. Tasks with parallelizable subproblems where each subproblem benefits from focused, specialized reasoning. Research and due diligence (Anthropic's published case). Multi-source data gathering. Comparative analysis across many entities. Document processing where parallel chunks can be analyzed independently and merged.

What it costs. Per Anthropic's published data, multi-agent systems use roughly 15× the tokens of an equivalent chat interaction. The orchestrator-worker pattern is the most-tested shape and has the most predictable cost profile, but the cost is still substantial. Use this pattern when task value justifies the spend.

How it fails. Two failure modes. First, over-decomposition: the lead agent spawns 50 subagents for a query that needed three, burning tokens for no reason. Anthropic's post explicitly mentions this as one of their early bugs — they fixed it through prompt engineering on the lead agent. Second, synthesis bottleneck: the lead agent has to merge possibly conflicting results from subagents, and conflicts that aren't resolved cleanly produce confidently coordinated wrong answers.

Production frameworks that ship this pattern. LangGraph's supervisor pattern, Microsoft AutoGen's group chat, AWS Bedrock multi-agent orchestrator, OpenAI's Assistants API multi-agent setup.

Pattern 2: Swarm (Decentralized / Network)

What it looks like. Multiple agents communicate peer-to-peer without a central controller. Each agent decides what to do based on its own goals and the messages it receives from other agents. Agents can dynamically pass control to whichever agent is best suited for the current state.

The AI Overview describes this as "agents communicate without a central controller, which is effective for dynamic tasks, such as in robotics." That qualifier matters — the swarm pattern shows up in robotics research and certain real-time monitoring applications where agents need to react to local conditions without waiting on a central coordinator.

Where it wins. Real-time systems where the cost of a central coordinator is too high. Distributed monitoring where each agent watches one slice of the system and they collaborate to identify cross-cutting issues. Some research-grade autonomous agent experiments where emergent coordination is the explicit goal.

What it costs. Hard to predict. Without a central orchestrator, total token usage depends on how much agent-to-agent chatter the system generates, which can spiral. It's also harder to debug, because there's no single trace that captures what the system did.

How it fails. Coordination failures. With no central authority, agents can disagree, deadlock, or duplicate work. The Anthropic data on multi-agent costs (~15× chat-equivalent tokens) is from the orchestrator-worker pattern; swarm patterns can be substantially more expensive because of the inter-agent communication overhead.

Production frameworks that ship this pattern. LangGraph's swarm template, some custom AutoGen configurations, certain decentralized agent research frameworks. In honest production use, swarm is rare — most teams that try it end up converging back toward hierarchical.

Pattern 3: Sequential (Assembly Line / Pipeline)

What it looks like. Agents pass output to each other in a fixed order. Agent A does step one. Output gets validated against a schema. Agent B does step two using the output of step one. Output gets validated. Agent C does step three. And so on. Each agent has a narrow job, a clear input, a clear output, and a strict contract with the next agent.

The classic pipeline shape: extract → validate → enrich → respond. Document processing is the canonical example. An invoice arrives. Agent A extracts structured fields. Agent B validates against the original purchase order. Agent C posts to the accounting system. Each agent does one thing.
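The invoice pipeline above can be sketched as a fixed sequence of stages with a validation gate between them. The field names and the string format are illustrative; in a real system each stage would call a model or an external service.

```python
# Sequential pipeline sketch: each stage has a narrow job, and output is
# checked against a contract before the next stage runs.
def extract(invoice_text: str) -> dict:
    # Stand-in for model-based field extraction.
    vendor, amount = invoice_text.split(":")
    return {"vendor": vendor.strip(), "amount": float(amount)}

def validate(fields: dict) -> dict:
    # Strict contract between stages: fail fast on a bad handoff.
    if not fields["vendor"] or fields["amount"] <= 0:
        raise ValueError(f"invalid extraction: {fields}")
    return fields

def post(fields: dict) -> str:
    # Stand-in for posting to the accounting system.
    return f"posted {fields['amount']:.2f} for {fields['vendor']}"

def pipeline(invoice_text: str) -> str:
    result = invoice_text
    for stage in (extract, validate, post):  # fixed order, one job per stage
        result = stage(result)
    return result
```

The validation stage is the point of the pattern: because the contract between agents is explicit, a bad extraction fails loudly at the handoff instead of silently corrupting the books two stages later.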

Where it wins. Workflows where the steps are genuinely sequential — step two can't start until step one is done — and each step needs different reasoning, different tools, or different specialization. Document processing. Multi-step content generation (research, draft, edit, format). Compliance workflows where each step has its own audit requirement.

What it costs. Roughly comparable to single-agent on the same workload. The smaller, more focused prompts at each step have lower per-call costs that mostly cancel out the overhead of multiple model calls. The win is quality, debuggability, and the ability to swap a single agent without rebuilding the whole system.

How it fails. Latency. Three agents at 800ms each is 2.4 seconds before the user sees anything. If your application has tight latency budgets, sequential pipelines can quietly turn snappy systems into sluggish ones. Streaming intermediate results helps but doesn't eliminate the issue.

Production frameworks that ship this pattern. LangChain's sequential chains, n8n agent pipelines, plain Python orchestration. Most enterprise document-processing and content-generation systems are sequential pipelines, often without the team explicitly using multi-agent vocabulary.

How to pick

A simple decision tree.

Is the task parallelizable into 3+ independent subtasks? Hierarchical (orchestrator-worker). Most production multi-agent systems are this shape because most parallelizable work fits the orchestrator-worker mold cleanly.

Are the steps sequential with different expertise needed at each step? Sequential (pipeline). Document processing, content workflows, multi-step compliance.

Is the system real-time, distributed, or genuinely peer-to-peer? Swarm. Honestly rare outside robotics and specialized monitoring use cases. Default to one of the other two unless you can name a specific reason.
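The decision tree above fits in a single function. The boolean workload attributes are informal labels for this sketch, not a formal taxonomy:

```python
# The article's decision tree as a plain function.
def pick_pattern(parallel_subtasks: int, sequential_steps: bool,
                 peer_to_peer: bool) -> str:
    if parallel_subtasks >= 3:
        return "hierarchical"  # orchestrator-worker
    if sequential_steps:
        return "sequential"    # pipeline
    if peer_to_peer:
        return "swarm"         # rare: name a specific reason first
    return "hierarchical"      # default: most mature tooling
```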

If you can't decide between hierarchical and sequential, build hierarchical first. The pattern is more flexible, the production tooling is more mature, and Anthropic's published evidence suggests it captures most of the multi-agent value for the cost.

The deeper lesson

Multi-agent patterns aren't competing technologies. They're shapes that fit specific kinds of work. The teams who run AI well in 2026 don't pick a pattern and force their workload into it — they look at the shape of the workload first and pick the pattern that already matches.

A workload with five independent subtasks running in parallel is hierarchical. A workload with three sequential steps that need different expertise is sequential. A workload with peer-to-peer dynamics in a distributed environment is swarm. Match the architecture to the work. The framework you implement it in is downstream of that decision and far less important than picking the right shape in the first place.

Frequently Asked Questions

Which multi-agent pattern is most common in production?

Hierarchical (orchestrator-worker). Anthropic's Research feature, AWS's reference architectures, and Azure's design patterns all converge on this shape: one lead agent coordinates, multiple specialized subagents handle parallel subtasks. It's the easiest to debug and the easiest to ship.

When should I use a swarm pattern instead of hierarchical?

Almost never as a first choice. Swarm patterns (decentralized, peer-to-peer agent communication) are powerful when work is genuinely independent and coordination cost is low — robotics, certain real-time monitoring applications. For most business workflows, the lack of central coordination makes them harder to debug and harder to predict.

Is sequential the same as a single agent?

No. Sequential multi-agent means several distinct agents pass output to each other in a fixed order — extract, validate, generate. Each has its own prompt, its own tools, and a typed contract for the handoff. A single agent does all the work in one continuous reasoning chain. Sequential is genuinely multi-agent; it's just less parallel than hierarchical.

Written by Doreid Haddad

Founder, Tech10

Doreid Haddad is the founder of Tech10. He has spent over a decade designing AI systems, marketing automation, and digital transformation strategies for global enterprise companies. His work focuses on building systems that actually work in production, not just in demos. Based in Rome.
