The agent architecture question comes up fast when you start building anything real. One agent that handles everything, or a team of specialized agents coordinated by an orchestrator? Both patterns have strong advocates. Both work. The difference is in what you're building and how much complexity you're willing to manage.
Here's what I've learned running both in production.
The single-agent pattern
One agent. Loaded with whatever context and instructions the task requires. At any given moment it might be doing research, writing code, reviewing output, or sending a notification — wearing different hats, drawing on a shared context window.
The appeal is straightforward: everything the agent knows is in one place. The researcher and the builder are the same agent, so there's no handoff problem, no context loss when you pass a document from one agent to another, no coordination overhead. If the task requires reading something you found in step 1 to inform step 4, that's automatic — it's all in context.
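That shared-context property can be sketched in a few lines. This is a minimal illustration, not a real agent framework: `call_model` is a stand-in for an actual LLM API call, and the message shapes are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class SingleAgent:
    # One shared history: every step sees everything that came before.
    history: list = field(default_factory=list)

    def call_model(self, messages):
        # Placeholder for a real completion call; echoes so the sketch runs.
        return f"result of step {len(messages)}"

    def step(self, instruction):
        self.history.append({"role": "user", "content": instruction})
        reply = self.call_model(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

agent = SingleAgent()
agent.step("research the topic")
# The second step automatically sees the research -- no handoff needed.
agent.step("write code informed by the research above")
```

The point is structural: step 4 reads step 1's findings for free, because nothing ever left the one history.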
Where single agent wins:
- Tasks with strong interdependencies between steps, where earlier findings constantly inform later work
- Work that doesn't benefit from true parallelism — the next step can't start until the previous one is done
- Systems where debugging needs to be simple — one agent means one log stream, one context to inspect
- Resource-constrained environments where running multiple full agent sessions is expensive
Where it struggles:
- Tasks with genuinely parallel subtasks — if step 3 and step 4 are independent, you're leaving time on the table
- Situations where tool access should be isolated — if your research agent and your deployment agent share the same permissions, you're accepting more risk than necessary
- Very long tasks that exceed a single context window
The multi-agent pattern
Multiple agents, each with a defined scope, coordinated by an orchestrator that delegates work and assembles results.
A research agent that can access the web. A build agent that can write and execute code. A review agent that can read output and flag issues. The orchestrator — which might be another Claude session, a queue runner, or application code — assigns work, waits for results, and decides what happens next.
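In code, the orchestrator reduces to a dispatch loop over scoped specialists. A hedged sketch, where the agent functions and the plan format are illustrative stand-ins for real sessions:

```python
def research_agent(task):
    # Would invoke a web-enabled session in a real system.
    return f"findings for: {task}"

def build_agent(task):
    # Would invoke a code-execution session in a real system.
    return f"artifact for: {task}"

SPECIALISTS = {"research": research_agent, "build": build_agent}

def orchestrate(plan):
    """Run each (role, task) step in order. Note that every handoff is an
    explicit string -- nothing passes between agents implicitly."""
    results = []
    for role, task in plan:
        results.append(SPECIALISTS[role](task))
    return results

outputs = orchestrate([("research", "topic A"), ("build", "prototype")])
```

Even this toy version makes the tradeoff visible: the orchestrator owns the plan and the handoffs, and each specialist sees only the task string it was given.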
Where multi-agent wins:
- Tasks with genuinely parallel subtasks — research multiple topics simultaneously, run multiple tests at once
- Work that benefits from tool isolation — the research agent has web access, the build agent has file system access, neither has the other's permissions
- Long-running systems where you want to run agents on demand rather than keeping one large context alive
- Complex pipelines where different stages require different models or different system prompts
Where it struggles:
- Coordination overhead is real. Every handoff costs tokens and latency, and creates an opportunity for information to get lost.
- Debugging is harder. When something goes wrong in a multi-agent system, it's often in the gap between agents — the orchestrator misread an output, the specialist agent returned something in an unexpected format. You now have multiple log streams to correlate.
- More moving parts means more failure modes. An agent that fails silently in a single-agent system is annoying. In a multi-agent system, it can cascade.
The hidden costs of multi-agent
The seductive thing about multi-agent architectures is that they sound efficient. Parallel execution, specialized tools, clean separation of concerns. All of that is real. But there's a cost side that doesn't get discussed as much.
Orchestration overhead. The orchestrator needs to understand the state of all running agents, parse their outputs, handle errors, and make decisions about what to delegate next. If your orchestrator is itself a Claude session, you're spending tokens on coordination that a single agent would spend on actual work.
Format brittleness. Single agents can handle ambiguity — they understand their own output. When agent A passes output to agent B, agent B has to parse it. Structured output formats help, but every handoff is a place where the format can drift or the contract can break.
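One way to contain that brittleness is to validate every handoff against an explicit contract before the next agent consumes it. A sketch, with a made-up schema; in practice you'd likely use a JSON Schema or a typed model:

```python
# Illustrative handoff contract: required keys and their expected types.
REQUIRED_KEYS = {"summary": str, "sources": list}

def validate_handoff(payload):
    """Fail loudly at the handoff boundary instead of letting agent B
    silently misparse agent A's output."""
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in payload:
            raise ValueError(f"handoff missing required key '{key}'")
        if not isinstance(payload[key], expected_type):
            raise ValueError(f"handoff key '{key}' has wrong type")
    return payload
```

Failing fast at the boundary turns a confusing downstream bug into an obvious upstream one, which is most of what a contract buys you.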
Debugging surface. A bug in a single-agent system has one context window to inspect. A bug in a multi-agent system might live in any of several agent contexts, or in the handoffs between them. The observability investment required scales with the number of agents.
The decision framework
A few questions that actually help:
Are the subtasks genuinely parallel, or just decomposable? If you can do research and writing in parallel, multi-agent saves time. If the writing depends on the research, you're not gaining true parallelism — you're just adding coordination.
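The distinction shows up directly in code. A minimal illustration using threads as stand-ins for separate agent sessions (the `research` function is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def research(topic):
    # Independent of the other topic: genuinely parallel.
    return f"notes on {topic}"

# Parallel: two independent research tasks run concurrently.
with ThreadPoolExecutor() as pool:
    notes = list(pool.map(research, ["topic A", "topic B"]))

# Dependent: the draft needs the notes, so it stays sequential no matter
# how many agents you throw at it.
draft = f"draft built from: {'; '.join(notes)}"
```

If your "parallel" step looks like the `draft` line, decomposing it into agents adds coordination without saving any wall-clock time.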
Does tool isolation matter for your security model? If you need to guarantee that your research agent can't write to the file system, a separate agent with different permissions is the right call. If tool overlap is acceptable, the isolation benefit disappears.
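Tool isolation can be as simple as a per-agent allowlist enforced at the dispatch layer. A sketch with invented tool names:

```python
# Hypothetical per-agent tool allowlists.
AGENT_TOOLS = {
    "research": {"web_search", "read_url"},
    "build": {"read_file", "write_file", "run_tests"},
}

def invoke_tool(agent, tool):
    """Refuse any tool call outside the agent's declared scope."""
    if tool not in AGENT_TOOLS.get(agent, set()):
        raise PermissionError(f"agent '{agent}' may not call '{tool}'")
    return f"{tool} ok"
```

The guarantee lives in the dispatch layer, not in the prompt, which is the point: the research agent can't write files even if it decides to try.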
How much debugging time are you willing to invest? Multi-agent systems need investment in observability, structured output contracts, and error handling. If you don't make that investment, you'll pay for it when something breaks.
What's the task duration? Long-running background tasks often benefit from multi-agent decomposition because you can run components on demand rather than keeping a single long context alive. Short interactive tasks rarely do.
What actually works
The pattern I've found most useful isn't at either extreme. One primary agent that handles the majority of a task, with occasional parallel delegation for subtasks that are genuinely independent and time-sensitive. The primary agent maintains context and continuity; specialists run on demand for bounded, well-defined work.
The orchestration layer is thin — a queue runner that picks up cards, tracks state in files, and sends notifications when things complete. Not a complex coordination system, just enough structure to run tasks reliably without you watching.
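A queue runner in that spirit fits in a couple of dozen lines. This is a sketch under assumptions, not the author's actual implementation: cards are JSON files, state lives on disk, and the field names are illustrative.

```python
import json
import tempfile
from pathlib import Path

def run_queue(queue_dir, handler):
    """Process each pending card once, marking it done in place.
    Re-running is safe: completed cards are skipped."""
    processed = 0
    for card_path in sorted(queue_dir.glob("*.json")):
        card = json.loads(card_path.read_text())
        if card.get("state") == "done":
            continue
        card["result"] = handler(card["task"])
        card["state"] = "done"
        card_path.write_text(json.dumps(card))
        processed += 1
    return processed

# Demo: one pending card in a temporary queue directory.
queue = Path(tempfile.mkdtemp())
(queue / "001.json").write_text(json.dumps({"task": "build", "state": "pending"}))
count = run_queue(queue, lambda task: f"{task} complete")
```

Because state is just files, "observability" is `cat`, and recovery after a crash is re-running the loop.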
This isn't elegant architecture. It's the minimum that works reliably, built up from what actually failed in simpler setups.
I build with Claude every day and write about what it's actually like to ship AI-powered products. Subscribe at shoofly.dev/newsletter — building AI products in the real world, not what the press releases say.