Subagents Are Not Employees. They Are Context Boundaries.

March 2, 2026 · 7 min read

The "AI employee" framing is everywhere and it produces bad architectures. An employee has judgment, learns from feedback, holds institutional context over time, and can escalate when a situation exceeds their authority. An AI agent has none of these. What it has is a context window. Subagents are not employees you hire to handle a domain. They are bounded context windows you use to isolate work, prevent pollution, and parallelize independent tasks. That reframe changes every architectural decision.

What Context Pollution Actually Looks Like

Run a large agentic task in a single context window long enough and you will see it: the agent starts confusing files it read early in the session with the ones it's currently editing. It suggests code from a module it visited twenty tool calls ago. It "remembers" a constraint you mentioned at the start but applies it incorrectly because it's now surrounded by thirty other pieces of information competing for attention.

This isn't a hallucination problem in the pejorative sense. It's a practical consequence of how models handle long contexts: as the window fills, every earlier piece of information competes with more material, and instructions or file contents from early in the session tend to be applied less reliably than whatever the agent read most recently. The agent isn't forgetting. The signal is diluting.

A subagent doesn't solve this by being smarter. It solves it by starting fresh. When you spawn a subagent with a focused task and a minimal set of files, you're giving that task the full benefit of a clean context with no competing signals. That's the architectural value.

The Claude Code Agent Tool in Practice

Claude Code's Agent tool is explicit about this. When you invoke it, you write a prompt that contains everything the subagent needs to know — because it won't have access to your conversation history. That constraint is the feature, not a limitation.

A well-structured Agent invocation looks like this:

Agent({
  description: "Audit design token usage in Button component",
  prompt: `
    Read /src/components/Button/Button.tsx, /tokens/colors.ts, and /tokens/spacing.ts.
    Identify every hardcoded hex value or pixel measurement in Button.tsx
    that should instead reference a token from colors.ts or spacing.ts.
    Return a list of findings: file path, line number, current value, suggested token name.
    Do not make any edits.
  `
})

Notice what this prompt does not include: the rest of the codebase, the broader refactoring goal, the conversation about why this audit is happening, the output format of other agents running in parallel. The subagent doesn't need any of that. Its task is fully specified. Its output is a structured list.

The orchestrating agent takes that list, combines it with the output of three other parallel audits, and synthesizes a refactor plan. None of the subagents need to know about each other.
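The synthesis step can be ordinary code rather than another agent. Here is a minimal sketch, assuming each audit's list has already been parsed into objects. The Finding shape and the groupByToken helper are hypothetical names for illustration, not part of Claude Code.

```typescript
// Hypothetical shape of one finding, mirroring the fields the audit prompt asks for.
interface Finding {
  filePath: string;
  line: number;
  currentValue: string;
  suggestedToken: string;
}

// Merge the lists returned by independent audits and group them by suggested
// token, so the refactor plan can handle each token in a single pass.
function groupByToken(audits: Finding[][]): Map<string, Finding[]> {
  const plan = new Map<string, Finding[]>();
  for (const audit of audits) {
    for (const finding of audit) {
      const bucket = plan.get(finding.suggestedToken) ?? [];
      bucket.push(finding);
      plan.set(finding.suggestedToken, bucket);
    }
  }
  return plan;
}
```

The merge only depends on the output format, which is why the subagents never need to know about each other.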

When to Use Subagents

Parallelizing independent work

The clearest case. If you have five components to audit and each audit is independent, run five agents at once. Each gets its own clean context with its relevant files. Total wall time is roughly that of the slowest audit, not the sum of all five. This is the mechanical advantage of a multi-agent setup: not smarter, just parallel.
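In code, the fan-out is just concurrent promises. The sketch below assumes a hypothetical RunSubagent function that spawns one subagent and resolves with its final text. How you actually spawn agents depends on your tooling, so treat the runner as a placeholder.

```typescript
// Hypothetical runner: stands in for whatever actually spawns a subagent
// and resolves with its final text output.
type RunSubagent = (prompt: string) => Promise<string>;

// Start one audit per component at once. No prompt refers to another audit,
// so the calls can overlap freely.
async function auditInParallel(
  components: string[],
  run: RunSubagent,
): Promise<Map<string, string>> {
  const entries = await Promise.all(
    components.map(async (name): Promise<[string, string]> => {
      const report = await run(
        `Audit design token usage in the ${name} component. Do not make any edits.`,
      );
      return [name, report];
    }),
  );
  return new Map(entries);
}
```

Promise.all rejects as soon as any one audit fails. If partial results are useful to you, Promise.allSettled is the more forgiving choice.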

Preventing context pollution in long tasks

If a task requires reading thirty files to gather information, then taking action based on that information, consider whether the reading phase and the acting phase should be in different contexts. A research agent that reads broadly and returns a structured summary gives the acting agent clean, high-signal input instead of a context full of raw file content.
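One way to picture the handoff: the research agent returns a small structured summary, and the acting agent's prompt is built from that summary alone. This is a sketch under assumptions. ResearchSummary and buildActingPrompt are hypothetical names, and the fields are only one plausible shape.

```typescript
// Hypothetical summary a research subagent might return in place of raw file contents.
interface ResearchSummary {
  goal: string;
  relevantFiles: string[];
  facts: string[];
}

// Build the acting agent's prompt from the summary only. None of the thirty
// files' raw contents cross the boundary, just the distilled findings.
function buildActingPrompt(summary: ResearchSummary): string {
  return [
    `Goal: ${summary.goal}`,
    "Files in scope:",
    ...summary.relevantFiles.map((file) => `- ${file}`),
    "Findings from the research pass:",
    ...summary.facts.map((fact) => `- ${fact}`),
  ].join("\n");
}
```

The acting agent sees a prompt whose length scales with the number of findings, not with the number of files that were read to produce them.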

Scoping to a specific file set

Some tasks are naturally scoped: "audit the auth module," "refactor the token pipeline," "write tests for the checkout flow." These are good subagent candidates because the relevant file set is bounded. You can specify exactly which files to read, which reduces the risk of the agent pulling in irrelevant context.

When Not to Use Subagents

Small tasks

Spawning an agent has overhead: the prompt has to be written, the context has to be populated, the output has to be parsed and integrated. For a task that would take ten seconds in the main context, a subagent is pure overhead. The rule of thumb: if the task can be expressed as a single tool call or a short sequence of tool calls, do it in the main context.

Tasks that require shared state

Subagents don't share memory. If agent A needs to see the output of agent B before it can proceed, you have a dependency, not a parallelism opportunity. Run B, get its output, use it in A's prompt. That's a sequential pipeline, not a multi-agent architecture — and that's fine. Not everything benefits from parallelism.
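The dependency shows up directly in the code: A's prompt cannot be built until B has returned. A minimal sketch, again assuming a hypothetical RunSubagent runner and a hypothetical runDependent helper:

```typescript
// Hypothetical runner: stands in for whatever actually spawns a subagent.
type RunSubagent = (prompt: string) => Promise<string>;

// Run B first, then build A's prompt from B's output. The only "shared state"
// is the text that gets pasted into the second prompt.
async function runDependent(
  run: RunSubagent,
  promptB: string,
  buildPromptA: (outputOfB: string) => string,
): Promise<string> {
  const outputOfB = await run(promptB);
  return run(buildPromptA(outputOfB));
}
```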

Tasks where you need judgment about scope

If the task is "improve the architecture of the data layer," a subagent will do something — but it might not do the right thing, because the scope of "improve" requires understanding what the architecture should become, which requires context you haven't written down. Underspecified tasks produce confidently wrong outputs. Either specify the task properly before spawning, or do it in the main context where you can course-correct interactively.

Designing Context Boundaries Intentionally

A context boundary is a decision about what information a task needs to do its job. Designing that boundary means asking:

What files does this task need to read? (Specify them explicitly in the prompt.)
What information from the main task does this subagent need? (Write it in the prompt. Don't assume it has access to anything else.)
What does this subagent produce? (Define the output format in the prompt. Structured output — JSON, a list, a table — is easier to integrate than prose.)
Does this subagent need to write files, or just return information? (Prefer return-only for research tasks; prefer write-only for execution tasks where the output is a file change.)

The prompt IS the context boundary. Writing a vague prompt produces a vague context boundary, which means the agent will make assumptions about what's in scope — and those assumptions will often be wrong.
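Those four questions map naturally onto a small data structure. The sketch below is one hypothetical way to make the boundary explicit before any agent is spawned. SubagentSpec and renderPrompt are illustrative names, not an existing API.

```typescript
// Hypothetical spec with one field per design question.
interface SubagentSpec {
  task: string;
  files: string[];      // What files does this task need to read?
  context: string[];    // What does it need to know from the main task?
  outputFormat: string; // What does it produce?
  mayWrite: boolean;    // Does it write files, or only return information?
}

// Render the spec into a prompt. An empty file list is rejected so that a
// vague boundary fails loudly instead of producing a vague prompt.
function renderPrompt(spec: SubagentSpec): string {
  if (spec.files.length === 0) {
    throw new Error("Specify the files this task needs explicitly.");
  }
  return [
    spec.task,
    `Read only these files: ${spec.files.join(", ")}.`,
    ...spec.context.map((item) => `Context: ${item}`),
    `Return: ${spec.outputFormat}`,
    spec.mayWrite ? "You may edit the files listed above." : "Do not make any edits.",
  ].join("\n");
}
```

Forcing every field to be filled in is the point: anything you cannot write down here is something the subagent would otherwise have to guess.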

The Responsibility Problem

Here's the failure mode the employee metaphor produces: you treat the agent as responsible for the outcome, instead of treating yourself as responsible for the context you gave it.

When an agent produces a bad result, the instinct is "the agent failed." The more useful analysis is "what was missing from the context?" or "what was the context contaminated with?" Sampling noise aside, the agent's output is a function of its input: it does what it can with what it has. If what it has is wrong or incomplete, the output reflects that.

This isn't an excuse for agent errors — it's a design principle. When you design a multi-agent system, you're designing information flow. Which agent knows what, when, and in what form. The agent isn't a colleague you brief in a hallway conversation and trust to fill in the gaps. It's a function: garbage in, garbage out, and more input isn't always better input.

The developers building the best agentic workflows I've seen treat every subagent prompt like a function signature: explicit inputs, explicit output contract, no hidden dependencies. The context boundary is the interface. The agent is the implementation.
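To make the function-signature analogy concrete, here is a sketch of enforcing the output half of the contract. It assumes the subagent was asked to return its findings as a JSON array. The Finding shape and parseFindings are hypothetical names for illustration.

```typescript
// Hypothetical output contract for an audit subagent.
interface Finding {
  filePath: string;
  line: number;
  currentValue: string;
  suggestedToken: string;
}

// Validate the raw text a subagent returned against the contract.
// Anything that does not match is rejected rather than silently integrated.
function parseFindings(raw: string): Finding[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) {
    throw new Error("Contract violation: expected a JSON array of findings");
  }
  return data.map((item: unknown, index: number): Finding => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`Contract violation: finding ${index} is not an object`);
    }
    const { filePath, line, currentValue, suggestedToken } = item as Record<string, unknown>;
    if (
      typeof filePath !== "string" ||
      typeof line !== "number" ||
      typeof currentValue !== "string" ||
      typeof suggestedToken !== "string"
    ) {
      throw new Error(`Contract violation: finding ${index} has missing or mistyped fields`);
    }
    return { filePath, line, currentValue, suggestedToken };
  });
}
```

A rejected output is a signal to fix the prompt, the same way a type error is a signal to fix the caller.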

That's not how you manage an employee. It's how you write a compiler pass.
