AI agent architecture is the structural design that decides how a model reasons, calls tools, and coordinates with other agents, and where that whole process can be observed, gated, and audited. Most published guides stop at the first half. They diagram the control flow and leave governance as a closing paragraph. That order is backwards for anyone shipping agents into an organisation, because the architecture choice determines how hard the governance is, not the other way round.

This guide is a taxonomy of the six patterns that cover almost every production agent system: ReAct, plan-and-execute, orchestrator-worker, hierarchical or supervisor, pipeline or sequential, and event-driven. For each one it answers two questions a staff engineer actually has: when is this the right pattern, and where do the four governance controls (permission boundary, audit log, human-in-the-loop checkpoint, observability) physically sit. It does not teach you to build an agent line by line. For that, the companion guide on how to build a custom Claude agent covers SDK setup and working code. This guide is the layer above: the decision about which shape to build before you write it.

The six patterns at a glance

Start with the answer. The table below maps each pattern to its best-fit use case, its dominant failure mode, and the single governance enforcement point that matters most for it. The patterns are not mutually exclusive: real systems compose them, and a hierarchical system is usually orchestrator-worker repeated across tiers. But naming them separately is what makes the trade-offs legible.

Pattern Control flow Best for Dominant failure mode Governance pressure point
ReAct One agent loops: reason, act, observe Open-ended single-agent tasks, unknown path Loops, runaway tool calls Per-call permission check and rate limit
Plan-and-execute Plan first, then execute the plan Long multi-step tasks needing an inspectable plan Stale plan, no replanning Human review of the plan before execution
Orchestrator-worker Central agent decomposes and delegates Genuinely specialised subtasks Orchestrator becomes a bottleneck and single trust point Scope check on each worker's tools
Hierarchical / supervisor Tiers of agents delegate downward Large problems with nested decomposition Lost accountability across tiers Identity and audit trail per tier
Pipeline / sequential Fixed linear chain, output to input Deterministic, repeatable workflows Rigidity, no recovery on bad input Gate checks between stages
Event-driven Agents react to and emit messages Independent scaling, external triggers Lost messages, no global view Audit log as the system of record

The rest of this guide works through each row, then pulls the governance columns into a single enforcement model that applies regardless of which pattern you pick.

Workflows versus agents: the distinction that frames every pattern

Before the patterns, one distinction does more work than any other. Anthropic's engineering team draws it cleanly in Building Effective Agents: workflows are "systems where LLMs and tools are orchestrated through predefined code paths," while agents are "systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks."

That line runs straight through the taxonomy. Pipeline and sequential designs are workflows: the control flow is fixed in code, and the model fills in steps. ReAct and the autonomous end of orchestrator-worker are agents: the model decides what to do next. Plan-and-execute sits between them, a planned workflow with an agentic execution phase. The distinction matters for governance because workflows are easier to constrain (you know the path in advance) and agents are harder (the path is discovered at runtime), which is exactly why the autonomous patterns need the heaviest enforcement.

Both Anthropic and Microsoft give the same operating advice: climb the complexity ladder slowly. Microsoft's Azure Architecture Center frames it as direct model call, then single agent with tools, then multi-agent orchestration, and states the rule plainly: "Use the lowest level of complexity that reliably meets your requirements." Most teams reach for multi-agent too early. A single well-instructed agent with good tools beats a five-agent committee for most tasks, and it has one identity to govern instead of five. The decision of whether you even need multiple agents is upstream of choosing a multi-agent pattern, and the guide on Claude skills versus agents versus MCP covers how those building blocks differ before you commit to a topology.

Pattern 1: ReAct, the reasoning-and-acting loop

ReAct is the default single-agent pattern and the substrate most others build on. It comes from the 2022 paper ReAct: Synergizing Reasoning and Acting in Language Models by Yao and colleagues, which showed that interleaving reasoning traces with actions beats doing either alone. The model thinks, takes an action (a tool call), observes the result, and thinks again, looping until it reaches an answer. As the paper puts it, "reasoning traces help the model induce, track, and update action plans," while "actions allow it to interface with external sources to gather additional information."

In practice this is the loop any agent SDK runs for you: send the prompt and tool schemas, receive either a final answer or a tool call, execute the tool, feed the result back, repeat. Google Cloud lists it directly as the "ReAct" pattern of "thought, action, observation cycles." It is the right choice when the path to the answer is unknown and the task fits one agent's competence: research over an API, a support agent that queries several systems, a coding assistant that reads files and runs commands.

ReAct's weakness is also its defining trait. Because the model decides each next step, nothing stops it from looping, calling the same tool repeatedly on ambiguous results, or running up cost. Every production ReAct deployment needs a turn cap and explicit failure semantics in tool return values.

Where governance sits in ReAct

ReAct concentrates the governance load on the individual tool call, because the tool call is the only point where the agent touches the outside world. Four controls apply at that boundary:

  • Permission boundary. Before each tool executes, check whether this agent's identity is allowed this tool and this resource. ReAct's autonomy means you cannot rely on the prompt to constrain behaviour; the deny has to happen at the call site.
  • Rate and budget limits. The loop is the failure mode, so a per-identity cap on calls, cost, or wall-clock time is not optional. This is the control that bounds a runaway agent.
  • Audit log. Each thought-action-observation cycle should emit a structured record: the tool, the arguments, the result size, and the decision. Without it, debugging a misbehaving loop is guesswork.
  • Observability. Lifecycle hooks around tool start and end give you the per-step trace. The companion build guide shows the concrete observability hooks an SDK exposes for exactly this.

Pattern 2: Plan-and-execute, separating thinking from doing

Plan-and-execute splits the agent into two phases: first produce an explicit plan that decomposes the task into ordered subtasks, then carry out those subtasks. The pattern traces to Plan-and-Solve Prompting by Wang and colleagues, presented at ACL 2023, which framed it as "devising a plan to divide the entire task into smaller subtasks, and then carrying out the subtasks according to the plan." The motivation was concrete: zero-shot chain-of-thought drops steps on long tasks, and forcing an explicit plan first reduces missing-step errors.

The advantage for production systems is not just accuracy. A plan is an artefact you can inspect, log, and approve before any tool runs. That makes plan-and-execute the natural fit for long, expensive, or sensitive tasks: a migration agent that will touch a database, a financial workflow, anything where you want a human to see the intended steps before execution begins. It also reduces model calls compared with ReAct on long tasks, because the agent is not re-deriving the whole plan on every turn.

The failure mode is the stale plan. If the world changes during execution (a step fails, an assumption breaks), a naive plan-and-execute agent ploughs ahead. Robust implementations add a replanning step that revisits the plan when a subtask fails, which is where the pattern starts to blend back into the agentic loop.

Where governance sits in plan-and-execute

This pattern has the single best governance property in the taxonomy: a natural checkpoint. The boundary between the plan and the execution is the obvious place to insert a human-in-the-loop gate.

  • Human-in-the-loop on the plan. Pause after planning and require approval before execution for high-risk tasks. Google Cloud's human-in-the-loop pattern describes exactly this: "integrates points for human intervention directly into an agent's workflow," pausing for approval at critical checkpoints. The plan boundary is where that intervention costs the least and prevents the most.
  • Audit log of the plan itself. The plan is a first-class record. Logging it gives you a clear statement of intent to compare against what actually happened.
  • Permission boundary at execution. Approval of a plan is not approval of arbitrary tool calls. Each step in the execution phase still passes the same per-call scope check ReAct needs.
  • Observability across replanning. When a plan is revised mid-flight, the trace must capture the revision, or the audit trail will not explain why the executed steps diverged from the approved plan.

Pattern 3: Orchestrator-worker, central decomposition

Orchestrator-worker introduces a second agent. A central orchestrator dynamically decomposes the task and delegates subtasks to worker agents, then synthesises their results. Anthropic names it directly as one of its core patterns: "a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results." The key word is dynamically. Unlike a fixed pipeline, the orchestrator decides at runtime how many workers to spawn and what each one does, which makes it suited to tasks whose shape is not known in advance, such as a research task that fans out into a variable number of sub-investigations.

AWS describes the same shape in its agentic patterns guidance as subagent delegation, one of the agentic workflow patterns it says "promote scalable, composable, and auditable AI architectures." The multi-agent conversation framing in the AutoGen paper by Wu and colleagues formalises the underlying mechanism: customisable agents that coordinate through structured conversation.

The pattern's strength is focus. Each worker has a narrow remit, a small tool set, and instructions undiluted by unrelated concerns, which makes workers easier to test and improve. Its weakness is the orchestrator. It is a single point of failure, a latency bottleneck (everything routes through it), and, for governance, a single concentrated trust point.

Where governance sits in orchestrator-worker

The architecture splits the governance problem into two layers, and the worker layer is where the leverage is.

  • Per-worker permission scope. This is the pattern's defining governance move. Each worker should hold only the tools its remit requires, enforced as least privilege per agent. Microsoft's guidance is explicit that "security trimming must be implemented in every agent" rather than trusting the orchestrator to police access. A worker that only summarises should not hold a worker that can delete.
  • Identity per agent. Workers need distinct machine identities so the audit log can attribute every action to the specific agent that took it, not to a single shared service account.
  • Audit at the synthesis boundary. Log what each worker returned and what the orchestrator did with it. Synthesis is where a worker's output influences a decision, and where a prompt-injection payload from one worker can reach the orchestrator.
  • Observability across the fan-out. A single user request becomes many tool calls across many agents. Tracing has to follow the request through the fan-out, or you lose the ability to reconstruct what happened.

Pattern 4: Hierarchical and supervisor architectures

Hierarchical architectures generalise orchestrator-worker across more than one level. A top-level supervisor decomposes a problem and delegates to mid-level agents, which decompose further and delegate to specialists. Google Cloud lists this as two related patterns: a "coordinator" that decomposes and dispatches, and "hierarchical task decomposition," a "multi-level hierarchy" where "parent agents decompose and delegate to lower levels." It is the right shape when a problem has genuine nested structure, such as a software project broken into subsystems, each broken into modules.

The appeal is that it mirrors how organisations already divide labour, which makes the decomposition intuitive to design. The risk is the same one organisations have: accountability dissolves as it travels down the tree. When a specialist three levels down takes a harmful action, the question "who authorised this?" should still have a crisp answer, and in a sloppily built hierarchy it does not. This pattern is also the one most likely to be over-applied. Many problems that look hierarchical are better served by a flat orchestrator-worker design with one level of delegation.

For teams running several agents that collaborate as peers rather than strict tiers, the related coordination concerns (shared context, handoffs, who owns which task) are covered in the guide on Claude Code agent teams.

Where governance sits in hierarchical architectures

Depth is the enemy of accountability, so the governance controls here are about preserving a chain of custody as authority flows downward.

  • Identity and audit per tier. Every level needs its own identity and its own audit records. The audit trail has to let you walk from a leaf action back up to the supervisor that ultimately authorised the branch.
  • Permission boundaries that tighten downward. Lower tiers should hold strictly narrower permissions than their parents. A specialist should never be able to do something its supervisor could not, or delegation has become privilege escalation.
  • Human-in-the-loop at the top tier. Approval gates belong where authority concentrates. Gating the supervisor's high-level plan is more tractable than trying to review every leaf action.
  • Observability of the full tree. Without a trace that spans every tier, a hierarchical system is a black box wrapped in more black boxes. The whole call tree for one request must be reconstructable.

Pattern 5: Pipeline and sequential, the deterministic chain

Pipeline or sequential architecture chains agents in a fixed, linear order, each one processing the previous agent's output. Microsoft describes it as a design that "chains AI agents in a predefined, linear order," forming "a pipeline of specialized transformations" where the next agent is deterministic rather than an agent's choice. Anthropic's "prompt chaining" is the same idea: decompose a task into a fixed sequence, with programmatic checks possible between steps.

This is a workflow, not an agent, and that is the point. Because the path is fixed in code, the system is predictable, testable, and cheap to reason about. It is the correct default for any process whose steps are known and stable: extract, then validate, then transform, then summarise. You give up adaptability (the chain cannot reroute itself around a bad input) in exchange for determinism, which in regulated or high-volume contexts is usually the right trade.

The failure mode is rigidity. If stage two receives input it cannot handle, a naive pipeline passes the failure downstream. The fix is the same mechanism that makes pipelines so governable: gate checks between stages.

Where governance sits in pipelines

Pipelines are the easiest pattern to govern, because the seams are visible and fixed.

  • Gate checks between stages. Each boundary between agents is a natural validation and policy point. Anthropic explicitly notes the value of "programmatic checks (see 'gate' in the diagram) on intermediate steps." A gate can validate output, scan for sensitive data, and stop the chain before a bad value propagates.
  • Deterministic audit. Because the stages are fixed, the audit log has a known shape. You can assert that a compliant run touched exactly these stages in this order, which is the kind of evidence an auditor wants.
  • Permission boundary per stage. Each agent in the chain still calls tools and still needs least-privilege scoping. The fixed order makes it easy to specify exactly which tools each stage may use.
  • Observability as stage timing. A linear chain makes per-stage latency and error rates trivial to observe, which is also how you find the stage that is degrading.

Pattern 6: Event-driven and reactive systems

Event-driven architecture decouples agents entirely. Instead of one agent calling another, agents publish and subscribe to events on a message bus: an orchestrator emits a task, workers react when they see one they can handle, and results come back as further events. AWS connects this directly to agentic design in its prescriptive guidance, pairing event-driven architecture with agentic patterns to build agents that "operate autonomously while remaining controllable." Confluent's engineering write-up on event-driven multi-agent systems catalogues the variants, including blackboard and market-based coordination.

This is the pattern for scale and integration. Agents can be deployed, scaled, and updated independently, because they share no direct call graph. It suits systems where agents must react to external triggers (a webhook, a queue, a change in a database) and where you need the resilience of asynchronous messaging. The cost is that there is no single place that knows what the whole system is doing at any moment. State is distributed across the event log, and reasoning about end-to-end behaviour is genuinely harder.

Where governance sits in event-driven systems

The decoupling that makes event-driven systems scalable also removes the obvious chokepoint where you would otherwise enforce policy. Governance has to move to the bus.

  • The audit log becomes the system of record. In a system with no central controller, the event log is the only complete account of what happened. It is no longer just an audit feature; it is the source of truth, which means its structure and retention are load-bearing.
  • Policy on publish and subscribe. The permission boundary moves to the messaging layer: which agent identities may emit which event types, and which may consume them. An unrestricted bus is an unrestricted system.
  • Human-in-the-loop as a gated event type. High-risk actions become events that require a human-approval event before a consumer acts on them, rather than an inline pause. The checkpoint is asynchronous, but it is still a checkpoint.
  • Observability through correlation IDs. Because there is no call stack, tracing depends on a correlation identifier carried on every event so a single logical request can be reconstructed from scattered messages.

Where governance sits, unified across patterns

Step back from the individual patterns and the same four controls recur every time, just relocated. The architecture decides where each control physically lives, not whether you need it. This is the part the mainstream architecture guides under-serve, and it is the part that determines whether an agent system is deployable inside an organisation that has to answer to a security team.

Enforcement point ReAct Plan-and-execute Orchestrator-worker Hierarchical Pipeline Event-driven
Permission boundary Per tool call Per execution step Per worker scope Tightens per tier Per stage On publish / subscribe
Audit logging Per loop cycle Plan plus steps At synthesis Per tier, full tree Deterministic per stage The system of record
Human-in-the-loop Inline tool approval Approve the plan Approve delegation Gate the top tier Gate between stages Approval event
Observability Step hooks Across replanning Across fan-out Across the tree Stage timing Correlation IDs

Reading the table across rows exposes the design principle. The permission boundary is always present; it just moves from the tool call to the worker to the message bus. The audit log is always present; in centralised patterns it is a feature, and in event-driven systems it becomes the only truth. Human-in-the-loop always attaches to the point of highest concentrated authority, which is why plan-and-execute and hierarchical designs are the easiest to gate well. Observability always has to span whatever the unit of fan-out is.

That repetition is an argument for putting enforcement somewhere every pattern inherits it, rather than re-coding it inside each agent. Microsoft's guidance lands on the same conclusion from the security side: content-safety guardrails belong "at user input, tool calls, tool responses, and final output," and security trimming must exist "in every agent." If every agent has to re-implement the same four controls, they will drift, and the gaps between implementations are where incidents live. The architectural answer is to enforce at the layer the tool calls travel through, so that the policy is defined once and every client, model, and agent is bound by it.

The four controls as an architecture layer

The cleanest way to hold this is as a layer that sits beneath whichever pattern you choose, between the agents and the tools they call. The diagram below shows the request path and the four enforcement points it passes through.

            ┌─────────────────────────────────────────────┐
   user /   │   AGENT LAYER  (any pattern from this guide) │
   trigger ─▶  ReAct · plan-execute · orchestrator · etc.  │
            └───────────────────────┬─────────────────────┘
                                    │  tool call
            ┌───────────────────────▼─────────────────────┐
            │   GOVERNANCE LAYER  (one policy, all agents) │
            │                                              │
            │   1. Permission boundary  → allow / deny     │
            │   2. Audit log            → structured event │
            │   3. Human-in-the-loop    → pause / approve  │
            │   4. Observability        → trace span       │
            └───────────────────────┬─────────────────────┘
                                    │  permitted call only
            ┌───────────────────────▼─────────────────────┐
            │   TOOL LAYER   APIs · databases · MCP servers│
            └─────────────────────────────────────────────┘

The value of drawing it this way is that the agent layer can be any pattern, or a mix of them, and the governance layer does not change. Swap a ReAct agent for an orchestrator-worker team and the permission checks, audit events, approval gates, and traces are still defined in one place. This is the same reason the choice of agent framework is secondary to the architecture: a framework comparison such as Claude Agent SDK versus LangChain matters for ergonomics, but the enforcement layer is what makes the system governable regardless of which one you pick.

Choosing a pattern for your constraint

Patterns are chosen against constraints, not preferences. The decision is usually settled by whichever of these pressures is tightest for your system.

Optimise for determinism and audit. If the steps are known and you need repeatable, defensible runs, use a pipeline or sequential design. The fixed path is the feature. You give up adaptability you probably do not need.

Optimise for an unknown path. If the agent must figure out the steps at runtime, use ReAct for a single agent or orchestrator-worker if the subtasks are genuinely specialised. Budget for the autonomy with hard turn caps and per-call permission checks.

Optimise for safe execution of risky tasks. If a human must sign off before anything irreversible happens, use plan-and-execute. The plan boundary is the cheapest, highest-leverage approval gate in the taxonomy.

Optimise for nested decomposition. If the problem has real multi-level structure, use a hierarchical design, but be honest about whether one level of orchestrator-worker delegation would do. Most "hierarchical" problems are flat problems with ambition.

Optimise for scale and integration. If agents must scale independently or react to external systems, use event-driven, and accept that your audit log is now the system of record and must be designed as such.

Above all, follow the rule both Anthropic and Microsoft converge on: start with the lowest-complexity pattern that reliably meets the requirement, and add agents only when a single one demonstrably cannot. Complexity in agent systems is not free; every agent you add is another identity to scope, another set of tool calls to audit, and another place a human checkpoint might be needed. The architecture that is easiest to govern is usually the one with the fewest moving parts that still does the job.

Conclusion

Choose the pattern from the constraint, not the constraint from the pattern. Decide where the four governance controls sit before you write the first agent, because retrofitting a permission boundary or an audit trail onto a running multi-agent system is far harder than designing the enforcement layer in from the start. Build the smallest topology that works, give every agent its own scoped identity, and enforce policy at the layer every agent's tool calls pass through. For the implementation that turns one of these patterns into working code, continue with the guide on how to build a custom Claude agent, and for coordinating several agents as a team, see Claude Code agent teams.