Multi-agent architecture playbook for enterprise AI
Three orchestration patterns for enterprise multi-agent systems (hierarchical, peer-to-peer, broker-mediated) carry materially different governance properties. The choice is not a free architectural decision under EU AI Act Article 9; broker-mediated is the 2026 default for high-risk deployments.
Holding · reviewed 26 Apr 2026 · next review +90d

In 2025, most production enterprise agentic AI was single-agent: one agent, one user, one session. In 2026, multi-agent deployments are reaching production scale in research, customer-service, and code-generation workflows. The architectural choices these deployments make vary widely, and the governance consequences of those choices are not yet broadly understood.
What follows is a working playbook for choosing and implementing multi-agent architectures in enterprise contexts: the three orchestration patterns and their governance properties, the threat-surface implications of each, and the 2026 defaults for matching pattern to deployment risk tier.
What counts as multi-agent
A multi-agent system, for governance purposes, is any production deployment in which two or more autonomous agents share state, share context, or coordinate on tasks. The threshold is the inter-agent communication path: if information flows from one agent’s output to another agent’s context window, memory, or tool inputs, the deployment is multi-agent and is subject to the multi-agent threat surface.
This definition includes some deployments enterprises do not intuitively classify as multi-agent. A workflow where Copilot summarises a document and ChatGPT-on-the-side reviews the summary is multi-agent across vendor boundaries. A workflow where a research agent retrieves content and a writing agent drafts a brief is multi-agent within the same vendor. A workflow where one user prompts and the agent’s tool calls invoke other agents on the platform is multi-agent at deployment time.
The definition excludes some deployments enterprises do classify as multi-agent. A single agent that calls multiple tools is not multi-agent; the tools are tools, not agents. A deployment with multiple users each running independent agent sessions is not multi-agent unless the sessions share state. A workflow with humans handing off to agents at named points is human-in-the-loop, not agent-to-agent.
The boundary matters because the multi-agent threat surface (cross-agent prompt injection, agent-communication poisoning, rogue-agent containment) is structurally different from the single-agent threat surface and requires distinct controls.
The three orchestration patterns
Enterprise multi-agent architectures resolve to three patterns. The distinctions are operational rather than theoretical; each pattern has materially different governance properties.
Hierarchical orchestration
A single orchestrator agent receives the user request, decomposes it into subtasks, and delegates the subtasks to worker agents. The orchestrator integrates the results and produces the response. Worker agents do not communicate directly with each other; all inter-agent flow goes through the orchestrator.
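The decompose-delegate-integrate loop can be sketched as follows. This is a minimal illustration, not any vendor's API: the worker registry, the stub decomposition, and the string-joining integration are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Orchestrator:
    # Hypothetical worker registry: task kind -> callable(payload) -> result.
    workers: dict
    decision_log: list = field(default_factory=list)

    def handle(self, request: str) -> str:
        results = []
        for kind, payload in self.decompose(request):
            # All inter-agent flow passes through this point: the
            # orchestrator is the single chokepoint for logging and
            # for interposing control gates.
            self.decision_log.append(("delegate", kind, payload))
            results.append(self.workers[kind](payload))
        return self.integrate(results)

    def decompose(self, request: str):
        # Stub decomposition: one subtask per registered worker.
        return [(kind, request) for kind in self.workers]

    def integrate(self, results):
        return " | ".join(results)
```

Because workers never talk to each other directly, the `decision_log` built inside `handle` is already a complete record of the inter-agent path.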
Microsoft 365 Copilot’s per-app agent orchestration is hierarchical: the Copilot orchestrator manages connections to per-app agents (Outlook agent, SharePoint agent, Teams agent). Most LangChain / LangGraph deployments default to hierarchical. Anthropic’s structured-prompt orchestration patterns are hierarchical.
Governance properties. Accountability concentrates at the orchestrator: when something goes wrong, the orchestrator’s decision log is the primary forensic source. The audit substrate is the easiest of the three patterns to assemble because the orchestrator already logs the worker-agent invocations as tool calls. The orchestrator is the single point at which control gates can be enforced: action-class approval, drift monitoring, throughput limits all interpose at the orchestrator.
Threat-surface properties. The orchestrator is the single point of compromise. An attacker who manipulates the orchestrator (via prompt injection in the user input, via memory poisoning of the orchestrator’s persistent state) controls the entire system. The cross-agent prompt-injection surface is small relative to peer-to-peer because there are fewer inter-agent paths, but the consequences of a successful attack are larger because compromise propagates downward to all workers.
When to use. Medium-risk deployments where audit ease and operational simplicity outweigh resilience. The pattern’s strength is observability; its weakness is single-point-of-failure at the orchestrator.
Peer-to-peer orchestration
Agents communicate directly with each other without a central coordinator. Each agent has a model of which other agents to consult for which tasks; coordination emerges from the bilateral interactions rather than from a top-down plan.
Peer-to-peer patterns appear mostly in research and experimentation contexts (Microsoft Research's AutoGen, multi-agent simulation environments, academic agentic AI work). Production peer-to-peer deployments at enterprise scale remain rare in 2026.
Governance properties. Accountability distributes across the agent network. When something goes wrong, identifying the responsible agent and therefore the responsible human deployment owner often requires reconstructing the full inter-agent reasoning trace. The audit substrate is the hardest of the three patterns to assemble because there is no central log of inter-agent communication; deployment-layer instrumentation is required.
Threat-surface properties. The cross-agent prompt-injection surface is large because there are many inter-agent paths and no chokepoint at which to interpose detection. Compromise of one agent can propagate through the network with no architectural barrier. Failure attribution is structurally hard; when the system produces a bad outcome, identifying which agent’s contribution was decisive often requires forensic-quality trace reconstruction.
When to use. Research and experimentation. Production deployments above the low-risk threshold should not adopt peer-to-peer in 2026 unless the audit substrate is materially stronger than vendor-native baseline. The pattern’s strength is resilience and emergent coordination; its weakness is fundamental opacity.
Broker-mediated orchestration
Every inter-agent message routes through a centralised broker. The broker enforces policy (which agents can talk to which), logs the message (with provenance, trust level, content), and applies any required transformations (sanitisation, scope reduction) before forwarding. Agents operate as semi-autonomous principals; the broker is the communication infrastructure rather than the coordinator.
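A minimal broker sketch, under stated assumptions: the policy is a set of allowed (source, destination) pairs, sanitisation is a caller-supplied transformation, and the log-entry field names (source, dest, provenance, trust) mirror the properties described above rather than any standard schema.

```python
import time


class Broker:
    """Sketch of a broker: policy check, logged forwarding, sanitisation."""

    def __init__(self, policy, sanitise=lambda m: m):
        self.policy = policy          # allowed (source, dest) pairs
        self.sanitise = sanitise      # transformation applied before forwarding
        self.message_log = []         # the audit substrate, produced natively

    def route(self, source, dest, content, provenance, trust_level, deliver):
        # Policy enforcement: which agents may talk to which.
        if (source, dest) not in self.policy:
            raise PermissionError(f"{source} -> {dest} not permitted by policy")
        clean = self.sanitise(content)
        # Every inter-agent message produces a log entry with provenance
        # and trust level, before it is forwarded.
        self.message_log.append({
            "ts": time.time(), "source": source, "dest": dest,
            "content": clean, "provenance": provenance, "trust": trust_level,
        })
        deliver(dest, clean)
```

The key design property is that `message_log` is populated as a byproduct of routing, so the audit substrate cannot drift out of sync with what the agents actually said to each other.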
Broker-mediated patterns are emerging in 2026 as the principled approach to multi-agent governance. Anthropic’s Managed Agents support broker-style isolation primitives. The MCP (Model Context Protocol) ecosystem is enabling broker-mediated patterns by standardising the inter-agent message format. The A2A protocol (claim AM-050) would make broker-mediated patterns easier to implement portably across vendors.
Governance properties. Accountability is logged at the broker rather than concentrated at one agent or distributed across many. The audit substrate is produced natively by the broker’s message log: every inter-agent communication is captured with timestamps, source-agent identity, destination-agent identity, content provenance, and trust level. Failure attribution is structurally easy because the broker log is the single forensic source.
Threat-surface properties. The broker is the single point at which the architectural separation between content-ingest and tool-execution privileges (claim AM-045) can be enforced for the cross-agent path. Cross-agent prompt injection attacks must propagate through the broker, where they are detectable. The broker is also a single point of failure (operational risk) and a performance bottleneck (engineering risk); the recommendation is to invest in the broker’s reliability accordingly.
When to use. High-risk deployments. The recommended 2026 default for any deployment classified as high-risk under the EU AI Act Annex III taxonomy or under internal high-risk criteria.
The pattern selection table
| Risk tier | Recommended pattern | Acceptable alternatives | Avoid |
|---|---|---|---|
| Low-risk (internal productivity, no externally-affected populations) | Hierarchical | Peer-to-peer for research; broker-mediated (typically overkill) | — |
| Medium-risk (some external population effect, contained business impact) | Hierarchical | Broker-mediated where audit substrate matters | Peer-to-peer in production |
| High-risk (Annex III, regulated sector, material business impact) | Broker-mediated | Hierarchical with strong audit substrate extension | Peer-to-peer |
The recommendations are defaults, not absolutes. An enterprise with a strong reason to depart from the default (specific vendor constraints, specific deployment characteristics, specific operational maturity) can do so, but the rationale for the departure should be documented in the deployment’s risk-management documentation per Article 9.
Per-pattern audit substrate requirements
Each pattern requires the 14-field audit substrate (claim AM-046) to operate across the inter-agent path. The implementation differs.
Hierarchical audit substrate. The orchestrator’s decision log is the primary substrate. Worker-agent invocations are logged as tool calls in the orchestrator’s audit chain (field 12 of the 14-field template). The provenance field (field 6) captures which worker produced which input to the orchestrator. The output disclosure surface field (field 13) captures the final action the orchestrator takes. Implementation cost is low because the orchestrator’s natural logging covers most fields.
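A partial sketch of the orchestrator-side entry described above. Only the fields the text names are shown (provenance is field 6, the tool-call chain is field 12, the output disclosure surface is field 13); the full 14-field template lives in the referenced piece (claim AM-046) and is not reproduced here, and the `worker_calls` record shape is an assumption for illustration.

```python
def orchestrator_audit_entry(worker_calls, final_action):
    """Build the named subset of the 14-field entry from the
    orchestrator's natural log of worker invocations."""
    return {
        # Field 6: which worker produced which input to the orchestrator.
        "provenance": {c["worker"]: c["output_ref"] for c in worker_calls},
        # Field 12: worker invocations recorded as tool calls.
        "audit_chain": [(c["worker"], c["input"]) for c in worker_calls],
        # Field 13: the final action the orchestrator takes.
        "output_disclosure_surface": final_action,
    }
```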
Peer-to-peer audit substrate. Deployment-layer instrumentation is required because no agent in the system natively logs the full inter-agent path. The implementation typically involves a sidecar audit service that intercepts inter-agent communications and reconstructs the message graph. Implementation cost is high; the audit substrate often falls short of the 14-field minimum without significant engineering investment.
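The sidecar approach can be sketched as an observe-only interceptor that accumulates edges and then walks them to reconstruct a trace. This is illustrative only; a production sidecar would sit at the network layer and handle cycles, retention, and tamper-evidence, none of which are shown here.

```python
from collections import defaultdict


class SidecarAudit:
    """Sketch: intercept inter-agent messages, rebuild the message graph."""

    def __init__(self):
        self.edges = defaultdict(list)   # source -> [(dest, content)]

    def intercept(self, source, dest, content):
        # Observe and pass through; the sidecar does not block traffic.
        self.edges[source].append((dest, content))
        return content

    def trace_from(self, agent, depth=10):
        """Reconstruct the message trace reachable from one agent."""
        seen, frontier, trace = set(), [agent], []
        for _ in range(depth):
            nxt = []
            for a in frontier:
                if a in seen:
                    continue
                seen.add(a)
                for dest, content in self.edges.get(a, []):
                    trace.append((a, dest, content))
                    nxt.append(dest)
            frontier = nxt
        return trace
```

`trace_from` is exactly the "forensic-quality trace reconstruction" cost the pattern imposes: it has to be run after the fact, against whatever the sidecar managed to capture.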
Broker-mediated audit substrate. The broker’s message log is the primary substrate. Every inter-agent message produces a log entry with all 14 fields populated (the broker has access to source identity, destination identity, content with provenance, planned vs executed disposition, approval references, output disclosure routing). Implementation cost is moderate at deployment time and low at operational time because the substrate is produced natively.
The audit-substrate requirement is the strongest argument for broker-mediated patterns in high-risk contexts: the substrate is a precondition for Article 12 compliance, and broker-mediated patterns produce it as a byproduct of the architectural choice.
Per-pattern threat-mitigation requirements
The seven-control surface from the OWASP Agentic AI Top 10 walkthrough (claim AM-043) applies to multi-agent systems with pattern-specific implementation.
Scoped non-human identity (control 1). Each agent in the system has its own NHI; the inter-agent communication path uses the source agent’s identity for forwarded messages. Hierarchical: the orchestrator’s identity scopes worker invocations. Peer-to-peer: each pairwise communication uses the originating agent’s identity. Broker-mediated: the broker can enforce identity verification on every inter-agent message.
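One way a broker can verify source identity on every message is per-agent message signing. The sketch below uses HMAC with per-agent secrets; the key-distribution scheme and key names are assumptions for illustration, not a description of any vendor's NHI implementation.

```python
import hmac
import hashlib

# Hypothetical per-agent NHI secrets held by the identity layer.
AGENT_KEYS = {"research": b"k-research", "writer": b"k-writer"}


def sign(source: str, content: str) -> str:
    """The source agent signs its outbound message with its own NHI key."""
    return hmac.new(AGENT_KEYS[source], content.encode(),
                    hashlib.sha256).hexdigest()


def broker_verify(source: str, content: str, signature: str) -> bool:
    """The broker checks the claimed source actually signed the message."""
    expected = hmac.new(AGENT_KEYS[source], content.encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

A message signed with one agent's key fails verification when attributed to another agent, which is the property that makes identity spoofing on the inter-agent path detectable at the broker.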
Action-class approval gates (control 2). Approval gates apply to consequential actions across the multi-agent path, not just within a single agent. Hierarchical: the orchestrator gates worker actions with consequential effect. Peer-to-peer: each agent gates its own consequential actions but the multi-agent system can produce consequential effects through actions that are individually low-impact (the Klarna pattern, claim AM-044, can manifest in multi-agent systems through the cumulative effect of many small actions). Broker-mediated: the broker can apply approval-gate policy to message classes, not just action classes.
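The cumulative-effect concern can be made concrete with a gate that tracks a running impact budget. The thresholds and the idea of a scalar impact score are assumptions for the sketch; real deployments would score impact per action class.

```python
class CumulativeGate:
    """Sketch: approval gate that fires on a single consequential action
    OR on the cumulative effect of many individually small actions."""

    def __init__(self, per_action_limit: float, cumulative_limit: float):
        self.per_action_limit = per_action_limit
        self.cumulative_limit = cumulative_limit
        self.total = 0.0

    def requires_approval(self, impact: float) -> bool:
        self.total += impact
        # Trigger on either condition: one large action, or drift past
        # the cumulative budget even when every action was small.
        return (impact > self.per_action_limit
                or self.total > self.cumulative_limit)
```

A per-action-only gate would never fire on the second condition, which is precisely how individually low-impact actions can accumulate into a consequential system-level effect.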
MTTD-for-Agents detection (control 4). Detection-time targets apply to multi-agent failures, including failures that manifest only in the inter-agent path. Hierarchical: orchestrator-layer detection covers most failures. Peer-to-peer: detection requires reconstructing the inter-agent reasoning across many sessions; MTTD targets are typically harder to meet. Broker-mediated: broker-layer detection covers cross-agent prompt injection (claim AM-045) natively because every potentially-malicious inter-agent message routes through the broker.
Behavioural drift monitoring (control 6). Multi-agent systems can produce emergent behaviours that no individual agent exhibits. Drift monitoring applies to the system-level output, not just to per-agent outputs. The control’s implementation in multi-agent contexts requires a system-level evaluation harness, not just per-agent evaluation.
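A system-level drift check can be as simple as comparing the rate of a flagged system-level outcome against a baseline window. The metric (rate of one outcome label) and the threshold are assumptions for the sketch; the point is that the labels describe the composed system's outputs, not any single agent's.

```python
def drift_score(baseline_labels, current_labels, flagged="escalate"):
    """Absolute change in the rate of a flagged system-level outcome
    between a baseline window and the current window."""
    def rate(labels):
        return labels.count(flagged) / len(labels) if labels else 0.0
    return abs(rate(current_labels) - rate(baseline_labels))


def drifted(baseline_labels, current_labels, threshold=0.1):
    # Fire when the system-level outcome distribution has shifted more
    # than the threshold, regardless of which agent caused the shift.
    return drift_score(baseline_labels, current_labels) > threshold
```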
Vendor pattern coverage
| Vendor / platform | Hierarchical | Peer-to-peer | Broker-mediated |
|---|---|---|---|
| Microsoft 365 Copilot | native | not supported | partial (post-EchoLeak hardening) |
| Anthropic Managed Agents | native | discouraged | native (context-isolation primitives) |
| OpenAI Operator + Assistants | native | partial | partial |
| Google Gemini | native | not supported | partial |
| LangChain / LangGraph | native | partial | partial (custom implementation) |
As of April 2026, vendor coverage favours hierarchical patterns: every vendor supports them natively. Broker-mediated support is uneven; Anthropic's Managed Agents have the strongest native support, and other vendors require deployment-layer implementation to achieve broker-mediated behaviour. Peer-to-peer is generally either unsupported or explicitly discouraged in vendor documentation.
What this playbook does NOT cover
The playbook addresses architectural pattern selection and the resulting governance properties. It does not address:
- Specific multi-agent frameworks (LangGraph, AutoGen, CrewAI, etc.). Framework selection is downstream of pattern selection; the same pattern can be implemented in multiple frameworks.
- Inter-agent message format standards (MCP, A2A protocol). These affect implementation portability rather than the pattern's governance properties. The A2A protocol piece (claim AM-050) covers them in detail.
- Multi-tenant multi-agent architectures. Enterprise SaaS contexts where multiple tenants’ multi-agent systems share infrastructure introduce additional isolation requirements not covered here.
- Federated multi-agent systems across organisational boundaries. B2B agent-to-agent coordination across legal entities is an emerging pattern with regulatory implications (data-residency, contract-binding, liability allocation) that warrant separate treatment.
The full state of enterprise agentic AI is at /state-of-enterprise-agentic-ai/ (claim AM-040). The cross-agent prompt-injection class that drives the broker-mediated recommendation is at /echoleak-cross-agent-prompt-injection/ (claim AM-045). The 14-field audit-evidence template that operationalises Article 12 across multi-agent paths is at /eu-ai-act-article-12-audit-evidence/ (claim AM-046).
The pattern is not just an architectural decision in 2026. It is the decision that determines whether the deployment is auditable, defensible, and compliant. Choose accordingly.