Multi-agent architecture playbook for enterprise AI
Three orchestration patterns for enterprise multi-agent systems (hierarchical, peer-to-peer, broker-mediated) carry materially different governance properties. The choice is not a free architectural decision under EU AI Act Article 9; broker-mediated is the 2026 default for high-risk deployments.
Holding · reviewed 26 Apr 2026 · next review +90d

In 2025, most production enterprise agentic AI was single-agent: one agent, one user, one session. In 2026, multi-agent deployments are reaching production scale in research, customer-service, and code-generation workflows. The architectural choices these deployments make vary widely, and the governance consequences of those choices are not yet broadly understood.
What follows is a working playbook for choosing and implementing multi-agent architectures in enterprise contexts: the three orchestration patterns and their governance properties, the threat-surface implications of each, and the 2026 defaults for matching pattern to deployment risk tier.
What counts as multi-agent
A multi-agent system, for governance purposes, is any production deployment in which two or more autonomous agents share state, share context, or coordinate on tasks. The threshold is the inter-agent communication path: if information flows from one agent’s output to another agent’s context window, memory, or tool inputs, the deployment is multi-agent and is subject to the multi-agent threat surface.
This definition includes some deployments enterprises do not intuitively classify as multi-agent. A workflow where Copilot summarises a document and ChatGPT-on-the-side reviews the summary is multi-agent across vendor boundaries. A workflow where a research agent retrieves content and a writing agent drafts a brief is multi-agent within the same vendor. A workflow where one user prompts and the agent’s tool calls invoke other agents on the platform is multi-agent at deployment time.
The definition excludes some deployments enterprises do classify as multi-agent. A single agent that calls multiple tools is not multi-agent; the tools are tools, not agents. A deployment with multiple users each running independent agent sessions is not multi-agent unless the sessions share state. A workflow with humans handing off to agents at named points is human-in-the-loop, not agent-to-agent.
The boundary matters because the multi-agent threat surface (cross-agent prompt injection, agent-communication poisoning, rogue-agent containment) is structurally different from the single-agent threat surface and requires distinct controls.
The three orchestration patterns
Enterprise multi-agent architectures resolve to three patterns. The distinctions are operational rather than theoretical; each pattern has materially different governance properties.
Hierarchical orchestration
A single orchestrator agent receives the user request, decomposes it into subtasks, and delegates the subtasks to worker agents. The orchestrator integrates the results and produces the response. Worker agents do not communicate directly with each other; all inter-agent flow goes through the orchestrator.
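The decompose-delegate-integrate loop can be sketched as follows. This is a minimal illustration, not any vendor's API: the worker registry, the stub decomposition, and the string-joining integration are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Orchestrator:
    # Hypothetical worker registry: task kind -> callable(payload) -> result.
    workers: dict
    decision_log: list = field(default_factory=list)

    def handle(self, request: str) -> str:
        results = []
        for kind, payload in self.decompose(request):
            # All inter-agent flow passes through this point: the
            # orchestrator is the single chokepoint for logging and
            # for interposing control gates.
            self.decision_log.append(("delegate", kind, payload))
            results.append(self.workers[kind](payload))
        return self.integrate(results)

    def decompose(self, request: str):
        # Stub decomposition: one subtask per registered worker.
        return [(kind, request) for kind in self.workers]

    def integrate(self, results):
        return " | ".join(results)
```

Because workers never talk to each other directly, the `decision_log` built inside `handle` is already a complete record of the inter-agent path.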
Microsoft 365 Copilot’s per-app agent orchestration is hierarchical: the Copilot orchestrator manages connections to per-app agents (Outlook agent, SharePoint agent, Teams agent). Most LangChain / LangGraph deployments default to hierarchical. Anthropic’s structured-prompt orchestration patterns are hierarchical.
Governance properties. Accountability concentrates at the orchestrator: when something goes wrong, the orchestrator’s decision log is the primary forensic source. The audit substrate is the easiest of the three patterns to assemble because the orchestrator already logs the worker-agent invocations as tool calls. The orchestrator is the single point at which control gates can be enforced: action-class approval, drift monitoring, throughput limits all interpose at the orchestrator.
Threat-surface properties. The orchestrator is the single point of compromise. An attacker who manipulates the orchestrator (via prompt injection in the user input, via memory poisoning of the orchestrator’s persistent state) controls the entire system. The cross-agent prompt-injection surface is small relative to peer-to-peer because there are fewer inter-agent paths, but the consequences of a successful attack are larger because compromise propagates downward to all workers.
When to use. Medium-risk deployments where audit ease and operational simplicity outweigh resilience. The pattern’s strength is observability; its weakness is single-point-of-failure at the orchestrator.
Peer-to-peer orchestration
Agents communicate directly with each other without a central coordinator. Each agent has a model of which other agents to consult for which tasks; coordination emerges from the bilateral interactions rather than from a top-down plan.
Peer-to-peer patterns appear mostly in research and experimentation contexts (Microsoft Research's AutoGen, multi-agent simulation environments, academic agentic AI work). Production peer-to-peer deployments at enterprise scale remain rare in 2026.
Governance properties. Accountability distributes across the agent network. When something goes wrong, identifying the responsible agent and therefore the responsible human deployment owner often requires reconstructing the full inter-agent reasoning trace. The audit substrate is the hardest of the three patterns to assemble because there is no central log of inter-agent communication; deployment-layer instrumentation is required.
Threat-surface properties. The cross-agent prompt-injection surface is large because there are many inter-agent paths and no chokepoint at which to interpose detection. Compromise of one agent can propagate through the network with no architectural barrier. Failure attribution is structurally hard; when the system produces a bad outcome, identifying which agent’s contribution was decisive often requires forensic-quality trace reconstruction.
When to use. Research and experimentation. Production deployments above the low-risk threshold should not adopt peer-to-peer in 2026 unless the audit substrate is materially stronger than vendor-native baseline. The pattern’s strength is resilience and emergent coordination; its weakness is fundamental opacity.
Broker-mediated orchestration
Every inter-agent message routes through a centralised broker. The broker enforces policy (which agents can talk to which), logs the message (with provenance, trust level, content), and applies any required transformations (sanitisation, scope reduction) before forwarding. Agents operate as semi-autonomous principals; the broker is the communication infrastructure rather than the coordinator.
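A minimal broker sketch, under stated assumptions: the policy is a set of allowed (source, destination) pairs, sanitisation is a caller-supplied transformation, and the log-entry field names (source, dest, provenance, trust) mirror the properties described above rather than any standard schema.

```python
import time


class Broker:
    """Sketch of a broker: policy check, logged forwarding, sanitisation."""

    def __init__(self, policy, sanitise=lambda m: m):
        self.policy = policy          # allowed (source, dest) pairs
        self.sanitise = sanitise      # transformation applied before forwarding
        self.message_log = []         # the audit substrate, produced natively

    def route(self, source, dest, content, provenance, trust_level, deliver):
        # Policy enforcement: which agents may talk to which.
        if (source, dest) not in self.policy:
            raise PermissionError(f"{source} -> {dest} not permitted by policy")
        clean = self.sanitise(content)
        # Every inter-agent message produces a log entry with provenance
        # and trust level, before it is forwarded.
        self.message_log.append({
            "ts": time.time(), "source": source, "dest": dest,
            "content": clean, "provenance": provenance, "trust": trust_level,
        })
        deliver(dest, clean)
```

The key design property is that `message_log` is populated as a byproduct of routing, so the audit substrate cannot drift out of sync with what the agents actually said to each other.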
Broker-mediated patterns are emerging in 2026 as the principled approach to multi-agent governance. Anthropic’s Managed Agents support broker-style isolation primitives. The MCP (Model Context Protocol) ecosystem is enabling broker-mediated patterns by standardising the inter-agent message format. The A2A protocol (claim AM-050) would make broker-mediated patterns easier to implement portably across vendors.
Governance properties. Accountability is logged at the broker rather than concentrated at one agent or distributed across many. The audit substrate is produced natively by the broker’s message log: every inter-agent communication is captured with timestamps, source-agent identity, destination-agent identity, content provenance, and trust level. Failure attribution is structurally easy because the broker log is the single forensic source.
Threat-surface properties. The broker is the single point at which the architectural separation between content-ingest and tool-execution privileges (claim AM-045) can be enforced for the cross-agent path. Cross-agent prompt injection attacks must propagate through the broker, where they are detectable. The broker is also a single point of failure (operational risk) and a performance bottleneck (engineering risk); the recommendation is to invest in the broker’s reliability accordingly.
When to use. High-risk deployments. The recommended 2026 default for any deployment classified as high-risk under the EU AI Act Annex III taxonomy or under internal high-risk criteria.
The pattern selection table
| Risk tier | Recommended pattern | Acceptable alternatives | Avoid |
|---|---|---|---|
| Low-risk (internal productivity, no externally-affected populations) | Hierarchical | Peer-to-peer for research; broker-mediated (typically overkill) | — |
| Medium-risk (some external population effect, contained business impact) | Hierarchical | Broker-mediated where audit substrate matters | Peer-to-peer in production |
| High-risk (Annex III, regulated sector, material business impact) | Broker-mediated | Hierarchical with strong audit substrate extension | Peer-to-peer |
The recommendations are defaults, not absolutes. An enterprise with a strong reason to depart from the default (specific vendor constraints, specific deployment characteristics, specific operational maturity) can do so, but the rationale for the departure should be documented in the deployment’s risk-management documentation per Article 9.
Per-pattern audit substrate requirements
Each pattern requires the 14-field audit substrate (claim AM-046) to operate across the inter-agent path. The implementation differs.
Hierarchical audit substrate. The orchestrator’s decision log is the primary substrate. Worker-agent invocations are logged as tool calls in the orchestrator’s audit chain (field 12 of the 14-field template). The provenance field (field 6) captures which worker produced which input to the orchestrator. The output disclosure surface field (field 13) captures the final action the orchestrator takes. Implementation cost is low because the orchestrator’s natural logging covers most fields.
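A partial sketch of the orchestrator-side entry described above. Only the fields the text names are shown (provenance is field 6, the tool-call chain is field 12, the output disclosure surface is field 13); the full 14-field template lives in the referenced piece (claim AM-046) and is not reproduced here, and the `worker_calls` record shape is an assumption for illustration.

```python
def orchestrator_audit_entry(worker_calls, final_action):
    """Build the named subset of the 14-field entry from the
    orchestrator's natural log of worker invocations."""
    return {
        # Field 6: which worker produced which input to the orchestrator.
        "provenance": {c["worker"]: c["output_ref"] for c in worker_calls},
        # Field 12: worker invocations recorded as tool calls.
        "audit_chain": [(c["worker"], c["input"]) for c in worker_calls],
        # Field 13: the final action the orchestrator takes.
        "output_disclosure_surface": final_action,
    }
```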
Peer-to-peer audit substrate. Deployment-layer instrumentation is required because no agent in the system natively logs the full inter-agent path. The implementation typically involves a sidecar audit service that intercepts inter-agent communications and reconstructs the message graph. Implementation cost is high; the audit substrate often falls short of the 14-field minimum without significant engineering investment.
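The sidecar approach can be sketched as an observe-only interceptor that accumulates edges and then walks them to reconstruct a trace. This is illustrative only; a production sidecar would sit at the network layer and handle cycles, retention, and tamper-evidence, none of which are shown here.

```python
from collections import defaultdict


class SidecarAudit:
    """Sketch: intercept inter-agent messages, rebuild the message graph."""

    def __init__(self):
        self.edges = defaultdict(list)   # source -> [(dest, content)]

    def intercept(self, source, dest, content):
        # Observe and pass through; the sidecar does not block traffic.
        self.edges[source].append((dest, content))
        return content

    def trace_from(self, agent, depth=10):
        """Reconstruct the message trace reachable from one agent."""
        seen, frontier, trace = set(), [agent], []
        for _ in range(depth):
            nxt = []
            for a in frontier:
                if a in seen:
                    continue
                seen.add(a)
                for dest, content in self.edges.get(a, []):
                    trace.append((a, dest, content))
                    nxt.append(dest)
            frontier = nxt
        return trace
```

`trace_from` is exactly the "forensic-quality trace reconstruction" cost the pattern imposes: it has to be run after the fact, against whatever the sidecar managed to capture.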
Broker-mediated audit substrate. The broker’s message log is the primary substrate. Every inter-agent message produces a log entry with all 14 fields populated (the broker has access to source identity, destination identity, content with provenance, planned vs executed disposition, approval references, output disclosure routing). Implementation cost is moderate at deployment time and low at operational time because the substrate is produced natively.
The audit-substrate requirement is the strongest argument for broker-mediated patterns in high-risk contexts: the substrate is a precondition for Article 12 compliance, and broker-mediated patterns produce it as a byproduct of the architectural choice.
Per-pattern threat-mitigation requirements
The seven-control surface from the OWASP Agentic AI Top 10 walkthrough (claim AM-043) applies to multi-agent systems with pattern-specific implementation.
Scoped non-human identity (control 1). Each agent in the system has its own NHI; the inter-agent communication path uses the source agent’s identity for forwarded messages. Hierarchical: the orchestrator’s identity scopes worker invocations. Peer-to-peer: each pairwise communication uses the originating agent’s identity. Broker-mediated: the broker can enforce identity verification on every inter-agent message.
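One way a broker can verify source identity on every message is per-agent message signing. The sketch below uses HMAC with per-agent secrets; the key-distribution scheme and key names are assumptions for illustration, not a description of any vendor's NHI implementation.

```python
import hmac
import hashlib

# Hypothetical per-agent NHI secrets held by the identity layer.
AGENT_KEYS = {"research": b"k-research", "writer": b"k-writer"}


def sign(source: str, content: str) -> str:
    """The source agent signs its outbound message with its own NHI key."""
    return hmac.new(AGENT_KEYS[source], content.encode(),
                    hashlib.sha256).hexdigest()


def broker_verify(source: str, content: str, signature: str) -> bool:
    """The broker checks the claimed source actually signed the message."""
    expected = hmac.new(AGENT_KEYS[source], content.encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

A message signed with one agent's key fails verification when attributed to another agent, which is the property that makes identity spoofing on the inter-agent path detectable at the broker.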
Action-class approval gates (control 2). Approval gates apply to consequential actions across the multi-agent path, not just within a single agent. Hierarchical: the orchestrator gates worker actions with consequential effect. Peer-to-peer: each agent gates its own consequential actions but the multi-agent system can produce consequential effects through actions that are individually low-impact (the Klarna pattern, claim AM-044, can manifest in multi-agent systems through the cumulative effect of many small actions). Broker-mediated: the broker can apply approval-gate policy to message classes, not just action classes.
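The cumulative-effect concern can be made concrete with a gate that tracks a running impact budget. The thresholds and the idea of a scalar impact score are assumptions for the sketch; real deployments would score impact per action class.

```python
class CumulativeGate:
    """Sketch: approval gate that fires on a single consequential action
    OR on the cumulative effect of many individually small actions."""

    def __init__(self, per_action_limit: float, cumulative_limit: float):
        self.per_action_limit = per_action_limit
        self.cumulative_limit = cumulative_limit
        self.total = 0.0

    def requires_approval(self, impact: float) -> bool:
        self.total += impact
        # Trigger on either condition: one large action, or drift past
        # the cumulative budget even when every action was small.
        return (impact > self.per_action_limit
                or self.total > self.cumulative_limit)
```

A per-action-only gate would never fire on the second condition, which is precisely how individually low-impact actions can accumulate into a consequential system-level effect.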
MTTD-for-Agents detection (control 4). Detection-time targets apply to multi-agent failures, including failures that manifest only in the inter-agent path. Hierarchical: orchestrator-layer detection covers most failures. Peer-to-peer: detection requires reconstructing the inter-agent reasoning across many sessions; MTTD targets are typically harder to meet. Broker-mediated: broker-layer detection covers cross-agent prompt injection (claim AM-045) natively because every potentially-malicious inter-agent message routes through the broker.
Behavioural drift monitoring (control 6). Multi-agent systems can produce emergent behaviours that no individual agent exhibits. Drift monitoring applies to the system-level output, not just to per-agent outputs. The control’s implementation in multi-agent contexts requires a system-level evaluation harness, not just per-agent evaluation.
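A system-level drift check can be as simple as comparing the rate of a flagged system-level outcome against a baseline window. The metric (rate of one outcome label) and the threshold are assumptions for the sketch; the point is that the labels describe the composed system's outputs, not any single agent's.

```python
def drift_score(baseline_labels, current_labels, flagged="escalate"):
    """Absolute change in the rate of a flagged system-level outcome
    between a baseline window and the current window."""
    def rate(labels):
        return labels.count(flagged) / len(labels) if labels else 0.0
    return abs(rate(current_labels) - rate(baseline_labels))


def drifted(baseline_labels, current_labels, threshold=0.1):
    # Fire when the system-level outcome distribution has shifted more
    # than the threshold, regardless of which agent caused the shift.
    return drift_score(baseline_labels, current_labels) > threshold
```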
Vendor pattern coverage
| Vendor / platform | Hierarchical | Peer-to-peer | Broker-mediated |
|---|---|---|---|
| Microsoft 365 Copilot | native | not supported | partial (post-EchoLeak hardening) |
| Anthropic Managed Agents | native | discouraged | native (context-isolation primitives) |
| OpenAI Operator + Assistants | native | partial | partial |
| Google Gemini | native | not supported | partial |
| LangChain / LangGraph | native | partial | partial (custom implementation) |
As of April 2026, vendor coverage favours hierarchical patterns: every vendor supports them natively. Broker-mediated support is uneven; Anthropic's Managed Agents have the strongest native support, and other vendors require deployment-layer implementation to achieve broker-mediated behaviour. Peer-to-peer is generally either unsupported or explicitly discouraged in vendor documentation.
What this playbook does NOT cover
The playbook addresses architectural pattern selection and the resulting governance properties. It does not address:
- Specific multi-agent frameworks (LangGraph, AutoGen, CrewAI, etc.). Framework selection is downstream of pattern selection; the same pattern can be implemented in multiple frameworks.
- Inter-agent message format standards (MCP, A2A protocol). These affect implementation portability rather than the pattern's governance properties. The A2A protocol piece (claim AM-050) covers them in detail.
- Multi-tenant multi-agent architectures. Enterprise SaaS contexts where multiple tenants’ multi-agent systems share infrastructure introduce additional isolation requirements not covered here.
- Federated multi-agent systems across organisational boundaries. B2B agent-to-agent coordination across legal entities is an emerging pattern with regulatory implications (data-residency, contract-binding, liability allocation) that warrant separate treatment.
The full state of enterprise agentic AI is at /state-of-enterprise-agentic-ai/ (claim AM-040). The cross-agent prompt-injection class that drives the broker-mediated recommendation is at /echoleak-cross-agent-prompt-injection/ (claim AM-045). The 14-field audit-evidence template that operationalises Article 12 across multi-agent paths is at /eu-ai-act-article-12-audit-evidence/ (claim AM-046).
The pattern is not just an architectural decision in 2026. It is the decision that determines whether the deployment is auditable, defensible, and compliant. Choose accordingly.