Method: every claim tracked, reviewed every 30–90 days, marked Holding, Partial, or Not holding. Drafted by Claude; signed off by Peter.
AM-007 · pub 07 May 2026 · rev 07 May 2026 · read 10 min · in Risk and Governance

AgentFlayer and the cross-agent prompt-injection class: what the vendor-response split tells procurement

Zenity Labs disclosed the AgentFlayer class of zero-click cross-agent prompt-injection attacks at Black Hat USA in August 2025; the related EchoLeak vulnerability, CVE-2025-32711, had been published two months earlier, in June 2025. Both describe a structural failure mode of agentic AI rather than incidental bugs. The procurement-relevant signal is the vendor-response split: which platforms patched and named a response-SLA, and which classified the disclosed behaviour as 'intended functionality'. The split is answerable in writing before the contract closes; the cost of finding out post-deployment is the IBM-grounded breach-cost line plus an audit trail nobody at the procuring enterprise can defend.

Holding · reviewed 07 May 2026 · next +44d

Bottom line. Zenity Labs disclosed the AgentFlayer class of zero-click cross-agent prompt-injection attacks at Black Hat USA in August 2025; the related EchoLeak vulnerability had been assigned CVE-2025-32711 and published in June 2025. Both describe a structural agentic-AI failure mode. The procurement-relevant signal is the vendor-response split: which vendors patched and named a response-SLA, and which classified the disclosed behaviour as “intended functionality”. The deploying enterprise inherits the vendor’s classification, including under EU AI Act Article 9, NIS2 Article 21, and (for in-scope deployments) DORA. The procurement deck question is which posture the vendor took, in writing, before the contract closes. Source: NVD CVE-2025-32711.

The cross-agent prompt-injection class entered the 2025 enterprise-AI security conversation through two disclosures made within months of each other. Zenity Labs presented AgentFlayer at Black Hat USA in August 2025, demonstrating zero-click exfiltration vectors against multiple production agentic AI platforms (Zenity Labs research). Two months earlier, in June 2025, CVE-2025-32711 (EchoLeak) was published in the NVD record describing a related cross-agent variant. The shared mechanism is a class of attack rather than a single vulnerability, and the platforms exposed to it are most of the major commercial agentic AI deployment surfaces.

This piece reads the disclosure at the procurement-deck level rather than the SOC level. The procurement signal is not the disclosure itself; it is the vendor-response split that followed. Vendors that treated the disclosed behaviour as a vulnerability requiring a patch, with a named response-SLA, sit in one cohort. Vendors that declined to patch, classifying the behaviour as “intended functionality” of their agent design, sit in another. The deploying enterprise inherits the cohort it bought from.

What the disclosure actually describes

The AgentFlayer pattern, as Zenity Labs documented across the named-platform demonstrations, exploits the prompt-substrate of the agent rather than its surface UI. An attacker embeds instructions in content the agent will process — an email body, an uploaded document, an upstream-agent output, an image carrying hidden text. The receiving agent processes the content as input but executes the embedded instructions as if they were operator-supplied. The injection rides through infrastructure paths the procuring enterprise’s content scanning does not inspect at prompt-substrate granularity: trusted storage URLs, image fetches that resolve to attacker-controlled domains, agent-to-agent calls that pass content forward.
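One of the paths named above, image fetches that resolve to attacker-controlled domains, can be narrowed at the deployment layer by checking every URL in agent output against an allowlist before anything is rendered or fetched. The following is a minimal sketch under stated assumptions: the allowlist contents and the function name are hypothetical and do not come from the Zenity Labs research, and the check illustrates one control rather than a complete defence against the class.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: hosts the deployment is willing to fetch from.
ALLOWED_HOSTS = {"intranet.example.com", "cdn.example.com"}

# Deliberately loose URL matcher; a production scanner would also need to
# handle redirects, reference-style links, and encoded URLs.
URL_PATTERN = re.compile(r"https?://[^\s)\"'<>]+")


def disallowed_urls(agent_output: str) -> list[str]:
    """Return URLs in agent output whose host is not on the allowlist."""
    flagged = []
    for url in URL_PATTERN.findall(agent_output):
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            host = ""  # unparseable URLs are treated as disallowed
        if host not in ALLOWED_HOSTS:
            flagged.append(url)
    return flagged
```

An output that embeds an image link to a host outside the allowlist is returned in the flagged list, so the rendering layer can strip or block it before a fetch carries data out.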

The cross-agent variant (EchoLeak / CVE-2025-32711) extends the mechanism to the case where one compromised or attacker-influenced agent’s output becomes a downstream agent’s input. The downstream agent executes the embedded instructions because the upstream agent is, from a trust-substrate perspective, an authenticated source of content. The class is structural in the sense that defences operating only at the user-input layer do not cover it; the attack arrives through paths the user-input layer is not asked to inspect.
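The trust gap described above can be made concrete with provenance tracking: content is labelled by where it came from, and an agent's output inherits the lowest trust level of anything it read. The following is a minimal sketch, assuming hypothetical `Message` and `Trust` types rather than any vendor's API.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0   # email bodies, uploaded documents, fetched web content
    OPERATOR = 1    # instructions supplied by the deploying enterprise


@dataclass(frozen=True)
class Message:
    text: str
    trust: Trust


def agent_output(text: str, inputs: list[Message]) -> Message:
    """Label an agent's output with the lowest trust level among its inputs.

    With no inputs the output is labelled UNTRUSTED, so the sketch fails closed.
    """
    lowest = min((m.trust for m in inputs), default=Trust.UNTRUSTED)
    return Message(text=text, trust=lowest)


def may_act_on(message: Message) -> bool:
    """Only operator-trusted content may be treated as instructions."""
    return message.trust == Trust.OPERATOR
```

Under this labelling the downstream agent can still read an upstream summary, but the fact that the upstream agent is authenticated no longer upgrades the content it carries.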

The platforms named in the public Zenity Labs research and in adjacent reporting span the major commercial agentic AI deployment surfaces. The exposure is not concentrated at one vendor; the class is generic enough that most product designs that share the prompt-substrate-as-instruction-channel architecture are exposed by default.

The vendor-response split is the procurement signal

Public reporting on the disclosure response documents two distinct vendor postures. The procurement implications are asymmetric.

Cohort A: classified as a vulnerability, patched, response-SLA named. Vendors in this cohort engaged with Zenity Labs through a coordinated-disclosure process, deployed mitigations at the platform layer, and committed to a documented response posture for future cross-agent class disclosures. The deploying enterprise procuring from a Cohort A vendor inherits the platform-layer defence and a named SLA against future variants. Its compensating-control burden at the deployment layer is reduced (not eliminated — the class is generic enough that platform defences work in concert with deployment-layer practice).

Cohort B: classified as “intended functionality”, declined to patch. Vendors in this cohort treated the disclosed behaviour as part of the agent design rather than as a vulnerability. The position is defensible in isolation; some agent capabilities depend on the agent processing context-sourced instructions for their utility. The procurement implication is that the deploying enterprise inherits the residual attack surface unmodified at the platform layer. The compensating-control burden lands at the deployment layer entirely. The deploying enterprise’s procurement documentation, audit posture, and incident-response runbook all need to absorb a risk class the vendor’s product has explicitly not patched out.

Neither posture is wrong; both correspond to real product-design tradeoffs. The procurement-deck consequence is asymmetric, and the procuring enterprise’s documentation needs to reflect which cohort the vendor sat in at the time of contract.
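The contract-time record described above can be kept as structured data rather than as a sentence in a deck. The following is a minimal sketch, assuming hypothetical type and field names; the two cohorts and their consequences follow the descriptions above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Cohort(Enum):
    A_PATCHED_WITH_SLA = "classified as a vulnerability, patched, response-SLA named"
    B_INTENDED_FUNCTIONALITY = "classified as intended functionality, declined to patch"


@dataclass(frozen=True)
class VendorPosture:
    vendor: str
    cohort: Cohort
    recorded_on: date   # the posture at the time of contract
    source: str         # where the vendor stated the posture in writing

    def control_burden(self) -> str:
        """Where the compensating-control burden lands for this cohort."""
        if self.cohort is Cohort.A_PATCHED_WITH_SLA:
            return "reduced at the deployment layer, not eliminated"
        return "entirely at the deployment layer"
```

Recording the date and the written source alongside the cohort keeps the entry defensible if the vendor later changes position.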

Why the deploying enterprise still owns the risk

A common procurement misreading is that vendor classification of “intended functionality” transfers risk to the vendor. The opposite is closer to the truth.

EU AI Act Article 9 requires a continuous, iterative risk-management system across the AI system lifecycle, including for risks emerging post-deployment. The deploying enterprise is the operator under the Act for most procurement scenarios; the vendor’s classification of an attack class as “intended functionality” does not exempt the operator from the Article 9 duty. For Annex III high-risk deployments, the operator’s technical documentation needs to address the residual cross-agent attack surface explicitly.

NIS2 Article 21 (cybersecurity risk-management measures) applies the parallel logic for essential and important entities. Where the agentic deployment touches a regulated workflow, the operator’s risk-management measures are evaluated against the residual exposure, not against the vendor’s product description. DORA carries the same shape for in-scope financial-sector deployments, with the operator’s outsourcing-risk obligations naming the residual third-party-platform-risk explicitly.

The legal architecture is consistent across regimes: the operator inherits the residual risk from the vendor’s classification. The procurement-deck question is therefore not whether the vendor will absorb the risk; it is what the operator’s compensating-control burden will look like, and whether that burden is feasible given the operator’s deployment-layer capabilities.

The IBM breach-cost grounding without the composite arithmetic

IBM’s 2025 Cost of a Data Breach Report, produced annually with the Ponemon Institute, reports an average data-breach cost of approximately $10.22M for United States organisations, against a global average of approximately $4.44M, with documented cost differentials for breaches involving extensive AI use in attack or response. These figures are the appropriate baseline for sizing a cross-agent-prompt-injection breach scenario at the procurement-deck level; the methodology is transparent and updated annually.

The original 2025 draft of this piece combined the IBM baseline, an “AI premium” line, and a hypothetical regulatory-fine ceiling into a $92.4M composite figure. That composite does not survive scrutiny because the three components are not additive in the IBM survey methodology: each is measured against a different sample frame, and the AI premium and regulatory-fine ceiling are themselves not bounded by a single per-incident maximum. The honest procurement case cites the IBM baseline, notes that the AI-specific incident premium is documented but variable, and lets the deployment’s own data-classification and exposure surface drive the upper-bound estimate. Procurement decks that ship the composite figure inherit the composite’s audit weakness when the deployment’s actual incident reports surface a different number on a different methodology.
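The non-additivity point can be shown mechanically: if each cost figure carries the sample frame it was measured against, a sum across frames can be refused rather than silently produced. The following is a minimal sketch, assuming hypothetical type names and frame labels; the amounts in the usage lines are placeholders, not figures from the IBM report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostLine:
    label: str
    amount_musd: float
    sample_frame: str   # the population the figure was measured against


def total(lines: list[CostLine]) -> float:
    """Sum cost lines only when they share a single sample frame."""
    frames = {line.sample_frame for line in lines}
    if len(frames) > 1:
        raise ValueError(f"not additive across sample frames: {sorted(frames)}")
    return sum(line.amount_musd for line in lines)


# Placeholder amounts for illustration only.
baseline = CostLine("breach-cost baseline", 1.0, "all surveyed breaches")
premium = CostLine("AI-specific premium", 1.0, "breaches involving AI")
# total([baseline, premium]) raises ValueError instead of returning a composite.
```

A deck built on this structure reports the baseline and the premium as separate lines, which is the shape the paragraph above recommends.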

Five pre-pilot questions the procurement committee should add

The five questions add to the AM-140 procurement-committee six rather than replacing them. They focus specifically on the cross-agent class.

  1. Has the vendor’s product been tested against the AgentFlayer / EchoLeak / cross-agent prompt-injection class, and what was the disclosed result? Vendors that engaged with the Zenity Labs disclosure or its successors should be able to point at a public posture statement. Vendors that have not engaged are not necessarily exposed; they may simply be later to the disclosure cycle. The procurement reading is the response posture once the disclosure reaches them.

  2. Does the vendor classify any portion of the disclosed cross-agent class as “intended functionality”, and if yes, what is the documented compensating control the deploying enterprise needs? The “intended functionality” classification is not disqualifying on its own; it is disqualifying if the vendor cannot name the compensating control the deploying enterprise needs to apply at the deployment layer.

  3. What is the response-SLA the vendor commits to for newly disclosed cross-agent vulnerabilities, measured in days from researcher notification to patch deployment, with named accountability inside the vendor org? A response-SLA without named accountability is procurement noise; named accountability without a measurable SLA is procurement theatre. Both, together, describe a vendor whose response posture survives a leadership change.

  4. What is the post-incident communications commitment to the deploying enterprise? The shape: notification timeline (hours, not weeks), scope-of-impact reporting (which tenants, which workflows, which data classes), and forensic-data sharing (logs, traces, the deployment-layer artefacts the operator’s investigation needs). Without these, the operator’s incident-response runbook depends on the vendor’s marketing team’s discretion.

  5. Does the vendor support customer-controlled disable of the agent capabilities most exposed to the cross-agent class, and if yes, at what configuration granularity (per-tenant, per-agent, per-workflow)? The disable lever is the operator’s load-bearing control when an emerging variant lands faster than the response-SLA can accommodate. Per-tenant disable is the minimum useful granularity; per-workflow is the operationally healthy target.
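The decision rules embedded in questions 2, 3, and 5 can be written down as checks on a vendor's written answers. The following is a minimal sketch, assuming hypothetical field names and an answer record that the procurement team fills in by hand; it encodes only the rules stated above and is not a scoring methodology. Placing per-agent between per-tenant and per-workflow is an assumption about ordering.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class DisableGranularity(IntEnum):
    NONE = 0
    PER_TENANT = 1     # minimum useful granularity
    PER_AGENT = 2      # assumed to sit between the two named levels
    PER_WORKFLOW = 3   # operationally healthy target


@dataclass
class VendorAnswers:
    intended_functionality: bool          # Q2: any part of the class so classified?
    compensating_control: Optional[str]   # Q2: control named by the vendor, if any
    sla_days: Optional[int]               # Q3: notification-to-patch commitment
    sla_owner: Optional[str]              # Q3: named accountable role or person
    disable: DisableGranularity           # Q5: customer-controlled disable level


def findings(a: VendorAnswers) -> list[str]:
    """Return the gaps that questions 2, 3, and 5 are designed to surface."""
    gaps = []
    if a.intended_functionality and not a.compensating_control:
        gaps.append("Q2: intended functionality with no named compensating control")
    if a.sla_days is None or not a.sla_owner:
        gaps.append("Q3: response-SLA lacks a measurable term or named accountability")
    if a.disable < DisableGranularity.PER_TENANT:
        gaps.append("Q5: no customer-controlled disable at per-tenant level or finer")
    return gaps
```

An empty list does not mean the vendor is safe to buy from; it means the written answers contain the elements the questions ask for.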

The MTTD-for-Agents framework operationalises questions 4 and 5 specifically: the post-incident-detection time window is what determines whether the disable lever can be pulled before the cross-agent class moves further down the workflow. A vendor that cannot answer the five questions in writing is not failing the procurement decision per se; the procurement decision changes shape, with the residual risk landing on the operator’s deployment-layer practice rather than on the vendor’s platform-layer commitment.
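The timing relationship behind questions 4 and 5 reduces to a comparison: the disable lever helps only if detection plus the time to pull it is shorter than the time the injected content needs to reach the next workflow step. The following is a minimal sketch, assuming hypothetical parameter names; it is not drawn from the MTTD-for-Agents framework itself.

```python
from datetime import timedelta


def lever_pulled_in_time(
    time_to_detect: timedelta,
    time_to_disable: timedelta,
    time_to_next_hop: timedelta,
) -> bool:
    """True if the capability is disabled before the content propagates."""
    return time_to_detect + time_to_disable < time_to_next_hop
```

The vendor's notification timeline from question 4 feeds the first argument, and the disable granularity from question 5 largely determines the second.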

What the data implies for Q2-Q4 2026 procurement

The cross-agent prompt-injection class is unlikely to compress materially through 2026. The class is structural rather than incidental, which means individual variants will be patched as they surface but the underlying mechanism remains a procurement consideration on every agentic AI deployment that uses context-sourced content as part of the prompt substrate. New disclosures in the class will continue to appear; the vendor’s response posture each time carries the same procurement signal.

The 2026 implication for any enterprise currently evaluating an agentic AI procurement is specific. The five questions above are answerable in writing on a procurement-deck timeline, by the vendor’s product-security team. The cost of asking the questions before the contract closes is one round of vendor-side documentation work. The cost of finding out the answers post-deployment is the IBM breach-cost line, the operator’s regulatory-documentation reconstruction, and an audit trail the operator cannot defend on the operator’s own evidence.

The deploying enterprise is the operator under EU AI Act Article 9, NIS2 Article 21, and DORA where applicable. The vendor’s classification choice does not transfer risk; it shapes the operator’s compensating-control burden. Procurement decisions made without that read inherit the operator’s regulatory exposure unmodified.

Holding-up note

The primary claim of this piece (that AgentFlayer and EchoLeak / CVE-2025-32711 describe a structural cross-agent failure mode rather than incidental bugs, and that the vendor-response split is the procurement-relevant signal answerable before the contract closes) is on a 60-day review cadence. Three kinds of evidence would move the verdict.

A major vendor in the “intended functionality” cohort reversing position publicly with a documented patch and named response-SLA would weaken the cohort-A vs cohort-B framing on which the piece rests, and the piece would need to be re-read against a more compressed split. A new disclosure of a higher-severity cross-agent class superseding AgentFlayer’s procurement weight would strengthen the structural-failure-mode framing and require the questions to be updated against the newer surface. A regulatory action (EU AI Act post-market monitoring, US FTC, sectoral regulator) treating an unpatched cross-agent vulnerability as a compliance breach independent of vendor classification would materially increase the procurement-side weight of the piece’s central claim.

If any land, the Holding-up record for AM-007 captures what changed, dated. Original claim stays visible. Nothing is quietly removed.
