Topic pillar · 59 tracked pieces

Topic · Agentic AI governance

Governance frameworks, oversight patterns, and compliance postures for enterprise agentic-AI deployment.

Where the moat is. Most enterprise agentic-AI failure happens here, not at the model layer.

Governance is the topic this publication writes about most because governance is where most enterprise agentic-AI deployments fail. Not at the model layer — Anthropic, OpenAI, Google, and Microsoft all ship competent base models. Failure happens at the policy layer, the identity layer, the audit layer, the procurement layer. The pieces that survive review on this site are the ones that name a specific governance gap and a specific evidence trail.

The pillar covers six recurring threads. Frameworks the enterprise actually applies — GAUGE (the Enterprise Agentic Governance Benchmark), MTTD-for-Agents (Mean Time To Detect adapted from SRE), NIST AI RMF mappings. Vendor governance posture audits — what Salesforce Agentforce, Microsoft Copilot Studio, Google Vertex Agent Builder, and ServiceNow Now Assist actually disclose versus what their marketing claims. Regulatory translation — EU AI Act timelines, NIS2 and DORA implications for agentic systems, and what specific articles like §12 audit-evidence requirements mean operationally.

Centralized vs federated governance models — when each works, when each breaks. The Head-of-AI-Governance role specification — why the role exists, what reporting line it requires, what its first 90 days look like. Multi-agent architecture risks — A2A protocol behaviour, cross-agent prompt injection of the EchoLeak class, agent-to-agent credential leakage.

What we won't publish: governance frameworks invented for the byline. Every framework cited here resolves to a primary source — a regulator, a peer-reviewed paper, an analyst firm with disclosed methodology, or one of the two house frameworks (GAUGE / MTTD) whose amendment logs are public.

Pillar last refreshed 2026-05-01

Spoke articles

  • AI coding agents are now an enterprise attack surface: what TrustFall and SymJack mean for the software supply chain

    In May 2026 security researchers published two findings, TrustFall and SymJack, that broke the same assumption across every major AI coding agent at once: Claude Code, Cursor, Gemini CLI, GitHub Copilot CLI, OpenAI Codex CLI, and Grok all treated the on-screen approval prompt as informed consent, and all could be driven to remote code execution by a booby-trapped repository. Microsoft separately disclosed two prompt-injection-to-RCE bugs in its own agent runtime, Semantic Kernel. When a flaw is shared by every product in a category, the category has a design assumption that does not hold. For the enterprise, the consequence is concrete: the coding agent your developers run with their full credentials is a production attack surface, and most governance programmes have it filed under developer tooling, outside the inventory entirely.

  • The SP 800-53 gap for AI agents, and what NIST COSAiS is writing to close it

    Enterprises mapping agentic AI to NIST SP 800-53 today find real gaps in four control families: access control, identification and authentication, audit and accountability, and supply-chain risk. NIST's COSAiS project is writing agent-specific control overlays to close them, but the finalized guidance is not expected before 2027. Until it arrives, the burden is on the enterprise to document compensating controls.

  • AI Made Attackers Faster, Not Smarter

    The fear is that AI hands attackers a new class of capability. The 2026 Verizon DBIR, drawing on data covering 793 enforcement-actioned threat actors, finds the opposite: AI scales the techniques attackers already had, while vulnerability exploitation has overtaken stolen credentials as the top way in. For a CISO that redirects priority from hunting novel AI threats to the controls that scale: patch velocity and identity hygiene.

  • Agent memory governance: the data class with no retention schedule, residency policy, or audit-evidence pipeline

    Persistent agent memory is a new class of confidential-data processing. Most enterprises have no retention schedule, residency control, or audit-evidence pipeline for it. Identity governs who the agent is; eval governs whether it is right; nobody is governing what it remembers, where that memory lives, and how long it persists. Under GDPR storage-limitation and EU AI Act record-keeping, agent memory is an unsized compliance surface.

  • The agent kill-switch: turning 'you can't stop it' into a containment architecture

    Kiteworks' 2026 Data Security and Compliance Risk Forecast found 60% of organisations cannot quickly terminate a misbehaving AI agent and 63% cannot enforce purpose limitations on what agents are authorised to do. The structural reading is that most enterprises have written kill criteria into the risk register and have not built kill architecture into the runtime. The four-primitive containment architecture (purpose binding, kill switch, network isolation, credential revocation) is the instrument for closing the gap, and the tabletop test is the only proof it works.

  • Prompt injection just crossed the RCE threshold: what the May 2026 Semantic Kernel and MCP CVEs mean for enterprise AI agent frameworks

    Microsoft Security Response Center disclosed two Semantic Kernel CVEs on 7 May 2026 in which a single attacker-controlled prompt resolves to host-level code execution. The same week, OX Security published a configuration-to-command path in Anthropic's MCP STDIO interface that traverses every published MCP server implementation. Windsurf 1.9544.26 carries a separate prompt-injection-to-MCP-registration path that automatically installs a malicious server with no user interaction. Three independently-disclosed CVE classes in a single fortnight, all at the framework layer rather than the deployment layer, are not a coincidence. They map a structural property of how 2026 agent frameworks treat tool-configuration data, and the operational implication for enterprise architecture is larger than any single patch.

  • Single-agent or multi-agent: what the 2026 deployment record actually says

    The 2025–2026 deployment record shows single-agent architectures win on accuracy, cost, and MTTD below roughly 12 tool-domains. Multi-agent only pays back above that threshold, and only when inter-agent state is bounded by a shared structured artifact.

  • Agentic AI in legal services: what survives the billable-hour decomposition

    Three of the six billable-hour sub-tasks capture durable value with agentic AI. Two increase malpractice risk vs a junior-associate equivalent at the same time-to-delivery. One is bounded by conduct rules, not technology. The evidence from AmLaw 100 deployments now allows a clear-eyed breakdown.

  • IBM Watson Health and the change-management variable: what the canonical failure tells procurement

    IBM Watson Health launched in 2015 with a $5 billion-plus investment trajectory and was sold to Francisco Partners in 2022 at roughly a fifth of that. The technology was substantively functional; the organisational integration was not. RAND Corporation's 2024 study (n=65 senior data scientists) puts the AI-project failure rate at approximately 80%, dominated by organisational rather than technical causes. The procurement-deck implication is operational: the change-management variable belongs in the discovery phase upstream and in the procurement decision itself, not as a post-deployment afterthought when the named-owner question surfaces at audit.

  • IT operations and agentic AI: why this team is the highest-exposure workforce population

    The enterprise IT operations workforce is structurally the highest-exposure population to autonomous-action AI. The task surface that defines the IT-ops role family — incident triage, configuration management, ticket processing, routine diagnostics, scripted remediation — maps onto the agent-class capability boundary more directly than any other large enterprise job-family. Public-sector workforce data places IT-ops roles at the top of both the displacement and the role-transformation lists. The procurement-deck question for the CIO is not whether the IT-ops role mix changes but on what timeline, against which named roles, and whether the transition posture is agent-orchestration or agent-replacement.

  • AgentFlayer and the cross-agent prompt-injection class: what the vendor-response split tells procurement

    Zenity Labs disclosed the AgentFlayer class of zero-click cross-agent prompt-injection attacks at Black Hat USA in August 2025, two months after the related EchoLeak CVE-2025-32711 was published in June 2025. Both describe a structural failure mode of agentic AI rather than incidental bugs. The procurement-relevant signal is the vendor-response split: which platforms patched and named a response-SLA versus which classified the disclosed behaviour as 'intended functionality'. The split is answerable in writing before the contract closes; the cost of finding out post-deployment is the IBM-grounded breach-cost line plus an audit trail nobody at the procuring enterprise can defend.

  • EU AI Act Article 50: the disclosure UX that actually satisfies the 2 August 2026 transparency obligation

    Article 50 of the EU AI Act takes effect 2 August 2026 and creates four distinct transparency obligations across chatbot interactions, deepfake content, biometric categorisation, and emotion recognition. Most enterprises have absorbed the legal text without designing the disclosure UX it requires. The procurement-defensible posture is to specify the UX patterns up-front because the deadline does not allow for retrofit.

  • Agent identity at the IAM and Kubernetes layer: the 2026 control-plane decision tree for non-human identity

    The conceptual case for non-human identity for AI agents was made in the corpus at AM-029. The implementation cut — which IAM control plane fits which agent topology — was deferred. This piece walks the four major IAM platforms (Okta NHI, Microsoft Entra ID Workload Identities, Auth0, Keycloak), the Kubernetes-native option (SPIFFE/SPIRE), and the AWS-native option (IAM Roles Anywhere), with a vendor-neutral decision tree that maps deployment topology to control plane.

  • Pharma and life sciences agentic AI in 2026: the 21 CFR Part 11, GxP, EMA, and EU AI Act playbook

    Pharma agentic AI inherits five regulatory regimes simultaneously: 21 CFR Part 11, GxP under GAMP 5, EMA Annex 11 (now in 2025-2026 revision), the EMA AI reflection paper, and the EU AI Act. The audit substrate that satisfies any one of them does not by default satisfy the others. The 2026 procurement gap is treating the regimes as substitutable.

  • Agent red-teaming in 2026: the OWASP Agentic Top 10 companion, the four disciplines, and the evidence model

    The OWASP Agentic Top 10 names what to defend against. It does not say how to test that the defences work. The 2026 enterprise red-team for agentic systems is a distinct discipline from generalised pen-testing, with its own methodology, tooling, and evidence model. Most enterprises run the wrong test and pass.

  • 90 days to EU AI Act enforcement: what the corpus says enterprises still haven't done

    Ninety-one days to 2 August 2026. The publication has tracked eleven enterprise claims against the EU AI Act enforcement window. Four operational-evidence claims are at material risk of moving to Partial in Q3. The governance-process work is mostly done; the operational-evidence work mostly is not. Articles 9, 12, and 26 require the second.

  • Works councils and the EU AI rollout: why deployments stall before they fail

    AI agent deployments in EU jurisdictions with co-determination law need works council consent before they touch employee work. Most US-headquartered AI vendors do not yet have a customer-success workflow for this, producing stalled rollouts that read as 'vendor delay' but are actually compliance gaps.

  • The AI policy void at major pension funds in 2026

    Trillion-dollar capital pools have written position papers on board diversity, executive pay, and climate, but on AI specifically the largest sovereign-wealth and pension funds have published almost nothing. The absence is a structural signal that public-company AI strategies are being rated against expectations the funds have not committed to in writing.

  • Healthcare agentic-AI governance: HIPAA, FDA classification, and the licensure fiction CIOs must reconcile

    Healthcare agentic-AI sits across HIPAA, FDA software-as-medical-device guidance, and state-board licensure rules. The three regimes do not compose cleanly. Five controls reconcile them.

  • D&O insurance and the AI-supervision claim: where Caremark meets agentic AI in 2026

    A class of derivative actions is forming around board failure to supervise AI deployments, and D&O carriers are responding at renewal with explicit AI questionnaires and emerging exclusions. This is the board-level liability surface most directors have not yet read in their actual policy language.

  • Agent SLA architecture: what 'production-ready' actually means for autonomous, non-deterministic actors

    Traditional SLAs were drafted against deterministic systems. Autonomous agents produce variable outputs by design. The four metrics that actually work for agents are action-bounded availability, MTTD-for-Agents, output-distribution drift, and per-class action error budget. Vendors that cannot expose these are not yet production-ready.

  • The retraining gap: what the surviving 70% need to learn after AI displaces 30% of a function

    Enterprises planning the headcount-reduction half of an agentic-AI rollout are systematically under-budgeting the upskilling cost for the residual workforce. The skills the AI replaces are not the skills the survivors need.

  • Agentic-AI insurance and underwriting: the 2026 coverage gap CIOs and CROs should surface before renewal

    The 2026 insurance market does not yet offer agent-specific E&O policies in mature form. Existing cyber and tech-E&O wordings were drafted against human-error and software-defect risk models that do not cleanly map to autonomous reasoning actors.

  • Data residency for agentic AI: what CIOs must ship before EU AI Act enforcement on 2 August 2026

    Agentic-AI residency obligations are not cleanly inherited from GDPR cross-border practice. Context windows, retrieval indexes, and reasoning traces create new categories of personal-data processing that have to be located, documented, and (for high-risk deployments) data-resident inside the EEA before Article 16 enforcement opens.

  • Agent observability stack: the four layers production agentic-AI actually needs (and what each one misses)

    Production agentic-AI in 2026 needs four observability layers: infrastructure, LLM-call, trace, and output. Most enterprise deployments instrument only the cheaper subset. The failure modes the missing layers catch are the ones that produce the next regulatory enforcement headline.

  • Agent incident response: the six-step playbook for when an autonomous-AI deployment breaks production

    Traditional IT incident response was built for deterministic systems with binary failure modes. Agent incidents are non-binary (partial, intermittent, reasoning-dependent), and the standard runbook does not cover six of the steps the response now requires. This is the CIO playbook for when an agent breaks in production.

  • Why this publication has a ledger — and the analyst sites it benchmarks against don't

    The single structural feature that distinguishes this publication from every site a senior IT leader currently subscribes to is a public claim ledger. None of the named comparables — Stratechery, The Information, the Substack analyst stack, the Big-4 research blogs, Gartner, Forrester, IDC — maintain one. The reason is not negligence.

  • The AI-author signature decision: why this publication signs every piece 'Written by Claude · Curated and signed by Peter'

    Five publishable byline formats exist for AI-authored enterprise commentary in 2026. Four are in active use across the analyst-publication category. This site picked the fifth, and the choice is the second-most-consequential editorial decision after the claim ledger.

  • Learning AI by doing AI: 90 days of measured rework across two ventures

    Rework rate, measured as deletions over total churn, ran from 8.1% on Rhino-basketball to 13.5% on agentmodeai across the same 90-day window. The number is meaningfully lower than typical solo-developer projects but substantially higher than the 'AI codes it once correctly' marketing narrative implies. The data is the evidence, not the framing.

  • Offensive security and the clockspeed gap: why CIOs cannot defend AI-era threats with defensive-only postures

    AI did not just give attackers new tools. It gave them a faster OODA cycle. The senior IT leader running a defensive-only posture in 2026 is running at human clockspeed against attackers running at agent clockspeed. The gap is the risk.

  • Claude Mythos: what 'too dangerous to release' means for your risk appetite and cyber posture

    Anthropic announced a model that found thousands of zero-days, then withheld it from public release. Two weeks later, unauthorized users were inside it. The threat model senior IT leaders were planning for in 2028 just arrived in Q2 2026.

  • The State of Enterprise Agentic AI 2026

    An aggregate analytical report on enterprise agentic AI in 2026, drawing from approximately 60 tracked claims. The deployment record is bimodal, the vendor landscape converged to four credible plays, the governance gap is structural, and the EU AI Act enforcement window opens 2 August 2026. The defining variable for the year is deployment discipline, not model capability.

  • Retail and logistics AI agents: the 2026 deployment patterns

    Five retail and logistics agentic AI workflow patterns with different governance properties: customer service (Klarna failure mode), inventory forecasting, dynamic pricing (antitrust exposure), supply-chain orchestration, returns and fraud detection. Augmentation beats replacement; the headcount-replacement framing has produced reversals.

  • Public sector agentic AI: the 2026 procurement constraints

    Five constraints that materially narrow public-sector agentic AI procurement in 2026: FedRAMP authorisation, sovereign data residency, procurement transparency, administrative-law accountability, FOIA-equivalent audit-log disclosure. The NYC MyCity case is the canonical failure.

  • OWASP Agentic AI Top 10: the enterprise walkthrough

    A walkthrough of the OWASP Agentic Security Initiative's 10 threat classes for enterprise security teams. Each class mapped to a specific control, a specific GAUGE dimension, and a specific MTTD-for-Agents detection-time target.

  • NIST AI RMF mapping for enterprise agentic AI

    Mapping the NIST AI Risk Management Framework's four functions (Govern, Map, Measure, Manage) onto enterprise agentic AI deployment work. The same artefacts that satisfy EU AI Act Article 9 cover NIST AI RMF substantially. The reverse mapping requires more work.

  • Multi-agent architecture playbook for enterprise AI

    Three orchestration patterns for enterprise multi-agent systems (hierarchical, peer-to-peer, broker-mediated) with materially different governance properties. The choice is not a free architectural decision under EU AI Act Article 9; broker-mediated is the 2026 default for high-risk deployments.

  • MCP and the coming standard for enterprise agent tooling

    Model Context Protocol reached enterprise procurement gravity in 18 months. The 10,000+ active public servers, adoption by ChatGPT, Cursor, Gemini, Copilot, and VS Code, and the December 2025 Linux Foundation donation made MCP a tooling-layer choice that ripples through every adjacent agentic-AI decision. The procurement question is not whether to adopt; it is which servers, which scopes, and how cross-agent delegation gets governed.

  • HIPAA-compliant agentic AI: the 2026 healthcare playbook

    Four conditions for HIPAA-compliant agentic AI deployment in U.S. healthcare in 2026: BAA covering the agent workflow, dual-purpose audit log structure, PHI flow mapping under minimum necessary, clinical-correctness drift monitoring. Anthropic's three-cloud BAA position is structurally distinct.

  • The Head of AI Governance role specification, 2026

    The role specification for the Head of AI Governance: six accountabilities, executive-committee reporting line, $250K-$1.2M compensation range, 60% F100 adoption per Forrester. The single strongest predictor of enterprise readiness.

  • EU AI Act Article 12 audit-evidence template for agentic AI

    A 14-field audit-evidence template that operationalises EU AI Act Article 12 record-keeping requirements for agentic AI deployments. Captures every agent decision in regulator-queryable form. Designed for under-4-business-hour evidence assembly.

  • EchoLeak and the cross-agent prompt-injection class

    EchoLeak (CVE-2025-32711) is not a Microsoft 365 Copilot bug. It is the canonical example of a class of attacks affecting any architecture where an agent ingests untrusted content and has tool surfaces capable of exfiltration. Closing the class requires architectural separation, not point-fixes.

  • Centralized vs federated AI governance: the 2026 design choice

    Three AI governance organisational models (centralised, federated, hybrid) with materially different scaling and compliance properties. Hybrid is the dominant Fortune 500 pattern in 2026. The right model depends on deployment count, regulatory exposure, and existing risk-management maturity.

  • When AI writes about AI: the case for tracked claims

    Most enterprise-AI publications hide their AI use. A few disclose it. This site argues the disclosed model produces more verifiable commentary, and the ledger is the proof.

  • The AI agent risk register: 2026 enterprise template

    A 12-column risk register template that operationalises EU AI Act Article 9 and NIST AI RMF Manage. Integrates threat surface, controls, audit substrate, and kill-criterion enforcement into a single living artefact owned by the Head of AI Governance.

  • The agentic AI readiness diagnostic: 10 questions for the high-performing tail

    10 questions auditing the operating profile of the high-performing 6-12% enterprise agentic AI cohort. Answer 8 to 10 YES for the high-performing tail. Answer 4 or fewer YES for the operating profile of the 88-94% struggling segment.

  • Six documented agentic AI failure cases and what they teach

    Six publicly documented agentic AI deployment failures from 2024-2025: Air Canada, NYC MyCity, Replit, Cursor, Klarna, DPD. Three structural failure modes, mapped to the seven-control surface. The pattern is consistent enough to use as a procurement filter.

  • A2A protocol: enterprise agent-to-agent interoperability

    The A2A (Agent2Agent) protocol is the most credible 2026 candidate for cross-vendor agent interoperability. MCP handles agent-to-tool; A2A handles agent-to-agent. Adoption trajectory points to deployment-grade stability in H2 2026 with widespread enterprise rollout in 2027.

  • The EU AI Act and agentic AI: what August 2026 actually requires

    The 2 August 2026 enforcement deadline applies high-risk-system obligations to most enterprise agentic AI deployments operating in EU jurisdiction. The operational scope is broader than the Annex III categories suggest, and the compliance gap most enterprises face is structural. Building the evidence layer post-hoc is the failure mode.

  • The enterprise agentic AI governance playbook for 2026

    Most enterprise agentic AI governance in 2026 is compliance theater. The board sees an EU AI Act map; the deployments shipping out of IT ops have no.

  • Agentic AI in financial services: five frameworks

    Financial services sit at the intersection of DORA, NIS2, MiFID II, EU AI Act, and GDPR. Agentic AI inherits every obligation. The sector playbook.

  • The unverified citation chain: where enterprise AI decisions actually come from

    Vendor claims reach CIO procurement decisions through a four-link chain: earnings call to analyst note to trade press to board deck. No link in that.

  • Agentic AI got real in Q1 2026. Most enterprise charters were written for a different quarter.

    Gartner said 28%. Stanford said 62%. Unit 42 said the prompt-injection attacks are now in the wild at commercial scale. Three data points, one quarter.

  • DMAIC for agentic AI deployment: why the 87% / 27% success gap reflects measurement discipline, not methodology

    Six Sigma organisations report 87% success with agentic AI against 27% for organisations without. The obvious reading is that DMAIC accelerates AI. The honest reading is that the causation runs the other way.

  • Multi-agent systems in manufacturing: the 30% downtime claim, examined

    The 30% reduction in unplanned downtime is the most-cited single figure in manufacturing AI. The 2026 case-study record supports it, but only for a narrow architectural pattern. What the underlying studies actually measured, and where the figure gets over-cited.

  • Agentic AI Centers of Excellence: who actually staffs them, who doesn't

    The Agentic AI CoE pattern across enterprise IT in 2026. Where the model works, where it stalls, and the staffing realities — function lead, evaluation owner, governance interface — that determine which side a deployment lands on.

  • Agentic-AI action-approval gates: the CISO control set for autonomous-actor authority

    AI agents now hold action authority over vendor payments, procurement approvals, and contract steps in production enterprise deployments. Current segregation-of-duties controls were built for human approvers and static service accounts; neither shape fits an autonomous reasoning actor. The CISO control set is a four-part bundle: action-approval gates by blast radius, kill-switch protocols, decision-audit trails, and per-action revocation.

  • Why your agentic-AI deployment needs an AI Training Lead

    The AI Training Lead — the human who curates training data, evaluates model outputs, and tunes prompts — has quietly become a budget-line for enterprise agentic-AI deployments. Domain experts tend to outperform pure-ML hires in the role. CIOs that do not budget for it see their projects fail at the integration boundary.

  • AI readiness in organizations: The 2024-2025 landscape

    Global AI spend is on track for $644 billion, yet only 9% of firms have reached true AI maturity — and 30% of generative-AI pilots will be abandoned.

What we're watching next

  • EU AI Act enforcement letters from member-state competent authorities after the 2 August 2026 deadline. The first wave of enforcement actions will surface what the Act actually polices in practice. Several pillar articles assume an aggressive enforcement posture; others assume a permissive one. The early letters will sharpen both.
  • EDPB guidance specific to agentic AI context windows and reasoning traces. The board has published on AI generally but not on agent-specific data flows. Targeted agent guidance is the most likely 2026-2027 development that would shift the data-residency analysis from speculative to settled.
  • ISO/IEC 42001 certification audit reports becoming public. Once enterprises start publishing ISO 42001 certificates as procurement signals, the gap between the standard's text and what audits actually require becomes visible. Pillar pieces on certification effort estimates will need updating.
  • Major frontier-vendor governance disclosures shifting on Responsible Scaling / Preparedness frameworks. Anthropic and OpenAI publish capability evaluations; Google and Microsoft are expected to follow. Disclosure cadence becoming a vendor-comparison axis would change the procurement playbook.

Primary sources we trust for this topic

A curated list of primary research, regulator guidance, and vendor documentation for agentic AI governance. Populated on the quarterly refresh: not a link dump, not competitors.


This pillar page is refreshed quarterly. Last refresh: 19 Apr 2026. Next refresh: 18 Jul 2026.

Vigil · 44 reviewed