AM-111 · published 29 Apr 2026 · revised 29 Apr 2026 · 9 min read · Risk & Governance

Agent incident response: the six-step playbook for when an autonomous-AI deployment breaks production

Traditional IT incident response was built for deterministic systems with binary failure modes. Agent incidents are non-binary — partial, intermittent, reasoning-dependent — and the standard runbook does not cover six of the steps the response now requires. This is the CIO playbook for when an agent breaks production.

Partial · reviewed 29 Apr 2026 · next review +60d
Rewrite in progress

This piece predates the current editorial standard and is in the rewrite queue. The body below is retained for link integrity while the new analysis is prepared. When the rewrite ships, the claim (AM-111) moves from Partial to Holding and the update is dated in the correction log.

Air Canada lost a chatbot case in February 2024 (Moffatt v. Air Canada, 2024 BCCRT 149). The airline’s chatbot had told a grieving passenger he could apply retroactively for a bereavement fare; the airline’s policy said otherwise; the BC Civil Resolution Tribunal held the airline liable and explicitly rejected the argument that the chatbot was a separate legal entity. Two years on, the case is still the most-cited example of the legal premise underneath every enterprise agent deployment: the agent’s output is the enterprise’s output. There is no jurisdictional shield in the agent layer.

That premise is what makes agent incident response a different discipline from standard SRE incident response. The standard runbook (PagerDuty alert, on-call engineer, diagnose, rollback) was designed against deterministic systems where the same input produces the same output and the failure mode is binary. Agent failures break those assumptions. The same prompt at two different times can produce different outputs. Rollback is the wrong primitive when the harm has already propagated into a customer interaction or a vendor contract that must now be honoured. Stack traces are unhelpful when the failure lives in a reasoning chain rather than a code path.

This piece sets out the six steps an enterprise agent-incident playbook adds on top of the standard SRE scaffolding. The Google SRE handbook incident-response chapter (sre.google) remains the right scaffolding for the human-coordination side; the six steps below are the agent-specific overlay.

Report: where agent incidents have already happened publicly

Three public cases set the legal and operational reference points.

Moffatt v. Air Canada (February 2024). The BC Civil Resolution Tribunal held Air Canada liable for incorrect bereavement-fare guidance its chatbot had given a passenger (decision text). The airline’s argument that the chatbot was a separate legal entity was rejected. The decision establishes a foundational point for every customer-facing agent deployment: outputs the agent produces are outputs the enterprise has produced, for the purposes of liability.

Mata v. Avianca (June 2023). The Southern District of New York sanctioned two attorneys who had filed a brief containing six citations to nonexistent cases a legal-research AI had fabricated (court order, Castel J.). The operational lesson is the verification gap between AI output and human submission. An enterprise agent that produces output destined for a court, a regulator, or a counterparty is in the same structural position; verification is a process step, and its absence is a process failure not a model failure.

Zillow Offers wind-down (November 2021). Zillow announced on its Q3 2021 earnings call (Zillow IR release) that it was shutting down its iBuying division after its AI pricing models produced significant inventory write-downs; the wind-down included a workforce reduction of approximately 25% of staff. The failure mode was not a single bad output; it was systematic drift in the model's behaviour over months as housing-market dynamics shifted, with no detection threshold that fired before the financial damage compounded.

The three cases cover the failure space: customer-facing agent producing wrong guidance under contract law (Air Canada), research agent producing fabricated outputs that bypassed verification (Mata), and decision-making agent drifting against changing input distributions until financial damage compounded (Zillow). Each maps to a different gap in standard SRE incident response.

Observe: the six steps a standard runbook does not cover

The six steps below assume the standard SRE coordination scaffolding is already in place: an incident commander, a communications lead, a scribe, severity classification, the command-channel discipline that PagerDuty, Atlassian, and Datadog incident-response process docs codify (PagerDuty incident response documentation, Atlassian incident handbook). The six are the agent-specific overlay on top of that scaffolding.

Step 1: action-class containment before root-cause analysis. The standard SRE instinct is to diagnose first and contain second; for agent incidents the order reverses. The agent is still capable of taking new actions while the response team investigates the existing ones. The first move is to pause the class of action that broke (financial-system writes, customer-facing communications, code-repository commits) without taking the agent fully offline. The action-authority control set at /your-ai-agents-just-approved-2-7m-in-vendor-payments-and-other-nightmares-keeping-cisos-awake/ (claim AM-063) is the prerequisite: deployments that hold only a hard-stop kill-switch are forced to choose between operational continuity and incident response.
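A minimal sketch of what action-class containment can look like at the tool-call dispatch layer, assuming a dispatch point the enterprise controls. The class names, registry API, and agent IDs here are illustrative, not any particular framework's interface:

```python
from enum import Enum

class ActionClass(Enum):
    FINANCIAL_WRITE = "financial_write"
    CUSTOMER_COMMS = "customer_comms"
    REPO_COMMIT = "repo_commit"
    READ_ONLY = "read_only"

class ContainmentRegistry:
    """Per-agent, per-class pause flags, checked before every tool call."""
    def __init__(self):
        self._paused: set[tuple[str, ActionClass]] = set()

    def pause(self, agent_id: str, action_class: ActionClass) -> None:
        self._paused.add((agent_id, action_class))

    def resume(self, agent_id: str, action_class: ActionClass) -> None:
        self._paused.discard((agent_id, action_class))

    def is_allowed(self, agent_id: str, action_class: ActionClass) -> bool:
        return (agent_id, action_class) not in self._paused

registry = ContainmentRegistry()

def dispatch_tool_call(agent_id: str, action_class: ActionClass, call):
    # Containment runs before root-cause analysis has even started:
    # the broken class is paused, everything else keeps operating.
    if not registry.is_allowed(agent_id, action_class):
        raise PermissionError(
            f"{action_class.value} paused for {agent_id} (incident containment)"
        )
    return call()

# During an incident: pause only the class that broke.
registry.pause("invoice-agent-7", ActionClass.FINANCIAL_WRITE)
```

The design choice that matters is that the pause is per class and per agent, so containment never forces the continuity-versus-response trade-off that a hard-stop kill-switch creates.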

Step 2: reasoning-trace forensics. A standard SRE incident produces a stack trace; an agent incident produces an output that was wrong and the question of why. Reconstructing the answer requires the prompt the agent received, the intermediate conclusions it reached, the tool calls in sequence, the data each call returned, and the final output. The reasoning trace is the equivalent of the stack trace and is not produced by default in most 2026 deployments. The Layer 3 work in the IAM extension at /non-human-identity-ai-agents/ (claim AM-037) is what makes the trace recoverable; without it, post-incident analysis runs on inference rather than evidence.
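A sketch of the trace record the forensics step needs, assuming the agent loop is instrumented in-house. Field names and the JSONL persistence are illustrative; the point is that every element of the reasoning chain is captured and queryable by action ID:

```python
import json
import time
import uuid
from dataclasses import dataclass, field, asdict

@dataclass
class TraceEvent:
    kind: str      # "prompt" | "conclusion" | "tool_call" | "tool_result" | "output"
    payload: dict
    ts: float = field(default_factory=time.time)

@dataclass
class ReasoningTrace:
    action_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    agent_id: str = ""
    events: list = field(default_factory=list)

    def record(self, kind: str, **payload) -> None:
        self.events.append(TraceEvent(kind, payload))

    def persist(self, fh) -> None:
        # Append-only JSONL, retained >= 90 days, indexed by action_id.
        fh.write(json.dumps(asdict(self)) + "\n")

trace = ReasoningTrace(agent_id="invoice-agent-7")
trace.record("prompt", text="Reconcile vendor invoice ...")
trace.record("tool_call", tool="erp.lookup_invoice", args={"invoice_id": "..."})
trace.record("tool_result", tool="erp.lookup_invoice", rows=1)
trace.record("output", text="Approved payment of ...")
```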

Step 3: blast-radius reconstruction across downstream agents and systems. A bad output from one agent rarely stays put. Another agent consumes it as input; a downstream system (CRM, ERP, ticketing) writes it as a record; a customer-facing surface displays it. The incident commander identifies every consumer and decides per consumer whether to stop, correct, or notify. The agent-to-agent delegation chains the IAM extension makes queryable are the primary input; without them the reconstruction is best-effort and the next-day surprise (a vendor calling about a contract step the enterprise did not know an agent had taken) is common.
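Blast-radius reconstruction is, mechanically, a graph walk over the dependency data the IAM layer makes queryable. A sketch under that assumption, with a hypothetical consumer graph:

```python
from collections import deque

# consumers[x] = agents and systems that ingest x's output (hypothetical graph)
consumers = {
    "pricing-agent": ["quote-agent", "crm"],
    "quote-agent":   ["email-agent", "erp"],
    "email-agent":   ["customer-inbox"],
}

def blast_radius(origin: str) -> list[str]:
    """Every downstream consumer reachable from the agent that misfired."""
    seen, order, queue = {origin}, [], deque([origin])
    while queue:
        node = queue.popleft()
        for consumer in consumers.get(node, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order

# Incident commander's first question: who consumed the bad output?
print(blast_radius("pricing-agent"))
# -> ['quote-agent', 'crm', 'email-agent', 'erp', 'customer-inbox']
```

Each node the walk returns gets a per-consumer decision: stop, correct, or notify.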

Step 4: stakeholder notification with the specific failure mode named. Standard incident communication says “we experienced an outage from time T1 to T2.” Agent incidents need failure-mode vocabulary: hallucination, tool-routing error, authorisation misuse, instruction-following failure, or reasoning drift (Zillow-class). The specificity matters because the affected parties’ legitimate questions differ by class. A customer who received a hallucinated answer needs the corrected answer; a counterparty who received a contractually significant action needs to know whether the action stands. The Air Canada decision is precedent that obscuring the source does not transfer liability.
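One way to make the failure-mode vocabulary operational is to key the pre-cleared notification templates on it. A sketch with illustrative mappings; the real content comes from legal-counsel review (pre-incident artefact 4 below):

```python
# Hypothetical mapping from failure-mode class to what each audience
# legitimately needs to know. Entries here are placeholders.
FAILURE_MODES = {
    "hallucination": {
        "customer_needs": "the corrected answer, with the wrong one identified",
        "counterparty_needs": None,
    },
    "authorisation_misuse": {
        "customer_needs": None,
        "counterparty_needs": "whether the contractually significant action stands",
    },
    "reasoning_drift": {
        "customer_needs": "scope of decisions affected and the review window",
        "counterparty_needs": "whether affected decisions will be re-run",
    },
}

def notification_body(failure_mode: str, audience: str) -> str:
    needs = FAILURE_MODES[failure_mode].get(f"{audience}_needs")
    if needs is None:
        return ""  # this audience is not affected by this failure class
    # Name the failure mode explicitly: "outage from T1 to T2" is not enough.
    return f"Failure mode: {failure_mode}. What you need to know: {needs}."
```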

Step 5: regulatory exposure assessment for in-scope deployments. EU AI Act Article 73 obliges providers of high-risk AI systems to report serious incidents to the relevant market-surveillance authority within timeframes that vary by severity (EU AI Act Regulation 2024/1689, Article 73). Sector regulators add their own obligations: UK FCA on financial services, OCC and FFIEC member agencies on US banking, FDA on regulated medical software. CISA’s AI security guidance and ENISA’s Multilayer Framework for Good Cybersecurity Practices for AI reinforce comparable expectations on incident traceability. The artefacts the regulator will request (per-agent audit trail, reasoning trace, blast-radius reconstruction) are the outputs of steps 2 and 3.
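A sketch of the regulatory-exposure matrix step 5 consults, assuming it was built before the incident (pre-incident artefact 5 below). The obligations, deadlines, and contact paths here are placeholders to show the shape, not legal guidance:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ReportingObligation:
    regulator: str
    trigger: str         # which incident classes are in scope
    deadline: timedelta  # from incident confirmation, per the regime
    contact_path: str

# Hypothetical matrix; populated with counsel per deployment and jurisdiction.
MATRIX = {
    "loan-pricing-agent": [
        ReportingObligation(
            regulator="EU market-surveillance authority (AI Act Art. 73)",
            trigger="serious_incident",
            deadline=timedelta(days=15),  # placeholder; Art. 73 varies by severity
            contact_path="legal team -> national authority portal",
        ),
    ],
}

def filings_due(deployment: str, incident_class: str, confirmed_at: datetime):
    """Which regulators must be notified, by when, and through whom."""
    return [
        (o.regulator, confirmed_at + o.deadline, o.contact_path)
        for o in MATRIX.get(deployment, [])
        if o.trigger == incident_class
    ]
```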

Step 6: selective re-enable with degraded-mode guardrails. The agent returns with its action surface narrowed, not with full authority restored. The action class that broke stays paused or moves behind an additional human-in-loop gate. The reasoning-trace retention pipeline is verified. The kill-switch primitives are tested. The detection thresholds are tuned to the signal that preceded the incident. The NIST AI RMF Generative AI Profile (NIST AI 600-1) provides the control vocabulary; the Anthropic Responsible Scaling Policy disclosures and OpenAI’s system-card publication pattern are public-domain models for what a postmortem of a model-side incident reads like.
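Selective re-enable can be expressed against the same hypothetical registry from the step 1 sketch: every action class except the one that broke resumes, and the broken class returns only behind a human-in-loop gate:

```python
# Reuses ActionClass and registry from the step 1 sketch (hypothetical names).
# require_approval is whatever approval queue the enterprise already runs.

def reenable_degraded(agent_id: str, broken_class: ActionClass, require_approval):
    """Return the agent to service with its action surface narrowed."""
    # Every class except the one that broke returns to autonomous execution.
    for action_class in ActionClass:
        if action_class is not broken_class:
            registry.resume(agent_id, action_class)

    # The broken class stays paused in the registry; new calls to it route
    # through a human-in-loop gate instead of autonomous dispatch.
    def gated_dispatch(call):
        if not require_approval(agent_id, broken_class):
            raise PermissionError(
                f"{broken_class.value} awaiting human approval for {agent_id}"
            )
        return call()

    return gated_dispatch
```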

Steps are sequential in the canonical case but overlap in practice. Steps 1 and 5 frequently run in parallel; steps 2 and 3 are interleaved; step 6 is the most-skipped step in incidents that get declared resolved before the underlying control work is done.

Reflect: what’s different from traditional SRE

The structural difference comes down to four properties of the failure mode.

  • Non-binarity: an agent can be wrong subtly, partially, intermittently, and standard detection thresholds (latency, error rate, availability) do not fire on the Zillow-class slow drift.
  • Reasoning-dependence: two invocations with the same prompt can produce different outputs, so reproduction (a foundational step in standard SRE diagnosis) is unreliable and the investigation has to work from the reasoning trace.
  • Downstream propagation through other agents: blast radius extends to every consumer of the bad output, and the propagation graph is hard to enumerate without per-agent identity infrastructure.
  • Liability that does not transfer: Air Canada closed the option of treating the agent layer as a third-party shield.

Each property has a partial answer in an adjacent discipline (model-risk management for non-binarity, software-supply-chain practice for downstream propagation, vendor-management for liability). None covers all four. The bundling is what makes the playbook editorially useful.

Share thoughts: a runbook template a CISO can take to the team

The six-step response sequence above is operational only if the pre-incident artefacts are in place. The minimum-viable bundle is five pre-incident artefacts and four post-incident artefacts; without them, the response runs on improvisation.

Pre-incident artefacts (ship before the first incident; a readiness-check sketch follows the list):

  1. Action-class containment registry per agent. Every action class the agent can take, with the technical primitive that pauses each class without taking the agent offline. Reviewed quarterly.
  2. Reasoning-trace retention pipeline. Prompts, tool calls, intermediate conclusions, and final actions captured for at least 90 days, queryable by action ID. Retention extended to the financial-records horizon for transaction-bearing deployments.
  3. Agent-to-agent dependency graph. Which agents consume the outputs of which other agents. Updated as deployments change.
  4. Incident communication template per audience (customer, counterparty, regulator, board) with the agent-failure-mode vocabulary populated and legal-counsel review pre-cleared.
  5. Regulatory-exposure matrix. Which deployments are in-scope of which reporting obligations, with contact path and timeframe per obligation.
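The bundle becomes enforceable when it is machine-checkable before an agent ships. A minimal sketch; the probe bodies are stubs standing in for queries against the real systems:

```python
def check_containment_registry() -> bool:  # artefact 1
    return True  # e.g., every action class has a tested pause primitive

def check_trace_pipeline() -> bool:        # artefact 2
    return True  # e.g., retention >= 90 days, queryable by action ID

def check_dependency_graph() -> bool:      # artefact 3
    return True  # e.g., last updated within the deployment-change window

def check_comm_templates() -> bool:        # artefact 4
    return True  # e.g., counsel sign-off on file per audience

def check_regulatory_matrix() -> bool:     # artefact 5
    return True  # e.g., every in-scope deployment has a contact path

def pre_incident_ready() -> list[str]:
    """Names of artefacts still missing; an empty list means ship-ready."""
    checks = {
        "containment_registry": check_containment_registry,
        "trace_pipeline": check_trace_pipeline,
        "dependency_graph": check_dependency_graph,
        "comm_templates": check_comm_templates,
        "regulatory_matrix": check_regulatory_matrix,
    }
    return [name for name, probe in checks.items() if not probe()]
```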

Post-incident artefacts (a drift-threshold sketch for item 3 follows the list):

  1. Postmortem in the standard SRE format with added sections for failure-mode classification, reasoning-trace summary, and blast-radius graph.
  2. Control review. Did containment work? Did the reasoning trace retain what was needed? Did the dependency graph accurately predict the propagation?
  3. Detection-threshold update. The signal that preceded this incident is now part of the monitoring baseline.
  4. Communication archive. Notifications sent, regulator filings made, responses received. Retained for the financial-records horizon.
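For the detection-threshold update (item 3), a rolling-window monitor over a human-override signal is one way to catch Zillow-class slow drift that latency and error-rate alarms miss. A sketch; the window and threshold values are illustrative and need tuning to the signal that preceded the incident:

```python
from collections import deque

class DriftMonitor:
    """Fires when the rolling override rate exceeds a tuned baseline."""
    def __init__(self, window: int = 500, threshold: float = 0.08):
        self.outcomes = deque(maxlen=window)  # True = human overrode the agent
        self.threshold = threshold            # tuned to the pre-incident signal

    def record(self, overridden: bool) -> bool:
        self.outcomes.append(overridden)
        rate = sum(self.outcomes) / len(self.outcomes)
        # Latency and error-rate alarms never fire on slow drift; this one does,
        # once the window is full and the override rate crosses the baseline.
        return len(self.outcomes) == self.outcomes.maxlen and rate > self.threshold
```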

A CISO who can hand a deployment team this runbook on Monday and have it operational within 6 to 8 weeks is in a small minority of 2026 enterprises. For most, this is a Q2 2026 build task rather than a steady-state capability, which is the point.

Holding-up note

The primary claim of this piece (that the right enterprise playbook for an agent incident in 2026 has six steps that do not appear in any standard SRE handbook, and that CIOs without this playbook will spend their first agent incident discovering it under crisis conditions) is logged at AM-111 on the Holding-up ledger on a 60-day review cadence. Three kinds of evidence would move the verdict:

  • A major published agent-incident postmortem from an enterprise that documents a structurally different response shape. Anthropic and OpenAI publish model-side postmortems; an equivalent on the enterprise-deployment side is what would test the playbook against operational reality.
  • EU AI Act Article 73 enforcement actions after the August 2026 window opens. The first batch of incident-reporting filings will reveal what market-surveillance authorities actually request, and whether the artefacts the six-step playbook produces are sufficient evidence.
  • Standards-body or regulator publication of an explicit agent incident-response framework. NIST AI RMF revisions, the OWASP Agentic AI Top 10, CISA updates, and ISO/IEC AI security standards are the likely sources. A standards convergence on a comparable framework would confirm the structural framing; a divergence would weaken it.

The next review of this claim is scheduled 28 June 2026.


Correction log

  1. 29 Apr 2026. Initial publication. Initial verdict 'Partial': the six-step playbook is a synthesis of current SRE practice and AI-specific guidance and has not yet been tested against a major published agent-incident postmortem.

Spotted an error? See corrections policy →

Disagree with this piece?

Reasoned disagreement is a first-class signal here. Every review cycle weighs documented dissent; material dissent becomes part of the article's change history. This is not a corrections form — use /corrections/ for factual errors.

Part of the pillar

Agentic AI governance

Governance frameworks, oversight patterns, and compliance postures for enterprise agentic-AI deployment. 39 other pieces in this pillar.
