Resource · Operations · RES-003
The Agent Incident Runbook — detect, contain, roll back, post-mortem
A four-phase runbook for agent-mode AI incidents: detection within 4 hours, containment within 30 seconds of confirmed harm, rollback procedures for the seven action classes agents typically take, and a structured post-mortem template aligned to MTTD-for-Agents. Built for SRE teams who already run a standard incident response process and need the AI overlay.
- Version: v1.0
- Last reviewed: 4 May 2026
- For: SRE leads, platform engineering, security operations
- Time: 60–90 min to baseline against your stack
The standard SRE incident response runbook assumes the actor is a system you control: a deployment, a database, a service. An agent-mode AI incident inverts that assumption. The actor is a system that took an action you did not directly authorise, against an external surface, possibly while you were asleep. The standard runbook still works for the surrounding mechanics; what it does not handle is the specific question of how to halt an autonomous agent and reverse what it did between the time it acted and the time you noticed.
This runbook is the AI overlay on standard SRE incident response. It assumes you already have a paging rotation, an incident commander role, a status page, and a post-mortem culture. It adds the four-phase agent-specific procedure: detection within 4 hours, containment within 30 seconds of confirmed harm, rollback for the seven action classes agents typically take, and a structured post-mortem aligned to the MTTD-for-Agents methodology.
Phase 1: Detection (target: 4 hours from agent action to operator awareness)
Detection is the failure point in most agent-mode incidents observed in 2025. The mean time to detect across the available case studies is closer to 14 days than 4 hours. The reason is structural: agents act in single transactions that look identical to authorised activity in the audit log. Without a tripwire designed to fire on agent-distinct patterns, the incident does not appear until a downstream consequence (cost spike, customer complaint, regulatory notice) surfaces it.
The four tripwires that consistently catch agent incidents within the 4-hour window:
- Cost-rate tripwire. Per-agent spend exceeding 2× the rolling-7-day median in any 1-hour window. Pages the agent owner.
- Action-rate tripwire. Per-agent API call rate exceeding 3× the rolling-7-day median, OR call rate against a previously-unused endpoint. Pages the agent owner.
- Outcome tripwire. Customer-visible artifact created by an agent (email sent, document published, ticket closed) where the human review step was bypassed or completed faster than a configured minimum review time. Pages the operations queue.
- Authorisation tripwire. Agent invocation of a tool or API not in its declared inventory. Pages the security team.
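As a rough illustration, the cost-rate, action-rate, and authorisation tripwires above can be sketched as simple threshold checks. This is a minimal sketch, not a reference implementation: the snapshot field names, the page-target strings, and the choice to take the median over hourly samples from the last 7 days are all assumptions made for the example. The outcome tripwire is omitted because it depends on how your review workflow records approval times.

```python
from statistics import median

COST_MULTIPLIER = 2.0    # cost-rate: 2x the rolling-7-day median
ACTION_MULTIPLIER = 3.0  # action-rate: 3x the rolling-7-day median


def exceeds_baseline(hourly_history, current_hour, multiplier):
    """True when the current 1-hour value exceeds multiplier x the history median."""
    if not hourly_history:
        # Assumption: with no baseline the rate check cannot be evaluated,
        # so it stays quiet. A stricter deployment might page instead.
        return False
    return current_hour > multiplier * median(hourly_history)


def evaluate_tripwires(agent):
    """Return (tripwire, page_target) pairs that fired for one agent snapshot."""
    fired = []
    if exceeds_baseline(agent["hourly_spend_7d"], agent["spend_this_hour"],
                        COST_MULTIPLIER):
        fired.append(("cost-rate", "agent_owner"))

    # Action-rate fires on a rate spike OR on any previously-unused endpoint.
    new_endpoints = set(agent["endpoints_this_hour"]) - set(agent["endpoints_seen"])
    rate_spike = exceeds_baseline(agent["hourly_calls_7d"], agent["calls_this_hour"],
                                  ACTION_MULTIPLIER)
    if new_endpoints or rate_spike:
        fired.append(("action-rate", "agent_owner"))

    # Authorisation fires on any tool outside the declared inventory.
    undeclared = set(agent["tools_invoked"]) - set(agent["declared_tools"])
    if undeclared:
        fired.append(("authorisation", "security_team"))
    return fired


# Hypothetical snapshot: 168 hourly samples stand in for the 7-day window.
snapshot = {
    "hourly_spend_7d": [10.0] * 168,
    "spend_this_hour": 25.0,
    "hourly_calls_7d": [100] * 168,
    "calls_this_hour": 120,
    "endpoints_seen": {"/search", "/tickets"},
    "endpoints_this_hour": {"/search"},
    "declared_tools": {"search", "ticketing"},
    "tools_invoked": {"search", "payments"},
}
print(evaluate_tripwires(snapshot))
```

In this snapshot the spend and the undeclared `payments` tool trip their checks, while the call rate stays under the 3× threshold. In practice the baselines would come from your metrics store rather than an in-memory list.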
A subset of incidents will surface only via downstream consequences (regulatory complaint, partner escalation, public disclosure). For these, the detection clock starts when the consequence surfaces, not when the agent acted; the post-mortem flags the missing upstream tripwire.
Phase 2: Containment (target: 30 seconds from confirmed harm to all-agents-halted)
Containment for an agent incident is binary: either the kill-switch fires within 30 seconds, or it does not. There is no graceful drain because the action being prevented is the next agent invocation, not the completion of a long-running request.
The kill-switch must be operationally accessible to at least three roles: the on-call SRE, the security on-call, and the business owner of the affected agent. If only one role can fire it, you have a single point of failure during the worst possible incident class. The 30-second target is the SLA the customer-side party should be able to achieve; the vendor-side kill-switch SLA is a separate question (covered in section 5 of the AI Vendor Security Questionnaire).
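The two checks in this paragraph, role coverage and the 30-second target, lend themselves to an automated audit. The sketch below assumes hypothetical role identifiers and assumes you can record a timestamp for confirmed harm and another for the completed halt; neither comes from a specific tool.

```python
from datetime import datetime, timedelta

# Hypothetical role identifiers for the three roles the runbook names.
REQUIRED_ROLES = {"oncall_sre", "security_oncall", "business_owner"}
CONTAINMENT_SLA = timedelta(seconds=30)


def missing_kill_switch_roles(authorised_roles):
    """Roles the runbook requires that cannot currently fire the kill-switch."""
    return REQUIRED_ROLES - set(authorised_roles)


def containment_met_sla(harm_confirmed_at, all_halted_at):
    """True when the halt completed within 30 seconds of confirmed harm."""
    return all_halted_at - harm_confirmed_at <= CONTAINMENT_SLA


confirmed = datetime(2026, 5, 4, 3, 15, 0)
print(missing_kill_switch_roles(["oncall_sre"]))
print(containment_met_sla(confirmed, confirmed + timedelta(seconds=22)))
```

A non-empty result from `missing_kill_switch_roles` is the single-point-of-failure condition described above and is worth flagging in a periodic configuration check.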
Containment decisions in priority order:
- All-agents-halt. Use when the affected agent has access to high-blast-radius tools (payment, customer communication, production database write) and the failure mode is unclear. Default to all-halt; restore selectively.
- Per-agent-halt. Use when the affected agent is identified and isolated, and other agents in the system are demonstrably unaffected.
- Per-tool-halt. Use when the failure mode is a specific tool the agent is calling incorrectly. Halt the tool; agent continues with reduced capability.
- Per-action-quarantine. Use when the agent’s actions are queued for human approval rather than executing directly. Drain the queue; assess each pending action.
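One way to read the priority order above is as a first-match decision function that falls back to all-agents-halt when nothing narrower clearly applies. The sketch below is an interpretation, not a prescribed algorithm: the `Assessment` fields are hypothetical names for the judgments the incident commander makes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    ALL_AGENTS_HALT = "all-agents-halt"
    PER_AGENT_HALT = "per-agent-halt"
    PER_TOOL_HALT = "per-tool-halt"
    PER_ACTION_QUARANTINE = "per-action-quarantine"


@dataclass
class Assessment:
    high_blast_radius: bool       # payment, customer comms, production DB write
    failure_mode_clear: bool
    agent_isolated: bool          # affected agent identified and isolated
    others_unaffected: bool       # other agents demonstrably unaffected
    faulty_tool: Optional[str] = None
    actions_queued_for_approval: bool = False


def choose_scope(a: Assessment) -> Scope:
    """Walk the containment options in priority order; default to all-halt."""
    if a.high_blast_radius and not a.failure_mode_clear:
        return Scope.ALL_AGENTS_HALT
    if a.agent_isolated and a.others_unaffected:
        return Scope.PER_AGENT_HALT
    if a.faulty_tool is not None:
        return Scope.PER_TOOL_HALT
    if a.actions_queued_for_approval:
        return Scope.PER_ACTION_QUARANTINE
    # Nothing narrower is justified: default to all-halt, restore selectively.
    return Scope.ALL_AGENTS_HALT


print(choose_scope(Assessment(True, False, False, False)).value)
print(choose_scope(Assessment(False, True, True, True)).value)
```

The fallback at the end encodes the "default to all-halt; restore selectively" rule: an assessment that matches no narrower option is treated as unclear.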
Containment is not the end of the incident. It is the moment the rollback clock starts.
Phase 3: Rollback (procedures by action class)
Agents in 2026 take actions across roughly seven classes. Each class has a different rollback profile, and the runbook needs a procedure for each. Rollback is not always possible; where it is not, the runbook must specify the substitute (compensating action, customer notification, regulatory disclosure).
Class 1: Database writes. Standard transactional rollback if the database supports it; otherwise compensating writes. The runbook needs the specific schema and the operator who can authorise the rollback. Time budget: 4 hours from containment.
Class 2: External API calls (payments, bookings, transfers). Idempotent reversal where the API supports it (Stripe refund, Calendly cancel). Where it does not (one-way payment rails, webhook fire-and-forget), the runbook needs the customer-communication template and the financial reserve to absorb the loss. Time budget: 24 hours.
Class 3: Customer communications (emails sent, messages posted). Sent communications cannot be unsent. The runbook needs the follow-up communication template, the legal review SLA for the follow-up, and the impacted-customer list extraction procedure. Time budget: 4 hours for follow-up dispatch.
Class 4: Document creation or publication. Internal documents: delete or mark as superseded. Public documents (blog posts, social media, generated landing pages): unpublish, then assess SEO and link-equity damage; consider 410-Gone vs 301 to a corrections page. Time budget: 1 hour for unpublish, 24 hours for SEO remediation.
Class 5: Code commits or deployments. Standard git revert if the commit is not yet deployed. If deployed: standard rollback procedure plus an audit of every system that ran against the deployed version during the window. Time budget: 30 minutes for revert, 4 hours for audit.
Class 6: Identity or access changes. Provisioned access can be deprovisioned; the audit question is what was done with that access during the window. The runbook needs the access-log query and the incident-extension procedure if downstream actions are discovered. Time budget: 30 minutes for revocation, 24 hours for downstream audit.
Class 7: Knowledge-base or vector-store writes. Embeddings, retrievals, fine-tune dataset additions. Removal is procedural (delete the records); the audit question is what other agents have already retrieved from the corrupted state. Time budget: 4 hours for deletion, 7 days for downstream-effect monitoring.
For each action class, the runbook captures: the rollback procedure, the operator authorised to execute it, the time budget, the substitute action where rollback is impossible, and the verification step confirming rollback succeeded.
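The action-class table can be kept as data so that deadlines are computed rather than remembered. The sketch below encodes the time budgets listed for the seven classes and assumes every budget runs from the moment of containment (the text states this explicitly only for Class 1). The class keys, step names, and entry field names are illustrative.

```python
from datetime import datetime, timedelta

# (step, budget) pairs per action class, taken from the class descriptions.
TIME_BUDGETS = {
    "database_write": [("rollback", timedelta(hours=4))],
    "external_api_call": [("reversal_or_substitute", timedelta(hours=24))],
    "customer_communication": [("follow_up_dispatch", timedelta(hours=4))],
    "document_publication": [("unpublish", timedelta(hours=1)),
                             ("seo_remediation", timedelta(hours=24))],
    "code_commit_or_deploy": [("revert", timedelta(minutes=30)),
                              ("audit", timedelta(hours=4))],
    "identity_or_access_change": [("revocation", timedelta(minutes=30)),
                                  ("downstream_audit", timedelta(hours=24))],
    "knowledge_base_write": [("deletion", timedelta(hours=4)),
                             ("downstream_monitoring", timedelta(days=7))],
}

# The five things the runbook captures for each action class.
REQUIRED_FIELDS = ("procedure", "operator", "time_budget", "substitute", "verification")


def rollback_deadlines(action_class, contained_at):
    """Map each rollback step to its deadline, counted from containment."""
    return {step: contained_at + budget for step, budget in TIME_BUDGETS[action_class]}


def missing_fields(entry):
    """List the required runbook fields that an entry leaves empty."""
    return [field for field in REQUIRED_FIELDS if not entry.get(field)]


contained = datetime(2026, 5, 4, 3, 16)
for step, deadline in rollback_deadlines("code_commit_or_deploy", contained).items():
    print(step, deadline.isoformat())
print(missing_fields({"procedure": "git revert", "operator": "release manager"}))
```

Running `missing_fields` over every entry during the quarterly drill is a cheap way to spot classes whose substitute action or verification step was never written down.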
Phase 4: Post-mortem (MTTD-for-Agents structure)
Standard SRE post-mortems work for agent incidents with one addition: the MTTD-for-Agents detection chain. The detection chain answers the question of why detection happened when it happened, and what would have to change for the next instance of this failure class to be detected within the 4-hour target.
The detection chain has five stages, each recorded as a timestamp:
- Action. When did the agent take the harmful action?
- Signal. When did the system that should have detected it receive the signal?
- Trigger. When did the tripwire fire?
- Page. When did a human get paged?
- Acknowledgement. When did the human start working the incident?
The MTTD is the elapsed time from Action to Acknowledgement. Each gap between adjacent stages is its own root cause. A 14-day MTTD with the entire delay between Action and Signal points to missing instrumentation. A 14-day MTTD with the entire delay between Signal and Trigger points to a tripwire that did not fire. A 14-day MTTD with the entire delay between Page and Acknowledgement points to alerting hygiene. The chain forces the post-mortem to identify which gap to close first.
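The gap analysis is simple arithmetic on five timestamps. The sketch below assumes the post-mortem records one timestamp per point in the chain under the hypothetical keys shown; it computes the overall MTTD and names the largest gap, which is the one to close first.

```python
from datetime import datetime, timedelta

CHAIN = ("action", "signal", "trigger", "page", "acknowledgement")


def detection_gaps(timestamps):
    """Elapsed time between each adjacent pair of points in the chain."""
    return {
        f"{earlier}->{later}": timestamps[later] - timestamps[earlier]
        for earlier, later in zip(CHAIN, CHAIN[1:])
    }


def mttd(timestamps):
    """Elapsed time from the agent's action to human acknowledgement."""
    return timestamps["acknowledgement"] - timestamps["action"]


def largest_gap(timestamps):
    """Name of the gap that contributed most to the MTTD."""
    gaps = detection_gaps(timestamps)
    return max(gaps, key=gaps.get)


# Hypothetical incident: the signal arrived 14 days after the action.
action = datetime(2026, 1, 1, 0, 0)
incident = {
    "action": action,
    "signal": action + timedelta(days=14),
    "trigger": action + timedelta(days=14, minutes=5),
    "page": action + timedelta(days=14, minutes=6),
    "acknowledgement": action + timedelta(days=14, minutes=16),
}
print(mttd(incident))
print(largest_gap(incident))
```

Here nearly all of the delay sits between Action and Signal, which in the terms used above points to missing instrumentation rather than a misfiring tripwire or slow acknowledgement.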
The post-mortem also captures:
- Blast radius. Who or what was affected, in measurable terms (records, dollars, users, regulatory surface).
- Cost. Financial impact, time impact, opportunity impact.
- Vendor implication. Did the vendor’s documented kill-switch SLA hold? If not, this becomes contract-renegotiation material.
- Tripwire delta. What new tripwire would have caught this within the 4-hour detection target? Add it to the detection layer; verify it would have fired against the incident timeline.
- Runbook delta. What rollback procedure was missing or wrong? Update the action-class table.
- Communication record. Who was notified, when, with what message. Particularly: customers, regulators (where mandatory under EU AI Act Article 26 serious-incident reporting), the board (where threshold-triggered).
- Recurrence prevention. The single change that, if made, would prevent this exact incident from recurring. Owner and date.
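For the tripwire delta item, "verify it would have fired" can be done by replaying the proposed tripwire against the recorded incident timeline. The sketch below does this for a rate-style tripwire. It assumes hourly samples, a median baseline from before the incident, and that a 1-hour window can only fire once it has closed; the function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median

DETECTION_TARGET = timedelta(hours=4)


def first_fire(samples, baseline, multiplier):
    """Replay (hour_start, value) samples in time order.

    Returns the close of the first 1-hour window whose value exceeds
    multiplier x median(baseline), or None if the tripwire never fires.
    """
    threshold = multiplier * median(baseline)
    for hour_start, value in samples:
        if value > threshold:
            return hour_start + timedelta(hours=1)
    return None


def would_have_met_target(action_at, samples, baseline, multiplier):
    """True when the replayed tripwire fires within 4 hours of the action."""
    fired_at = first_fire(samples, baseline, multiplier)
    return fired_at is not None and fired_at - action_at <= DETECTION_TARGET


# Hypothetical timeline: the agent acted at 10:30 and spend climbed afterwards.
action_at = datetime(2026, 1, 1, 10, 30)
timeline = [
    (datetime(2026, 1, 1, 10, 0), 5.0),
    (datetime(2026, 1, 1, 11, 0), 30.0),
    (datetime(2026, 1, 1, 12, 0), 40.0),
]
print(would_have_met_target(action_at, timeline, [10.0] * 168, 2.0))
```

A proposed tripwire that fails this replay would not have shortened the incident and should be reworked before it is added to the detection layer.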
Test the runbook quarterly
A runbook that is not tested is not a runbook; it is a document. Quarterly fire-drills against the runbook verify three things: the kill-switch still works, the on-call rotation knows the procedure, and the rollback procedures still match the agent action surface (which evolves as new agent capabilities are deployed).
The drill format that works is a tabletop with the agent-system on-call rotation, presented with a synthetic incident scenario chosen from a rolling list. The exercise confirms that the runbook still reflects reality and surfaces the parts that have drifted since the last test.
Versioning and review
This runbook is on a 60-day review cycle, faster than the other tools in the resources catalog because the agent action surface is moving faster than the regulatory surface. Phase 3 (rollback procedures by action class) will gain new classes as agent capabilities expand into new categories (notably: code-execution agents that ship to production directly, and agents that authorise other agents).
Spotted a missing action class, an unrealistic time budget, or a rollback procedure that does not survive contact with your stack? The corrections policy applies.