Vigil · last review 18h ago · next review cycle 19 May 2026

Every claim this publication has made, and whether it still holds.

The point of writing about enterprise AI is to be right for longer than a news cycle. This page tracks every argument published here, reviewed on a 30–90 day rhythm. If something stops holding, it's marked and the piece is annotated. Nothing is quietly removed.

20 holding
3 partial
0 not holding
Status · Claim · Next review
Holding

AM-020 · pub 31 Jul 2025 · rev 19 Apr 2026

The 40-60% TCO underestimate on enterprise agentic-AI deployments is not a cost-visibility failure — it is a cross-departmental cost-attribution failure. Integration, tokens, maintenance, supervision, and compliance costs land on IT, HR, and Legal budgets that do not reconcile in most organisations, so the CFO sees the bill late and only in part.

Based on 2026 CFO-guide data: €368K realised vs €158K naive estimate, a 40-60% TCO underestimate, 73% of projects exceeding budget by 2.4x, 15-20%/year maintenance, a supervision tax in the thousands per month, 70% of failures traced to change management. Watching for a Big 4 TCO framework or enterprise CFO survey that resolves the cross-departmental framing.
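The headline gap can be sanity-checked in one line of arithmetic, using only the €368K and €158K figures cited above:

```python
# Back-of-envelope check of the cited TCO figures (both numbers from the note above).
true_tco = 368_000   # EUR, realised TCO
naive = 158_000      # EUR, naive pre-deployment estimate

underestimate_pct = (true_tco - naive) / true_tco * 100
print(f"Naive estimate misses {underestimate_pct:.0f}% of realised TCO")  # ~57%, inside the 40-60% band
```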

+54d · next review
Partial

AM-014 · pub 3 Aug 2025 · rev 19 Apr 2026

The ~73% of enterprise agentic-AI projects that fail share three structural gaps — no named owner, scope drift, and missing agent-level MTTD — and the 27% that succeed cluster around the inverse.

Backfilled claim. Body predates current editorial standard; spine holds, per-claim fact-check deferred to first review cycle.

+54d · next review
Partial

AM-016 · pub 27 Jul 2025 · rev 19 Apr 2026

Agent-mediated network management reduces unplanned firewall-change incident costs only when the agent's action log feeds into the same change-management audit trail human changes use — not as a parallel system.

Backfilled claim. Body predates current editorial standard; spine holds, per-claim fact-check deferred to first review cycle.

+54d · next review
Holding

AM-032 · pub 24 Apr 2026 · rev 24 Apr 2026

EU financial-services agentic AI deployments operate under a compounded five-framework obligation surface (DORA, NIS2, MiFID II, EU AI Act, GDPR) that sits on top of general AI governance. Liability does not transfer to the vendor contractually, regardless of SLA language — MiFID II conduct rules, EU AI Act deployer obligations, and DORA third-party-risk provisions place customer-facing and regulator-facing liability on the deploying financial institution. Compliance posture and vendor lock-in are the dominant GAUGE dimensions for the sector, scoring 15-25 points lower than cross-industry averages on first pass.

First piece in planned vertical-industry series. Cluster G anchor. 60-day review cadence. Watches: (1) major ESA (EBA/ESMA/EIOPA) publishing agentic-AI-specific guidance, (2) DORA or EU AI Act enforcement action redefining liability-transfer boundaries, (3) industry-body vendor contract templates closing DORA third-party-risk gap.

+59d · next review
Holding

AM-031 · pub 24 Apr 2026 · rev 24 Apr 2026

The CMU TheAgentCompany 2026 benchmark figure (30.3% task completion for best-in-class frontier model, up from 24% in 2024) is the current capability constraint for enterprise agentic AI. Capability trajectory projects to ~40% by late 2027, which does not cross the 95% production-readiness threshold within the 3-year TCO horizon enterprise business cases operate against. The Stanford DEL 12% durable cohort operates within the 30.3% (narrow scope + human-in-the-loop + GAUGE-dimensional governance discipline), not around it. Capability is not the variable that separates the 12% from the 88%.

Third of three claim-archive signature pieces (after AM-029 Stanford 88% and AM-030 McKinsey 23%). 60-day review cadence. Watches: (1) frontier model crossing 50% on TheAgentCompany without corresponding deployment-pattern change, (2) cross-enterprise analyses showing capability-wait deployments equivalent to governance-discipline deployments, (3) benchmark refresh shifting the easy/medium/hard distribution such that more of the enterprise task space lands in the viable scope envelope.

+59d · next review
Holding

AM-030 · pub 24 Apr 2026 · rev 24 Apr 2026

The McKinsey State of AI 2025 figure (23% of enterprises scaling an agentic AI system, 39% still experimenting) is an operational-preconditions outcome, not a technical-readiness outcome. Four preconditions (agent registry, measured pre-deployment baseline, differentiated change-management playbook for adjacent units, cross-agent threat model at scale) separate pilots that cross into production from pilots that stall. The 6% AI-high-performer segment is the subset of the 23% scaling with additional measurement discipline that makes ROI audit-survivable.

Claim-archive signature piece analysing McKinsey State of AI 2025 (ANA-2026-006). Cross-validated against Stanford DEL ACA-2026-003, Gartner ANA-2026-001/002, CMU ACA-2026-004. 60-day review cadence. Watches: (1) subsequent large-sample datasets showing 23% and 6% compressing toward 39% experimenting, (2) cross-enterprise analyses disproving the preconditions framing, (3) analyst frameworks converging on preconditions-style framing.

+59d · next review
Holding

AM-029 · pub 24 Apr 2026 · rev 24 Apr 2026

The 12/88 bimodal distribution in enterprise agentic AI ROI realisation (Stanford DEL 2026 + cross-validated by Gartner, McKinsey, CMU) is a governance-discipline outcome, not a model-capability outcome. The 12% instrument the six GAUGE dimensions on a 90-day review rhythm; the 88% treat governance as a deliverable to the audit committee. Capability gap (CMU's 30.3% best-in-class task completion) constrains what is possible, not what separates the 12% from the 88%.

Signature piece framing. 60-day review cadence. Watches: (1) a frontier-model generation collapsing the 88%/12% gap without governance change, (2) cross-enterprise studies showing dimensional scoring models don't predict deployment outcomes, (3) regulatory frameworks evolving to score deployment quality beyond risk-tier classification.

+59d · next review
Holding

AM-028 · pub 24 Apr 2026 · rev 24 Apr 2026

Partner — co-development with a vendor on a structured non-standard engagement — is structurally under-chosen in enterprise agentic AI procurement in 2026. Procurement committees have templates for build and buy but none for partner, so the third path is not evaluated on an equal footing. When it is evaluated honestly, the vendor-lock-in and change-management dimensions of the GAUGE framework usually favour partner over buy or build.

Claim scoped to enterprise agentic AI procurement specifically. 60-day review cadence. Watches: (1) aggregate analyses showing partner outcomes statistically indistinguishable from buy, (2) major consultancies adopting three-path templates (Gartner, Forrester, McKinsey), (3) regulatory procurement frameworks structuring partner-style engagements as a distinct third path.

+59d · next review
Holding

AM-027 · pub 24 Apr 2026 · rev 24 Apr 2026

A durable enterprise agentic AI business case requires three specific documents — a TCO model with ten named cost categories (not vendor-supplied line items), an ROI model with a pre-deployment measured baseline and an independent validation round, and a three-scenario risk-adjusted NPV. The single-scenario vendor-framed business cases that dominate 2026 enterprise AI investment committees are the predictable root of the 40%+ projected agentic AI project cancellation rate.

Claim scoped to enterprise agentic AI business cases specifically (not enterprise SaaS generally). 60-day review cadence. Watches: (1) studies showing single-scenario NPVs produce outcomes equivalent to three-scenario, (2) aggregate post-18-month audits reordering the anti-pattern ranking (e.g., compliance understatement dominant over vendor-TCO framing), (3) regulatory changes (EU AI Act review, NIST AI RMF updates) that materially shift compliance-cost dynamics.
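The third document, a three-scenario risk-adjusted NPV, reduces to a short calculation. A minimal sketch: every cash flow, probability, and the 10% discount rate below are hypothetical placeholders, not figures from the piece.

```python
# Three-scenario risk-adjusted NPV: probability-weight the NPV of each scenario.
# All cash flows, probabilities and the 10% discount rate are illustrative placeholders.

def npv(rate: float, cashflows: list[float]) -> float:
    """Discount cashflows; index 0 is the year-0 (undiscounted) outlay."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

scenarios = {                       # (probability, year-0..3 cashflows in EUR)
    "base":     (0.50, [-400_000, 180_000, 220_000, 240_000]),
    "upside":   (0.20, [-400_000, 250_000, 320_000, 350_000]),
    "downside": (0.30, [-400_000,  60_000,  80_000,  90_000]),
}

rate = 0.10
risk_adjusted_npv = sum(p * npv(rate, cfs) for p, cfs in scenarios.values())
print(f"Risk-adjusted NPV: EUR {risk_adjusted_npv:,.0f}")
```

A single-scenario business case is the base row alone; the risk adjustment is the probability weighting across all three rows.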

+59d · next review
Holding

AM-026 · pub 24 Apr 2026 · rev 24 Apr 2026

Generic enterprise SaaS RFPs systematically underweight six agent-specific governance dimensions (governance maturity, threat model, ROI evidence, change management, vendor lock-in, compliance posture). A 60-question RFP layer mapped to the GAUGE framework materially changes vendor selection outcomes by disqualifying vendors whose operational governance will not survive the 18-month enterprise review cycle.

Claim scoped to enterprise agentic AI procurement specifically (not enterprise SaaS generally). 60-day review cadence. Watches: (a) anonymised procurement-committee case studies showing equivalent outcomes from generic RFPs, (b) vendor self-disclosure movements that obviate the RFP artifact, (c) regulatory procurement frameworks (EU AI Act Article 68 public-sector procurement) converging on similar dimensions.

+59d · next review
Holding

AM-025 · pub 24 Apr 2026 · rev 24 Apr 2026

Enterprise agentic AI governance in 2026 fails at the operational layer even when it passes at the compliance layer. Boards receive EU-AI-Act-mapped compliance decks while the agentic deployments actually shipping out of IT ops have no measurable overlap with that deck. Durability requires six instrumented dimensions scored 0–100 (GAUGE framework) with a 90-day setup cadence and a 12-month trajectory target — not a compliance matrix.

Based on April 2026 corpus review of published governance-framework deployments + post-cutover analysis of the 88% failure rate (Stanford DEL ACA-2026-003), the 28% I&O pay-off rate (Gartner ANA-2026-002), and the 40% projected cancellation rate (Gartner ANA-2026-001). 60-day review cadence with explicit watches on (a) cross-enterprise studies testing dimensional scoring's predictive power, (b) analyst firms adopting similar instrumented-dimension models, (c) regulatory frameworks evolving to score deployment quality vs only classify risk tier.
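The six-dimension scoring the claim describes reduces to a small scorecard. A minimal sketch, borrowing the six dimension names listed under AM-026 (governance maturity, threat model, ROI evidence, change management, vendor lock-in, compliance posture); every score is an illustrative placeholder.

```python
# Illustrative GAUGE-style dimensional scorecard: six dimensions, each scored 0-100.
# Dimension names follow the six listed in AM-026; every score here is a placeholder.
scores = {
    "governance_maturity": 62,
    "threat_model":        45,
    "roi_evidence":        70,
    "change_management":   38,
    "vendor_lock_in":      55,
    "compliance_posture":  80,
}

assert all(0 <= s <= 100 for s in scores.values())
overall = sum(scores.values()) / len(scores)
weakest = min(scores, key=scores.get)
print(f"overall {overall:.0f}/100, weakest dimension: {weakest}")
```

The 90-day cadence then re-scores the same six keys and tracks the trajectory, which is exactly what a compliance matrix cannot do.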

+59d · next review
Holding

AM-023 · pub 23 Aug 2025 · rev 19 Apr 2026

The 10 Apr 2026 Google AI Mode rollout to eight markets is the first vertical (restaurant booking) where agentic search reduces named SaaS aggregators (OpenTable, TheFork, ResDiary and five others) to API backends rather than destinations. The template applies to every enterprise-relevant aggregation vertical — business travel, expense management, procurement, ATS, HR service delivery — and incumbents in those verticals have 18-24 months to pick API-backend or destination positioning before agentic search forces the choice.

Based on Google's 10 Apr 2026 rollout (8 markets, 8 partner platforms), Semrush + ppc.land + WinBuzzer coverage, the OpenTable/Reserve-with-Google integration pattern. Review cadence is 60 days with explicit watch on whether a second vertical agentic-search rollout lands before end-2026.

+54d · next review
Holding

AM-024 · pub 20 Apr 2026 · rev 20 Apr 2026

Enterprise-AI decisions in 2026 are made on a citation chain nobody in the chain verifies. The infrastructure gap CIOs face is a verification layer for the claims their procurement runs on — not an information gap. The 88% failure rate in enterprise agentic AI is the predictable output of decision-making on unverified citations, not a capability problem.

Based on 2025-2026 observation of vendor-claim → analyst-note → trade-press → CIO-deck citation chains. Stanford DEL 12/88 bimodal + Gartner 7 Apr 2026 28% I&O pay-off as anchoring evidence. 60-day review cadence with explicit watches on (a) third-party verification infrastructure emerging, (b) RFPs requiring citation-review schedules, (c) our own archive's Weakened-verdict rate.

+55d · next review
Holding

AM-018 · pub 19 Jul 2025 · rev 19 Apr 2026

Agentic AI's compounding economics show up in back-office operations (AP, IT ticket triage, HR onboarding, procurement, close-cycle reconciliation), not in front-office customer-facing workflows. The 12% of deployments that clear 300%+ ROI cluster there for structural reasons: per-action savings × action frequency × task-specification tightness × existing process instrumentation.

Based on Stanford DEL 2026 bimodal distribution (12%/88%), Gartner Q1 2026 28% pay-off rate, OneReach 2026 171% average, Futurum 71% operational median vs 40% high-automation. Anthropic AP-processing + Salesforce tier-1 support + Microsoft Copilot-Dynamics as back-office case anchors. 60-day review for counter-evidence watch.
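The four-factor product named in the claim can be sketched directly. All inputs are hypothetical, and modelling the two qualitative factors as 0-1 multipliers is this sketch's own assumption, not the piece's.

```python
# Back-office ROI as a product of four factors (per the claim above).
# All inputs are hypothetical; spec_tightness and instrumentation are modelled
# as 0-1 multipliers on realisable savings, which is this sketch's assumption.
per_action_saving = 1.80       # EUR saved per automated action
actions_per_year  = 250_000    # action frequency (e.g. AP invoices triaged)
spec_tightness    = 0.9        # how tightly the task is specified
instrumentation   = 0.8        # how well the existing process is measured

annual_saving = per_action_saving * actions_per_year * spec_tightness * instrumentation
annual_cost   = 120_000        # hypothetical run cost
roi_pct = (annual_saving - annual_cost) / annual_cost * 100
print(f"annual saving EUR {annual_saving:,.0f}, ROI {roi_pct:.0f}%")
```

The multiplicative form is the structural point: a front-office workflow with low action frequency or loose task specification drags the whole product down, whatever the per-action saving.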

+54d · next review
Holding

AM-017 · pub 19 Jul 2025 · rev 19 Apr 2026

Agentic AI's durable enterprise pattern is redeployment-first, not replacement-first. The Salesforce Agentforce sequence — announce redeployment paths before automation ships, fund retraining from the automation budget, co-locate accountability — is the working template most enterprises are copying. Replacement-first announcements produce measurably worse adoption and sales-cycle outcomes.

Based on 2025-2026 public-case distribution: Salesforce/Microsoft/Google following redeployment-first pattern with positive signals, IBM-style replacement-first showing adoption drag. Stanford DEL 2026 + Gartner Q1 2026 as analytical anchors. 60-day review cadence because workforce-transition frames can shift quickly with any major public reversal.

+54d · next review
Holding

AM-013 · pub 19 Apr 2026 · rev 19 Apr 2026

Q1 2026 is the quarter enterprise agentic-AI crossed three thresholds simultaneously — the first at-scale in-the-wild exploits, the first vendor-shipped governance infrastructure, and the first hard ROI data — and programmes designed around only one will not make the 28% that pay off.

60-day cadence because the Gartner Q2 I&O update lands inside the window. Secondary interpretation (that Q1 governance frameworks are shaped by EU AI Act compliance requirements first and threat-model completeness second) is reviewable alongside the primary claim.

+54d · next review
Holding

AM-003 · pub 19 Apr 2026 · rev 19 Apr 2026

GPT-5 Pro's tiered-subscription model forces enterprises to classify problems by computational difficulty — the $200/month premium routing pays for itself only on the top decile of 'very hard' queries.

Claim created at publish; review in 30 days — pricing-tier claims are highly time-sensitive. Verify $200/month Pro tier availability and Claude Opus comparison pricing monthly.
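The decile claim rests on a one-line break-even test. In this sketch only the $200/month figure comes from the claim; the per-query value uplift and monthly volume are hypothetical.

```python
# When does a $200/month premium routing tier repay itself?
# Only the $200 figure comes from the claim; uplift and volume are placeholders.
tier_cost = 200.0              # USD/month, from the claim
value_uplift_per_query = 12.0  # hypothetical extra value on a 'very hard' query
hard_queries_per_month = 30    # hypothetical monthly volume in the top decile

net = hard_queries_per_month * value_uplift_per_query - tier_cost
print("repays" if net > 0 else "does not repay", f"(net ${net:+.0f}/month)")
```

The classification exercise is estimating the last two numbers per workload before subscribing, not after.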

+24d · next review
Holding

AM-002 · pub 19 Apr 2026 · rev 19 Apr 2026

Agentic AI's $3.50-per-dollar average return masks a 70% task-failure rate on the Carnegie Mellon benchmark; only narrowly scoped deployments clear the reality bar.

Claim created at publish; review in 60 days. Re-verify Carnegie Mellon agent-completion benchmark + IDC $3.50 ROI number against next round of publications.
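The 'masks' relationship is plain expected-value arithmetic. A minimal check, assuming (our simplification, not the benchmark's) that failed tasks return nothing:

```python
# How a $3.50 average return per dollar coexists with a 70% task-failure rate.
# Assumes failed tasks return $0 per dollar invested - an illustrative simplification.
avg_return   = 3.50   # USD returned per USD invested (cited average)
failure_rate = 0.70   # CMU benchmark task-failure rate

# avg = success_rate * return_on_success + failure_rate * 0
return_on_success = avg_return / (1 - failure_rate)
print(f"successful tasks must return ~${return_on_success:.2f} per dollar")
```

The average only holds if the 30% of tasks that complete return several times the mean, which is why narrow scoping (concentrating spend on tasks likely to complete) is the lever.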

+54d · next review
Holding

AM-001 · pub 19 Apr 2026 · rev 19 Apr 2026

70% of AI-implementation failure is people and process, not technology — cultural transformation is the strongest predictor of AI ROI at the 2024-2025 maturity stage.

Claim created at publish; review in 60 days. BCG + McKinsey 2024-2025 data; re-verify 70% people-process split against Q4 2026 McKinsey MGI update.

+54d · next review
Holding

AM-021 · pub 16 Aug 2025 · rev 19 Apr 2026

The 87% vs 27% success-rate gap between Six-Sigma and non-Six-Sigma organisations on agentic-AI deployments reflects pre-existing measurement discipline, not the DMAIC methodology itself. Agents require a clean baseline, defect definition, documented root-cause analysis, and a change-management gate — four conditions that ISO 9001, ITIL, SRE, or HACCP practices produce just as reliably.

Based on Gravitex 87%/27% split, LuckiWi's 82% of Fortune 100 using Six Sigma, Gartner's 7 Apr 2026 finding that 57% of failed I&O deployments cited 'too much too fast'. Claim reframes the causal arrow: the pre-built measurement environment is what matters, Six Sigma is one path that produces it.

+54d · next review
Partial

AM-015 · pub 1 Aug 2025 · rev 19 Apr 2026

An agentic-AI Center of Excellence justifies its overhead only after the organisation has three production agents running; before that, it over-governs an experimental footprint.

Backfilled claim. Body predates current editorial standard; spine holds, per-claim fact-check deferred to first review cycle.

+54d · next review
Holding

AM-022 · pub 6 Aug 2025 · rev 19 Apr 2026

The 171% average ROI on enterprise agentic-AI deployments is the mean of a bimodal distribution — roughly 12% of deployments clear 300%+ and 88% sit at or below break-even. The single factor distinguishing the clusters is not a multi-pattern framework; it is whether business-line (not IT) ownership held the kill-switch and accountability before the deployment shipped.

Based on Stanford DEL's 2026 playbook (51 deployments), OneReach 171% average + Futurum 71% median productivity vs 40% high-automation, Gartner's 28%-pay-off finding on the 88% side. Watches for benchmarks that show the distribution tightening around the mean or counter-evidence of IT-led 300%+ deployments.
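The bimodal decomposition is checkable arithmetic. Assuming the 88% cluster sits at exactly break-even (0% ROI, a simplification the claim implies but does not state), the implied top-cluster average follows:

```python
# If 88% of deployments sit at break-even (0% ROI, an assumed simplification),
# what must the 12% cluster average for the overall mean to be 171%?
mean_roi  = 171.0   # cited average ROI, %
top_share = 0.12    # share of deployments clearing 300%+

top_cluster_avg = mean_roi / top_share
print(f"implied top-cluster average: {top_cluster_avg:.0f}% ROI")  # well above the 300% bar
```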

+54d · next review
Holding

AM-019 · pub 1 Aug 2025 · rev 19 Apr 2026

Manufacturing deployments hitting the 30% unplanned-downtime-reduction benchmark share one architectural pattern — the agent writes its actions into the plant's existing MES/CMMS audit trail rather than a parallel log. Parallel-log deployments underperform by a factor of 2-3.

Based on the 2026 case-study spread (47-facility global manufacturer at 42% downtime reduction, pharma at 30% in six months, industry median 25-30%). Watching for a parallel-log deployment clearing 30% sustained over 12 months.

+54d · next review

Each claim links to the piece it came from and the review cadence Peter set when publishing it. How this works →