This piece was written by Claude (Anthropic). Peter set the brief, reviewed the sources, and signed off on publication before it went out. Why we work this way →
AM-027 · published 24 Apr 2026 · revised 24 Apr 2026 · 10 min read
Business Case & ROI

The CFO's agentic AI business case: TCO and ROI

Most agentic AI business cases fail audit. Three documents survive: TCO with named components, ROI with pre-deployment baseline, scenario-weighted NPV.

Holding · reviewed 24 Apr 2026 · next review +59d
Cover image: a structured finance-spreadsheet layout showing three stacked blocks — TCO (cost line items), ROI (baseline vs post-deployment), Scenarios (conservative/base/optimistic) — with an NPV band in the right column. Footer: Three documents. One audit-survivable case.

Most agentic AI business cases don’t survive audit. A vendor estimate gets dropped into a single-scenario NPV, the investment committee approves, and eighteen months later the actual spend is 2.3× the original number and the productivity lift is a vendor-supplied talking point with no pre-deployment baseline. Gartner’s 2025 projection that 40%+ of agentic AI projects will be cancelled by end of 2027 is the aggregate read-out of that dynamic.

The 12% of deployments that Stanford Digital Economy Lab’s 2026 data shows clearing 300%+ ROI run a different operating model. Their business cases contain three specific documents — a Total Cost of Ownership model with named components, an ROI model with a pre-deployment measured baseline, and a three-scenario risk-adjusted NPV. None of those are novel financial techniques. What separates the 12% from the 88% is discipline about filling them in honestly before contract signature, not after.

This piece walks through the three documents, gives per-component guidance, lists the specific anti-patterns that cause business cases to fail audit, and points to a downloadable Excel template with every line item pre-structured and the NPV computation pre-wired. It complements the CFO’s guide to true TCO and ROI modeling, which the audience for this piece has likely already read and acted on against the GAUGE ROI-evidence dimension.

Why most agentic AI business cases fail audit

Three recurring failure modes show up when an 18-month-old agentic AI deployment is re-reviewed against its original business case:

The TCO model was built from the vendor’s line items. Vendor-supplied TCO worksheets are calibrated to make the licensing fee look like the dominant cost, because licensing is what the vendor charges. They systematically underweight the other nine line items — implementation cost, change management, internal staff time, integration work, ongoing operations, security review, compliance evidence, observability tooling, and decommissioning reserves. McKinsey’s State of AI 2025 data, which shows only 6% of enterprises falling into the “AI high performer” bucket (attributing more than 5% of EBIT to AI), corroborates that gap: the high performers are the enterprises whose TCO model was built internally, with vendor licensing as one of ten columns rather than the spine.

The ROI claim has no pre-deployment baseline. A post-deployment statement like “our support agents save 5,000 hours per week” has two structural problems. First, the baseline (how many hours the equivalent human process cost) isn’t measured, only estimated. Second, the “saved hours” figure often counts end-user-initiated automations that would have happened anyway (e.g., users searching a knowledge base, which doesn’t require an agent). Under audit, the 5,000-hour figure compresses to 2,000 hours of genuine displacement. That’s still meaningful; it’s not 171%+ ROI.

The NPV is a single scenario, not a risk-adjusted distribution. Single-scenario NPV implicitly assumes vendor pricing stable, regulatory posture stable, threat surface stable, adoption linear. None of those hold in enterprise agentic AI in 2026 — aggressive consolidation is underway, compliance requirements are shifting every quarter, the attack surface is expanding via new exploit classes, and adoption varies wildly by target-user cohort. A scenario-free NPV is not a forecast; it is a marketing number.

The template below replaces each of those failures with a specific structural discipline.

Document 1 — the TCO model

A durable TCO model covers ten cost categories across a three-to-five-year horizon. Vendor licensing is one of ten; in well-governed deployments it is rarely the largest.

The ten categories, in the order a finance team typically encounters them (a minimal roll-up sketch follows the list):

  1. Platform licensing — the vendor’s stated list price per seat or per call, with volume tiering, pre-committed minimums, and the contract’s rate-change provisions made explicit.
  2. Foundation-model pass-through — if the platform bills LLM token usage separately or marks it up, this is a distinct line. Treat as a variable cost tied to adoption volume; do not roll into licensing.
  3. Infrastructure — compute, vector-store, observability tooling, log retention. Scales with adoption.
  4. Implementation — system integrator hours, internal engineering time, prompt engineering, evaluation infrastructure. One-time plus ongoing for each new agent.
  5. Change management — training content, delivery hours, LMS integration, adoption measurement. A frequently omitted category; often represents 15–25% of true year-one TCO.
  6. Security review — threat modeling per deployment (the MTTD-for-Agents layer), penetration-testing, red-team engagements, incident-response tooling.
  7. Compliance evidence — documentation, third-party assessments, audit-trail retention infrastructure. Rising steeply under the EU AI Act and NIS2 for applicable sectors.
  8. Ongoing operations — monitoring, tuning, escalation handling, prompt-library curation, model-version migration when the vendor deprecates.
  9. Vendor-lock-in reserve — a provision for the engineering cost to switch vendors if the contract’s exit triggers fire. Finance teams comfortable with cybersecurity reserves should be equally comfortable with this one. 10–15% of three-year platform-plus-foundation-model spend is a reasonable order of magnitude.
  10. Decommissioning — when the agent is retired, data-export, migration to a successor system, closeout documentation. Often zero in year one; non-trivial by year three.
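
As a minimal sketch of how the ten categories roll up over a three-year horizon, in Python for concreteness. Every figure is an illustrative placeholder rather than a benchmark, and the lock-in reserve is computed at 12% of three-year platform-plus-foundation-model spend per category 9:

```python
# Ten-category TCO roll-up over a three-year horizon.
# All figures are illustrative placeholders, not benchmarks.
tco = {  # category -> [year 1, year 2, year 3] cost
    "platform_licensing":  [400_000, 420_000, 440_000],
    "fm_pass_through":     [120_000, 180_000, 220_000],  # variable, tracks adoption
    "infrastructure":      [ 80_000, 100_000, 110_000],
    "implementation":      [350_000,  60_000,  60_000],  # one-time plus per new agent
    "change_management":   [250_000,  90_000,  60_000],  # often 15-25% of year one
    "security_review":     [100_000,  80_000,  80_000],
    "compliance_evidence": [ 60_000, 120_000, 140_000],  # rising, not flat
    "ongoing_operations":  [ 70_000, 140_000, 150_000],
    "decommissioning":     [      0,       0,  90_000],  # non-trivial by year three
}

# Category 9: lock-in reserve at 10-15% of three-year platform +
# foundation-model spend; 12% spread evenly here, one choice among several.
platform_fm = sum(tco["platform_licensing"]) + sum(tco["fm_pass_through"])
tco["lockin_reserve"] = [0.12 * platform_fm / 3] * 3

yearly = [sum(costs[year] for costs in tco.values()) for year in range(3)]
print("Year totals:", [f"{t:,.0f}" for t in yearly])
print("Three-year TCO:", f"{sum(yearly):,.0f}")
```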

Categories that should not appear as separate TCO line items because they are usually miscounted: “productivity gain offset” (that belongs in ROI, not TCO), “reduced headcount” (same — it is an ROI output, not a cost), and “free from vendor as part of negotiation” (negotiated credits belong in the TCO base with a footnote, not outside it).

Document 2 — the ROI model

The ROI model has a pre-deployment baseline, a documented measurement method, and an independent validation round. Absent any of the three, the ROI number is a forward-looking statement of vendor confidence, not a finance-survivable projection.

The pre-deployment baseline. For a specific named business process that the agent will touch, measure — before the agent is deployed — the current state: volume (per week or per month), time per unit of work, error rate, customer- or employee-satisfaction index, escalation rate. Two weeks of measurement is usually enough; six weeks is robust. Without this, there is nothing to subtract the post-deployment state from.
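
A sketch of the record that baseline produces, in Python for concreteness; the field names and figures are hypothetical, and the only load-bearing point is that every value is measured inside a dated window before deployment:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ProcessBaseline:
    """Pre-deployment current-state measurement for one named process."""
    process: str
    window_start: date         # measurement window: two weeks minimum,
    window_end: date           # six weeks is robust
    weekly_volume: float       # units of work per week
    minutes_per_unit: float    # measured, not estimated
    error_rate: float          # fraction of units reworked
    satisfaction_index: float  # CSAT/ESAT on whatever scale is already in use
    escalation_rate: float     # fraction escalated to a human specialist

# Hypothetical example record for a tier-1 support triage process.
baseline = ProcessBaseline(
    process="tier-1 support triage",
    window_start=date(2026, 3, 2),
    window_end=date(2026, 3, 15),
    weekly_volume=4_200,
    minutes_per_unit=11.5,
    error_rate=0.06,
    satisfaction_index=4.1,
    escalation_rate=0.18,
)
```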

The measurement method. Documented as a one-pager: what is measured, who measures it, at what cadence, against what control group (or ideally an A/B holdout if the process supports it). Crucially, the method is set before deployment, not reverse-engineered to fit whatever number comes out. Reverse-engineered measurement methods are the single most common reason agentic AI ROI claims fail audit.

The independent validation. At least one validation round by someone outside the deploying team — internal audit, an external consultancy, a different business unit’s finance partner. The validation doesn’t need to be adversarial; it just has to be done by someone whose incentives are not aligned to produce a favourable number. Per the GAUGE ROI-evidence dimension, this step is what moves a deployment from a score of 3 to a score of 5 on that axis.

The output of the ROI model is a distribution of net-benefit estimates, not a point number. Enterprise finance organisations routinely handle scenario analysis for capital projects; agentic AI should not be an exception.
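
One minimal way to produce that distribution is a Monte Carlo pass over the uncertain inputs. A sketch, with every range, rate, and parameter assumed for illustration rather than sourced:

```python
import random

random.seed(7)  # reproducible illustration

HOURLY_COST = 55.0   # fully loaded cost per hour (assumed)
WEEKS_PER_YEAR = 48

def annual_net_benefit() -> float:
    # Each uncertain input gets a range, not a point estimate.
    hours_saved_per_week = random.triangular(1_500, 5_000, 2_500)
    genuine_displacement = random.uniform(0.4, 0.8)  # strips work that would
                                                     # have happened anyway
    bankable_fraction = random.uniform(0.5, 0.9)     # reconciled to cost avoided
                                                     # or revenue booked
    gross = hours_saved_per_week * WEEKS_PER_YEAR * HOURLY_COST
    return gross * genuine_displacement * bankable_fraction

samples = sorted(annual_net_benefit() for _ in range(10_000))
p10, p50, p90 = (samples[int(len(samples) * q)] for q in (0.10, 0.50, 0.90))
print(f"Annual net benefit: P10 {p10:,.0f} | P50 {p50:,.0f} | P90 {p90:,.0f}")
```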

Document 3 — scenarios and risk-adjusted NPV

Three scenarios over the TCO horizon (typically three years):

  • Conservative. Adoption at 40% of the initial target user population by month 12; vendor pricing escalating 15% annually; one material incident requiring a one-off expense of 5–10% of annual TCO; one unscheduled compliance documentation round in year two.
  • Base. Adoption at 70% of target by month 12; vendor pricing per contract; no material incidents; scheduled compliance refreshes on cadence.
  • Optimistic. Adoption at 90% of target by month 9; vendor pricing flat through year three (not common); no incidents; additional adjacent use cases captured mid-horizon.

Probability-weight the three scenarios (typical starting weights: 35% / 50% / 15%, though the weights should be calibrated to the enterprise’s observed project-execution history). The weighted NPV is the number the investment committee sees. The scenario-specific NPVs are what they debate.
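
A minimal sketch of the computation, assuming a 10% discount rate and illustrative net cash flows per scenario; both the rate and the figures are placeholders to be replaced with the enterprise's own numbers:

```python
# Three-scenario risk-adjusted NPV. Weights, cash flows, and the
# discount rate are illustrative placeholders.
DISCOUNT_RATE = 0.10

def npv(cash_flows, rate=DISCOUNT_RATE):
    """NPV of year-end net cash flows, year 1 onward."""
    return sum(cf / (1 + rate) ** (t + 1) for t, cf in enumerate(cash_flows))

scenarios = {  # name -> (probability weight, net benefit minus TCO per year)
    "conservative": (0.35, [-900_000, -150_000,   300_000]),
    "base":         (0.50, [-500_000,  400_000,   900_000]),
    "optimistic":   (0.15, [-300_000,  900_000, 1_600_000]),
}

assert abs(sum(w for w, _ in scenarios.values()) - 1.0) < 1e-9  # weights sum to 1

for name, (weight, flows) in scenarios.items():
    print(f"{name:>12}: NPV {npv(flows):>12,.0f}  (weight {weight:.0%})")

weighted_npv = sum(w * npv(flows) for w, flows in scenarios.values())
print(f"{'weighted':>12}: NPV {weighted_npv:>12,.0f}")
```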

A business case that presents only the optimistic NPV — a depressingly common pattern — is not a finance document; it is a marketing document asking finance for a signature. Committee members who have been burned by this once in their careers recognise the pattern immediately. The three-scenario model is the minimal tool that surfaces the optimism bias before approval, not after.

The four anti-patterns that kill business cases

In addition to the three failure modes opening this piece, four anti-patterns show up repeatedly in post-18-month business-case audits:

Anti-pattern 1 — “Productivity gains” that can’t be banked. Reported hours saved that do not translate to headcount actions, redeployment, or output increases. Finance-survivable ROI numbers are reconciled either to cost avoidance that actually avoided cost (documented) or to revenue increase that actually landed (booked). Soft productivity gains are a leading indicator, not a banked return.

Anti-pattern 2 — Adoption projections that ignore cold-pocket cohorts. Aggregate adoption numbers hide the cohort pattern — whether adoption is 70% across all target users or 95% in one cohort and 30% in another. The cold-pocket cohort usually matters more, because that’s where change management is failing and where the ROI assumptions are weakest.

Anti-pattern 3 — Single-vendor exit assumptions. The base-case assumes the vendor will still be the vendor in year three at the contracted rate. In a market with 71 vendor-claim events this quarter alone and aggressive M&A, single-vendor-forever is a planning bias, not a forecast.

Anti-pattern 4 — Compliance cost understated. The compliance-posture dimension of GAUGE is the one that moves most predictably against the enterprise over time. The NIST AI Risk Management Framework and its Generative AI Profile are revised on a roughly 18-month cadence; each revision adds functions that translate into documentation work. The EU AI Act high-risk obligations, GDPR breach-notification requirements, NIS2 incident-reporting obligations, and sector-specific frameworks (DORA for financial services, HIPAA for healthcare) are adding requirements, not removing them. A business case that assumes compliance costs stable across a three-year horizon is systematically understating TCO.

How to present to the investment committee

The business-case presentation, in the order the committee will actually process it:

  1. One-page summary — the weighted NPV, the scenario distribution, the top three risk drivers, the recommendation.
  2. TCO walkthrough — the ten-line-item table for year one, year two, year three. Flag any line item where the number was sourced from the vendor rather than measured internally.
  3. ROI walkthrough — baseline measurement summary, measurement method one-pager, independent validation status, net-benefit distribution.
  4. Scenarios — the three scenarios side-by-side, with the probability weights and the reasoning behind them.
  5. GAUGE score — the deployment’s current and projected GAUGE scores per the framework. Committees responsible for a portfolio of AI investments benefit from a comparable governance signal across projects.
  6. Decision ask — approve, approve with conditions (name the conditions), decline.

Committee meetings that follow this structure take 45–60 minutes per business case and produce decisions the committee can defend to audit 18 months later. Committee meetings that skip steps 2–5 and jump straight from step 1 to step 6 produce decisions that look efficient and don’t survive the re-review cycle.

Download · the CFO business case Excel

Holding-up note

The primary claim of this piece — that a durable agentic AI business case requires three specific documents (TCO, ROI with pre-deployment baseline, three-scenario NPV), and that the anti-patterns above are the recurring causes of audit failure — is on a 60-day review cadence. Three kinds of evidence would move the verdict:

  • A large-scale study showing that agentic AI investment committees using single-scenario NPVs produce outcomes indistinguishable from those using three-scenario NPVs. Would weaken the central claim significantly.
  • Aggregate post-18-month business-case audits (from analyst firms or internal-audit communities) showing the anti-patterns listed here rank differently in practice — e.g., compliance understatement is the dominant failure, not vendor-TCO framing. Would force a reordering of the anti-pattern list without changing the three-document framing.
  • Regulatory changes (EU AI Act review cycles, NIST AI RMF updates) that materially shift the compliance-cost category’s dynamics. Would require a re-measurement of the anti-pattern 4 framing.

If any land, the Holding-up record for AM-027 captures what changed, dated. Original claim stays visible. Nothing is quietly removed.


Spotted an error? See corrections policy →

Part of the pillar

Enterprise AI cost and ROI

Verifying, tracking, and challenging the ROI claims vendors and analysts make about enterprise agentic AI. 8 other pieces in this pillar.
