What is the agentic AI discovery phase actually for?

Discovery is the period upstream of any vendor evaluation or procurement decision in which the procuring enterprise tests its own readiness to operate an agentic AI deployment, rather than testing the technology itself. The output of discovery is a binary readiness decision: clear the four upstream tests and the organisation is ready to enter procurement on a specific candidate workflow, or do not clear them and the productive next step is to fix the gaps before any vendor conversation begins. The most common 2026 enterprise discovery error is treating it as a vendor-evaluation sprint to a go-decision, which collapses two distinct decisions (organisational readiness and vendor selection) into one and makes both worse.

What are the four upstream tests that determine whether an organisation should proceed past discovery?

(1) Definitional clarity across the senior team: does the executive layer share an operational definition of an AI agent that distinguishes it from chatbot, RPA, and generative AI tools? Without this, the procurement deck and the operating reality drift apart.

(2) A named operational candidate workflow with measured baseline and named owner: is there one specific workflow with a documented failure mode and an accountable owner that the deployment will own end-to-end? Without this, the procurement decision is made on vendor-supplied estimates rather than on the procuring enterprise's own evidence.

(3) Threat-model literacy: does the security team understand the cross-agent prompt-injection class and the browser-resident agent class before the vendor demo, and can it articulate the compensating-control burden the deployment will require? Without this, the procurement decision inherits a residual risk the operator cannot defend at audit.

(4) Workforce readiness: is the BCG 14% frontline AI-upskilling access gap being closed, or will the deployment land in a workforce that has not been prepared for the operational change? Without this, adoption stalls regardless of vendor or capability.

What does the Gartner January 2025 19/42/31/8 distribution actually describe?

Gartner polled 3,412 executives in January 2025 about their organisation's agentic AI investment posture. 19% reported having made significant investments; 42% reported conservative investments; 31% reported being in 'wait-and-see' mode; 8% reported no investments. The procurement-deck reading: the 19% + 42% = 61% engaged cohort is heterogeneous on readiness; some have cleared the four upstream tests, many have not, and the AM-030 distinction between 'experimenting' and 'scaling' applies to that 61% downstream of discovery. The 31% + 8% = 39% non-engaged cohort is not necessarily failing discovery; in many cases it is correctly identifying that the upstream tests are not yet cleared and that proceeding to procurement would land the organisation in the McKinsey 39% experimenting cohort or the 38% deployed-and-stopped cohort. 'Not yet' is a defensible discovery-phase outcome.

What is 'agent washing' and how does it shape the discovery phase?

Gartner's June 2025 release on the agentic AI cancellation projection includes a warning about 'agent washing': vendors rebranding existing products (RPA, chatbots, scripted automation) as 'agentic' without the autonomous-action capability that defines an agent in the analytical sense. IBM's Maryam Ashoori frames the definitional anchor: 'The true definition of an AI agent is an intelligent entity with reasoning and planning capabilities that can autonomously take action.' Vanderbilt's Jules White, on the Coursera Agentic AI for Leaders course, gives the operational version: agents take action (updating CRMs, scheduling meetings, executing trades); they do not just generate, they do. The discovery-phase implication is that vendor demos which fail this test (no live demonstration of autonomous action; capability described in the slide deck but absent from the live demo; reliance on constant human oversight at every step) belong outside the procurement evaluation regardless of how the product is labelled.

What is the right discovery-phase outcome for an organisation that does not clear the four tests?

The right outcome is to fix the upstream gap before re-entering discovery, not to proceed to procurement on a hope that the vendor will close the gap during deployment. A senior team without definitional clarity should run an internal workshop using the IBM and Vanderbilt anchors before any vendor conversation. An organisation without a named operational candidate workflow should select one and instrument the baseline before the procurement decision; the [CIO playbook five operational characteristics](/the-cios-playbook-orchestrating-human-ai-teams-that-actually-want-to-work-together/) describe what the candidate-workflow definition needs to include. An organisation without threat-model literacy should run an internal session on [the cross-agent class](/agentflayer-attack-why-chatgpt-copilot-6-major-ai-platforms-are-being-hacked-right-now/) and [the browser-resident class](/anthropics-claude-for-chrome-changes-everything-what-business-leaders-need-to-know-now/) before the security team's first vendor evaluation. An organisation with the BCG 14% workforce-access gap unaddressed should run [the workforce-readiness analysis](/the-56-solution-how-workers-are-turning-ai-anxiety-into-career-gold/) before authorising deployment scope. The discovery phase is the right place to surface and fix these gaps; procurement is not.

Agentic AI discovery: 4 tests upstream of procurement

At a glance

Claim

The agentic AI discovery phase upstream of procurement is not a vendor-evaluation sprint to a go-decision; it is an organisational-readiness test where the deciding question is whether the procuring enterprise can clear four upstream tests (definitional clarity across the senior team, a named operational candidate workflow with measured baseline and named owner, threat-model literacy on the cross-agent and browser-resident classes, and workforce-readiness against the BCG access gap) before any vendor conversation. Gartner's January 2025 poll of 3,412 executives (19% significant, 42% conservative, 31% wait-and-see, 8% no investment) describes the phase distribution; the 39% in 'wait-and-see' or 'no investment' postures are not failing discovery but correctly identifying that the upstream tests are not yet cleared.

Supporting figure

McKinsey's 'Seizing the agentic AI advantage' research describes a $2.7 trillion paradox: 80% of companies report using generative AI but without measurable bottom-line impact. Gartner's June 2025 prediction projects 40%+ of agentic AI projects will be cancelled by end of 2027. Gartner's January 2025 executive poll (n=3,412) places organisations in four discovery-phase postures: 19% have made significant agentic AI investments, 42% have made conservative investments, 31% remain in 'wait-and-see' mode, and 8% have made no investments. The discovery-phase question is not whether to proceed but whether the four upstream tests are cleared; the 39% in 'wait-and-see' or 'no investment' are not failing discovery but correctly identifying that the upstream conditions are not yet met.

Date

07 May 2026

Verdict

Holding (AM-004)

Next review

06 Jul 2026 (+18d)

Bottom line. McKinsey’s ‘Seizing the agentic AI advantage’ research describes a $2.7 trillion paradox: 80% of companies use generative AI but report no bottom-line impact (McKinsey). Gartner projects 40%+ of agentic AI projects will be cancelled by end of 2027 (Gartner, 25 June 2025). Gartner’s January 2025 poll (n=3,412 executives) places organisations in four postures: 19% significant investment, 42% conservative, 31% wait-and-see, 8% none. The discovery phase upstream of procurement is an organisational-readiness test, not a vendor sprint. Four upstream tests determine whether the procuring enterprise should proceed at all, and ‘not yet’ is a defensible outcome for a meaningful share of organisations.

McKinsey’s research on agentic AI value capture surfaces the central paradox: 80% of companies report using generative AI, but the same population reports no measurable bottom-line impact at the firm level (McKinsey, “Seizing the agentic AI advantage”). The paradox sits against the $2.7 trillion of potential value that McKinsey identifies elsewhere in the same research thread. Gartner’s June 2025 forecast projects that 40%+ of agentic AI projects will be cancelled by end of 2027 (Gartner, 25 June 2025). The two numbers describe the same operational shape from opposite directions: many organisations engaged with agentic AI are not capturing value, and many that are mid-deployment will not finish.

This piece reads the discovery phase upstream of those outcomes. Discovery is the period in which the procuring enterprise decides whether to engage at all, evaluates its own readiness rather than the vendor’s product, and either proceeds to a specific procurement decision or returns to fix upstream gaps. The most common 2026 enterprise discovery error is treating discovery as a vendor-evaluation sprint to a go-decision; doing so collapses two distinct decisions (organisational readiness, vendor selection) into one and makes both worse.

What discovery actually has to test

IBM’s Maryam Ashoori frames the definitional anchor: an AI agent is an intelligent entity with reasoning and planning capabilities that can autonomously take action (IBM, 2025 expectations vs reality). Vanderbilt’s Jules White, teaching the Coursera “Agentic AI for Leaders” course, gives the operational version: agents take action (updating CRMs, scheduling meetings, executing trades), they do not just generate, they do.

The definitional anchor matters because it bounds what discovery is testing. The procuring enterprise is not testing whether large language models work; it is testing whether the organisation is ready to operate an autonomous-action system in production. The questions are different, the failure modes are different, and the procurement decisions that flow from each are different.

Four upstream tests determine readiness. They are organisational rather than technological, and they are answerable in writing before any vendor conversation begins.
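The binary shape of the readiness decision can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the class name, field names, and the "proceed" / "not yet" labels are assumptions introduced here, with one flag per upstream test.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReadinessTests:
    """One flag per upstream test; names are illustrative, not from any standard."""
    definitional_clarity: bool   # Test 1: shared written definition of an agent
    named_workflow: bool         # Test 2: one workflow, measured baseline, named owner
    threat_model_literacy: bool  # Test 3: cross-agent and browser-resident classes
    workforce_readiness: bool    # Test 4: upskilling access for the operating workforce


def discovery_outcome(tests: ReadinessTests) -> tuple[str, list[str]]:
    """Return ("proceed", []) only when every test is cleared, else ("not yet", gaps)."""
    gaps = [f.name for f in fields(tests) if not getattr(tests, f.name)]
    return ("proceed", []) if not gaps else ("not yet", gaps)


# Hypothetical organisation that has cleared three of the four tests.
example = ReadinessTests(
    definitional_clarity=True,
    named_workflow=True,
    threat_model_literacy=False,
    workforce_readiness=True,
)
outcome, gaps = discovery_outcome(example)
print(outcome, gaps)  # not yet ['threat_model_literacy']
```

The gate is deliberately all-or-nothing: a single uncleared test returns the organisation to remediation, and the list of gaps names what to fix before re-entering discovery.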

Test 1: Definitional clarity across the senior team

The first test is whether the executive layer shares an operational definition of what an agent is, distinct from chatbot, RPA, and generative AI tooling. Discovery sessions where the CFO is operating from one definition and the CTO from another produce procurement decisions that look aligned in the deck but diverge in operating reality.

The IBM and Vanderbilt anchors are the working definitions: autonomous action, reasoning, planning, end-to-end workflow ownership. Gartner’s June 2025 release named the parallel risk explicitly under the label “agent washing”: vendors rebranding RPA or chatbot products as “agentic” without the autonomous-action capability that defines an agent in the analytical sense (Gartner, 25 June 2025). Discovery without definitional clarity at the senior-team level is vulnerable by default to agent-washed product positioning at procurement.

The remediation is internal and short. A two-hour senior-team workshop using the IBM and Vanderbilt anchors, with a written one-page operational definition signed by the executive layer, is the deliverable. It is not visible to vendors, it does not require external consultants, and it is the cheapest insurance against the McKinsey-paradox cohort that procures agentic AI but does not capture value because the senior team was not aligned on what was being procured.

Test 2: A named operational candidate workflow with measured baseline and named owner

The second test is whether there is one specific workflow with a documented failure mode and an accountable owner that the deployment will own end-to-end. The procurement-decision question raised by the CFO at scale-up review (what is the net benefit of scaling this from 50 users to 5,000) is unanswerable without this; the AM-140 procurement-committee question 1 on baseline and the AM-010 first operational characteristic on CFO-defensible measurement both depend on the discovery-phase work.

The named-workflow test fails when the discovery output is “we want to use agentic AI” rather than “we want this specific workflow, currently owned by this named team, currently failing in this specific way, instrumented with this baseline metric, to be operated by an agentic deployment with this named owner.” The first is a procurement intent; the second is a procurement question that vendors can actually answer.

The remediation is also internal. Pick the workflow before the first vendor demo, instrument the baseline for 4–6 weeks, and name the owner with a reporting line on the org chart. The procuring enterprise that arrives at the first vendor demo with this work done evaluates the vendors against its own measurement; the procuring enterprise that arrives without it evaluates vendors against vendor-supplied numbers and inherits all the methodology weaknesses the CFO’s business case piece catalogues.
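One way to make the difference between a procurement intent and a procurement question concrete is to write the candidate workflow down as a record and check what is still missing. The sketch below assumes hypothetical field names and a hypothetical example workflow; the four-week minimum is taken from the lower end of the 4–6 week baseline window above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateWorkflow:
    """Hypothetical record of the Test 2 output; field names are illustrative."""
    workflow: Optional[str] = None          # the specific workflow
    current_team: Optional[str] = None      # team that owns it today
    failure_mode: Optional[str] = None      # how it is currently failing
    baseline_metric: Optional[str] = None   # what is being measured
    baseline_weeks: int = 0                 # weeks of baseline collected so far
    deployment_owner: Optional[str] = None  # named owner with a reporting line


def missing_items(c: CandidateWorkflow, min_weeks: int = 4) -> list[str]:
    """List what is still missing before the first vendor demo."""
    required = ("workflow", "current_team", "failure_mode",
                "baseline_metric", "deployment_owner")
    missing = [name for name in required if not getattr(c, name)]
    if c.baseline_weeks < min_weeks:
        missing.append(f"baseline_weeks (have {c.baseline_weeks}, need {min_weeks})")
    return missing


# "We want to use agentic AI": an intent, with nothing filled in.
intent_only = CandidateWorkflow()

# A fully specified candidate; every value here is invented for illustration.
specified = CandidateWorkflow(
    workflow="invoice exception handling",
    current_team="accounts payable operations",
    failure_mode="exceptions queue for days before triage",
    baseline_metric="median hours from exception to resolution",
    baseline_weeks=5,
    deployment_owner="head of finance operations",
)

print(len(missing_items(intent_only)))  # 6
print(missing_items(specified))         # []
```

An empty list is the signal that the enterprise can evaluate vendors against its own measurement rather than vendor-supplied numbers.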

Test 3: Threat-model literacy on the agentic AI failure classes

The third test is whether the security team understands the agentic AI failure classes before the vendor demo. Two classes matter specifically: the cross-agent prompt-injection class (AgentFlayer, EchoLeak / CVE-2025-32711, covered in AM-007) and the browser-resident agent class (Anthropic’s Claude for Chrome disclosure with the 23.6% / 11.2% / 0% attack-success rates, covered in AM-009).

A security team without literacy on these classes evaluates vendors against an outdated threat model. The deploying enterprise inherits the residual cross-agent and browser-resident exposure regardless of vendor classification; under EU AI Act Article 9, NIS2 Article 21, and DORA where applicable, the operator owns the residual risk. Discovery without threat-model literacy means the procurement decision will be made by a team that cannot articulate the compensating-control burden the deployment will require.

The remediation is one internal session per class, walking through the published research (Zenity Labs on AgentFlayer, NVD on CVE-2025-32711, Anthropic’s published security disclosure on Claude for Chrome, Brave Software’s research on Comet). The five questions in AM-007 and the five questions in AM-009 are the procurement-deck artefacts the security team produces from those sessions.

Test 4: Workforce readiness against the BCG access gap

The fourth test is whether the workforce population that will operate the deployment has the AI-upskilling access the deployment requires. Boston Consulting Group’s October 2024 study reports a 14% frontline-worker vs 44% leader gap in AI upskilling access (BCG, 24 October 2024). The same study finds 74% of companies struggle to achieve and scale AI value at the firm level, a separate signal pointing at the same operational reality.

The Atlanta Fed wage-premium analysis reads the BCG number from the labour-market angle. The deploying enterprise reads it from the deployment-success angle: agentic deployments land on a workforce, and the workforce’s capability to operate alongside the agent is the difference between adoption and rejection regardless of vendor capability.

Dialpad’s CSuite report frames the parallel data-readiness question: 91% of companies lack sufficient data quality for agentic AI deployments at scale, while only 6% have begun meaningful workforce preparation (Dialpad CSuite Report). The 91% data-quality gap is a separate procurement-deck issue; the 6% workforce-preparation figure is the discovery-phase test for this fourth condition.

The remediation requires a budget commitment and a timeline, not a single session. Identify the workforce population the deployment will reach, score it against the BCG 14% baseline, and either commit to closing the gap before deployment scope expands or scope the deployment to the population whose access is already in place.
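The scoring step can be sketched as a simple access-rate calculation per workforce population. This is an illustration under stated assumptions: the populations and headcounts are invented, and the 50% access bar is a placeholder for whatever threshold the organisation sets for itself; only the 14% frontline figure comes from the BCG study cited above.

```python
# Frontline upskilling-access figure quoted in the text from the BCG study.
BCG_FRONTLINE_ACCESS = 0.14

# Assumption: the organisation sets its own access bar for deployment scope.
REQUIRED_ACCESS = 0.50


def access_rate(with_access: int, headcount: int) -> float:
    """Share of a population that already has AI-upskilling access."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return with_access / headcount


def scope_decision(rate: float, required: float = REQUIRED_ACCESS) -> str:
    """Either scope the population in now or close the gap first."""
    return "in scope" if rate >= required else "close gap first"


# Hypothetical populations: (people with access, total headcount).
populations = {
    "claims_handlers": (18, 200),
    "team_leads": (21, 40),
}

for name, (with_access, headcount) in populations.items():
    rate = access_rate(with_access, headcount)
    versus_bcg = "above" if rate > BCG_FRONTLINE_ACCESS else "at or below"
    print(f"{name}: {rate:.1%} access, {versus_bcg} the BCG frontline figure, "
          f"{scope_decision(rate)}")
```

The two branches of `scope_decision` correspond to the two options in the remediation: commit to closing the gap, or limit deployment scope to the population whose access is already in place.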

What the Gartner January 2025 distribution actually tells discovery

Gartner’s January 2025 poll of 3,412 executives places organisations in four postures: 19% have made significant agentic AI investments, 42% have made conservative investments, 31% remain in ‘wait-and-see’ mode, and 8% have made no investments (Gartner).

The procurement-deck misreading of this distribution is to treat the 31% + 8% = 39% non-engaged cohort as a failure of discovery. The opposite is closer to the truth. A meaningful share of that 39% is correctly identifying that the four upstream tests are not yet cleared and that proceeding to procurement would land the organisation in the McKinsey 39% experimenting cohort or the 38% deployed-and-stopped cohort that the McKinsey 23% piece describes.

“Not yet” is a defensible discovery-phase outcome. The 31% wait-and-see population that pairs the wait with active work on the four upstream tests will move to engagement on a stronger footing than the 19% who proceeded to significant investment without that work. Gartner’s 40%+ cancellation projection by end of 2027 is the downstream consequence of mixing those two paths inside the engaged 61%.
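The cohort arithmetic used in this section is small enough to check directly. The snippet below uses only the four shares quoted from the Gartner poll; the dictionary keys are labels chosen here for readability.

```python
# Gartner January 2025 poll shares as quoted in the text (percent of 3,412 executives).
posture = {
    "significant": 19,
    "conservative": 42,
    "wait_and_see": 31,
    "no_investment": 8,
}

# The four postures partition the sample.
assert sum(posture.values()) == 100

engaged = posture["significant"] + posture["conservative"]
non_engaged = posture["wait_and_see"] + posture["no_investment"]
print(f"engaged: {engaged}%, non-engaged: {non_engaged}%")  # engaged: 61%, non-engaged: 39%
```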

What “agent washing” actually looks like and why discovery has to filter for it

Gartner’s June 2025 release named four agent-washing patterns: products with no live autonomous-action demonstration; products whose autonomy claims are vague and not exercised in the demo; products requiring constant human oversight at every step (which is RPA, not agentic); and products that are rebranded chatbots or RPA tooling.

Discovery-phase filtering against these patterns is operational. Require the live demonstration of the autonomous-action capability against the procuring enterprise’s named candidate workflow (test 2), with the procuring enterprise’s threat-model literacy (test 3) shaping which capabilities the security team allows in scope. The agent-washed product fails the live-demo test by definition; the legitimately agentic product passes the live-demo test and then has to face the cross-agent and browser-resident class questions.

The remediation here is procedural rather than analytical. Build the agent-washing filter into the discovery-phase vendor-screening protocol so the procurement evaluation never reaches the formal stage on a non-agentic product. The cost is one round of additional rigour in the early vendor screen; the benefit is preventing the procurement decision from being made on a product that does not meet the analytical definition.
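A screening protocol of this kind could be recorded as one flag per pattern, with any raised flag keeping the product out of formal evaluation. The sketch below is hypothetical: the record fields, vendor names, and flag wording are introduced here for illustration and mirror the four patterns listed above.

```python
from dataclasses import dataclass


@dataclass
class VendorDemo:
    """Hypothetical screening record; one field per pattern named in the text."""
    vendor: str
    live_autonomous_action_shown: bool     # demonstrated live on the candidate workflow
    autonomy_claims_exercised: bool        # claims shown in the demo, not only in slides
    needs_human_step_at_every_stage: bool  # constant oversight at every step
    is_rebranded_chatbot_or_rpa: bool      # existing product relabelled as agentic


def agent_washing_flags(demo: VendorDemo) -> list[str]:
    """Return the agent-washing patterns this demo exhibits."""
    flags = []
    if not demo.live_autonomous_action_shown:
        flags.append("no live autonomous-action demonstration")
    if not demo.autonomy_claims_exercised:
        flags.append("autonomy claims not exercised in the demo")
    if demo.needs_human_step_at_every_stage:
        flags.append("constant human oversight at every step")
    if demo.is_rebranded_chatbot_or_rpa:
        flags.append("rebranded chatbot or RPA tooling")
    return flags


def passes_screen(demo: VendorDemo) -> bool:
    """A product reaches formal evaluation only with no flags raised."""
    return not agent_washing_flags(demo)


# Invented vendors for illustration only.
vendor_a = VendorDemo("Vendor A", True, True, False, False)
vendor_b = VendorDemo("Vendor B", False, False, True, True)

print(passes_screen(vendor_a))           # True
print(len(agent_washing_flags(vendor_b)))  # 4
```

Passing this screen says only that the product is agentic in the analytical sense; it still has to face the cross-agent and browser-resident class questions from test 3.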

What the Asia-Pacific discovery work tells us

Deloitte’s Asia Pacific Agentic AI Centre, spanning practitioners across India, Malaysia, and Singapore, frames the regional discovery work as an ecosystem activity rather than a single-firm sprint (Business Today, 11 June 2025). The Centre’s frame matters for the discovery question because it surfaces a pattern: shared discovery work across organisations within an industry or jurisdiction can close the readiness gaps faster than every organisation rebuilding the four upstream tests independently.

For the procurement-deck reader: regional or sector-level discovery work that the procuring enterprise can plug into is procurement-relevant signal. It does not replace the four internal tests, but it can shorten the time-to-clear by giving the senior team, the security team, and the workforce-readiness owner a shared external reference to anchor against.

Holding-up note

The primary claim of this piece (that the agentic AI discovery phase upstream of procurement is an organisational-readiness test, not a vendor-evaluation sprint, and that four upstream tests determine whether the procuring enterprise should proceed at all) is on a 60-day review cadence. Three kinds of evidence would move the verdict.

A subsequent Gartner or analogous executive-poll wave compressing the 19/42/31/8 distribution materially would directly update the central numbers without changing the framing. A new analyst framework or academic publication explicitly proposing a discovery-phase methodology that supersedes the four upstream tests would strengthen the field-level discipline and require the framing to be re-read against the alternative. Regulatory action (EU AI Act post-market monitoring, sectoral regulator) imposing a discovery-phase due-diligence requirement on agentic AI deployments would substantively reshape the variable set; the four tests would remain organisationally valid but would acquire a regulatory layer the current framing does not address.

If any land, the Holding-up record for AM-004 captures what changed, dated. Original claim stays visible. Nothing is quietly removed.
