Method: every claim tracked, reviewed every 30–90 days, marked Holding, Partial, or Not holding. Drafted by Claude; signed off by Peter.
AM-210 · pub 9 Jun 2026 · rev 9 Jun 2026 · read 5 min · in Understanding AI

What is agent washing, and how do you test for it

Gartner assesses only about 130 of the thousands of self-described agentic-AI vendors as delivering real capability, while more than 80% of organisations intend to deploy within two years. That gap is the agent-washing window, and the defence is a capability test, not a label.

Holding · reviewed 9 Jun 2026 · next +83d

Bottom line. Agent washing is Gartner’s term for rebranding existing products, such as AI assistants, RPA and chatbots, as agentic AI when they lack substantial agentic capabilities. The scale: of the thousands of vendors using the label, Gartner’s April 2026 Hype Cycle assesses only about 130 as delivering real capability, while 17% of organisations have deployed agents and more than 80% expect to within two years. That gap between intent and deployment is the washing window, and the defence is a capability test applied before contract, not after.

Report. Gartner published its first Hype Cycle for Agentic AI on 2 Apr 2026 and named the problem directly: agent washing, “the rebranding of existing products, such as AI assistants, robotic process automation (RPA) and chatbots, without substantial agentic capabilities.” The numbers around the term do the work. Per the Hype Cycle, 17% of organisations have deployed AI agents, 42% plan to within 12 months and a further 22% within the following year, which puts more than 80% inside two years. On the supply side, of the thousands of self-described agentic vendors, only about 130 are assessed as delivering real agentic capability. On 20 May 2026, Gartner repeated the warning for supply-chain software at its Barcelona symposium, concluding that vendors claiming end-to-end autonomous planning before 2027 are overstating near-term possibilities.
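The arithmetic behind the "more than 80%" figure can be checked directly. The sketch below is in Python and uses only the three shares quoted above. It assumes the two-year figure is the sum of those shares, which is how the numbers line up; the variable names are illustrative, not Gartner's.

```python
# Percentages quoted from the Hype Cycle figures in the paragraph above.
deployed = 17             # organisations that have deployed AI agents
plan_12_months = 42       # plan to deploy within 12 months
plan_following_year = 22  # plan to deploy in the year after that

# Assumption: the "more than 80%" figure is the cumulative sum of the three.
within_two_years = deployed + plan_12_months + plan_following_year

# The washing window: intent that has not yet turned into deployment.
intent_gap = within_two_years - deployed

print(f"deployed or planning within two years: {within_two_years}%")
print(f"intent not yet deployed: {intent_gap} percentage points")
```

On these figures the cumulative share is 81%, and 64 percentage points of that is intent rather than deployment.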

The analyst behind the May warning put the buyer’s position plainly:

“SCP leaders should prepare for an agentic AI future, but they need to separate meaningful capability from market noise. The priority today is not full autonomy, but building the operational discipline, architectural flexibility and decision frameworks that allow agentic AI to scale as the technology matures.”

— Jan Snoeckx, Senior Director Analyst, Gartner, at the Supply Chain Symposium, 20 May 2026.

| Signal | Washed product | Real agentic capability |
| --- | --- | --- |
| Path through the work | pre-drawn flowchart | sequences its own steps toward a goal |
| Action surface | text replies, canned actions | acts through tools on live systems |
| Memory | per-message | carries state across the task |
| When a step fails | errors out, escalates by default | replans, retries, escalates by judgment |
| Honest label | RPA, chatbot, assistant | agent |

The first four rows of the right-hand column condense the capability bar Gartner’s analyses describe; the test below operationalises it.

Why the term exists

Observe. Washing happens where intent runs ahead of verification, and the Hype Cycle numbers describe exactly that market: four in five buyers intend to deploy within two years, one in six has, and the supply side has rebadged to meet the demand. The pattern is not new, cloud washing and AI washing preceded it, but the agentic version is more consequential because the claim is about autonomy: a mislabelled chatbot wastes a licence, while a mislabelled “autonomous” system either disappoints safely as a demo or gets trusted with work it cannot actually carry. Gartner’s standing prediction from June 2025, that over 40% of agentic AI projects will be cancelled by the end of 2027 on escalating costs, unclear business value or inadequate risk controls, is partly the bill for deployments bought on the label.

This is the vendor-claim counterpart of the car-wash test: that read tests whether a model has the robustness its benchmarks imply; this one tests whether a product has the autonomy its marketing implies. A buyer needs both, because a genuine agent on a weak model and a washed product on a strong model fail in different ways.

The buyer’s test

Reflect. Real agentic capability has four observable properties: the system sequences its own steps toward a goal rather than executing a flowchart, acts on the environment through tools, carries state across the task, and handles deviation by replanning rather than erroring out. Each is testable in a procurement demo, on your workflow rather than the vendor’s, with the vendor’s hands off the keyboard. The questions that separate the columns in the table: show me the steps the system decided, versus the steps a developer scripted; break a step mid-run and show me what it does; show me the audit trail of an actual run, the same evidence the procurement playbook and the 60-question RFP read formalise.
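The four properties above can be recorded as a simple checklist during a demo. This is a minimal sketch, not a scoring method from Gartner or from this piece: the class name, field names and the rule that all four properties must be demonstrated are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DemoObservation:
    """Hypothetical record of a hands-off demo run on the buyer's own workflow."""
    sequences_own_steps: bool  # steps decided by the system, not scripted by a developer
    acts_through_tools: bool   # acts on live systems, not only text replies
    carries_state: bool        # keeps state across the task, not per message
    replans_on_failure: bool   # replans when a step breaks, rather than erroring out


def missing_properties(obs: DemoObservation) -> list[str]:
    """Return the names of the properties the demo did not show."""
    return [f.name for f in fields(obs) if not getattr(obs, f.name)]


def verdict(obs: DemoObservation) -> str:
    """Assumed rule: 'agentic' only if all four properties were demonstrated."""
    missing = missing_properties(obs)
    if not missing:
        return "agentic: all four properties demonstrated"
    return "not demonstrated: " + ", ".join(missing)


# Example: a chatbot with tool access that showed none of the other properties.
washed = DemoObservation(
    sequences_own_steps=False,
    acts_through_tools=True,
    carries_state=False,
    replans_on_failure=False,
)
print(verdict(washed))
```

Listing the missing properties by name, rather than returning a single score, keeps the result in a form that can be written into a statement of work.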

Share thoughts. Treat the label as marketing and the test as the contract. Run the deviation test before signing. Write the demonstrated capabilities, not the brochure’s, into the statement of work, per the vendor-contract gotchas. Score the deployment, not the demo, with GAUGE before scaling it; that is the discipline the McKinsey scaling-gap read shows separates the cohorts. And keep Gartner’s own humility in frame: if full autonomy in a domain as instrumented as supply-chain planning is overstated before 2027, a vendor selling you full autonomy today in a messier domain is describing their roadmap, not their product. Buy the bounded version; it is the one that works.
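The deviation test ("break a step mid-run and show me what it does") amounts to fault injection. The sketch below shows the idea in Python under loud assumptions: `BreakOnCall`, `scripted_runner` and `retrying_runner` are hypothetical names, and both runners are toy stand-ins, not real products or agents. A retry is only the simplest recovery behaviour; genuine replanning involves more than this.

```python
from typing import Callable, List


class BreakOnCall:
    """Wraps a tool and makes one chosen call fail, to simulate a broken step."""

    def __init__(self, tool: Callable[[str], str], fail_on: int) -> None:
        self.tool = tool
        self.fail_on = fail_on
        self.calls = 0

    def __call__(self, arg: str) -> str:
        self.calls += 1
        if self.calls == self.fail_on:
            raise RuntimeError(f"injected failure on call {self.calls}")
        return self.tool(arg)


def scripted_runner(tool: Callable[[str], str], steps: List[str]) -> List[str]:
    """Toy stand-in for a washed product: fixed steps, no recovery."""
    return [tool(step) for step in steps]


def retrying_runner(tool: Callable[[str], str], steps: List[str],
                    max_attempts: int = 2) -> List[str]:
    """Toy stand-in for recovery: retry a failed step and keep an audit trail."""
    trail: List[str] = []
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                trail.append(tool(step))
                break
            except RuntimeError as err:
                trail.append(f"failed attempt {attempt}: {err}")
        else:
            # Every attempt failed: hand the step to a human.
            trail.append(f"escalated: {step}")
    return trail


def lookup(step: str) -> str:
    """Hypothetical tool that always succeeds unless the wrapper breaks it."""
    return f"done: {step}"


steps = ["fetch order", "check stock", "issue refund"]

try:
    scripted_runner(BreakOnCall(lookup, fail_on=2), steps)
    scripted_outcome = "completed"
except RuntimeError:
    scripted_outcome = "errored out"

trail = retrying_runner(BreakOnCall(lookup, fail_on=2), steps)
print(scripted_outcome)
for line in trail:
    print(line)
```

The trail from the second runner is the kind of evidence the test asks for: it shows the injected failure, what the system did next, and whether the task finished.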

Holding-up note

The primary claim of this piece (that agent washing is the dominant noise source in the 2026 agentic market, with only about 130 of thousands of vendors assessed by Gartner as delivering real capability against more-than-80% two-year deployment intent, and that a pre-contract capability test for goal-directed multi-step autonomy is the buyer’s defence) is on a 90-day review cadence. Three kinds of evidence would move the verdict: a subsequent Gartner assessment materially revising the real-capability vendor count or the intent figures; certification or standards (for example, ISO 42001-adjacent profiles) making capability claims independently verifiable, which would obviate the manual test; or market data showing label-bought deployments succeeding at the same rate as test-bought ones, which would falsify the defence half. The Holding-up record for AM-210 captures what changes, dated. Figures are from Gartner’s published research as of 9 Jun 2026.


