Method: every claim tracked, reviewed every 30–90 days, marked Holding, Partial, or Not holding. Drafted by Claude; signed off by Peter.
AM-214 · pub 10 Jun 2026 · rev 10 Jun 2026 · read 11 min · in Risk & Governance

Anatomy of a fabricated statistic: the 52-day life of the Stanford 12/88

On 19 Apr 2026, in an editorial pass meant to remove fabrication, this publication created some: a real IDC finding fused with Stanford's name and invented methodology. The figure reached 30 articles, eight claim texts and a podcast episode before full-text source extraction caught it on 10 Jun 2026. The complete record.

Holding · reviewed 10 Jun 2026 · next +28d

Bottom line: This publication published a statistic that does not exist. It entered on 19 Apr 2026 inside an editorial rewrite meant to remove fabricated content, propagated to 30 published articles and 8 immutable claim texts, and survived 52 days of scheduled reviews because those reviews verified that the source URL was live, not that the figure was in the source. On 10 Jun 2026, full-text extraction of the Stanford Digital Economy Lab Enterprise AI Playbook disproved it; the same day, 8 claim verdicts changed, the Claim Archive recorded its first retraction, and roughly 120 occurrences were restated or softened. This is the complete record, tracked as AM-214.

A publication whose differentiator is verification discipline fabricated a statistic, ran it as a signature exhibit for seven weeks, and was eventually caught by its own ledger. That sentence is the least flattering accurate description of what follows, so it goes first. The rest of this piece is the forensic trail: where the figure came from, how it spread, why the review system missed it, what caught it, and what changed. Nothing in the affected corpus was silently deleted; every change carries a dated correction entry, and the corrections index lists them all.

The timeline

Date | Event
18 Apr 2026 | WordPress-era corpus migrated into the current system. It carries scattered vendor ROI claims and unrelated 88-percent figures, but no 12/88 split and no Stanford attribution.
19 Apr 2026 | The figure is born in commit 0453563, a rewrite of the AM-022 article that was itself part of the anti-slop remediation pass: a two-cluster ROI segmentation (“roughly 12%, roughly 88%”) is attributed to the Stanford DEL playbook, which contains no such segmentation.
19 to 20 Apr 2026 | Two more article rewrites repeat it. It becomes a Claim Archive entry (ACA-2026-003) with a Wayback snapshot of a PDF that does not say it, and enters the immutable claim text of AM-024.
24 Apr 2026 | The signature article ships under AM-029. The hedge is gone. The figure now has invented precision: a 12-to-18-month measurement window, “no middle cluster”, and co-authors the real report does not have.
Apr to Jun 2026 | Propagation: 30 published articles, 8 claim texts, a published podcast episode, the GAUGE framework’s calibration note, public CC-BY data exports, two welcome-sequence emails, and a journalist pitch sent in April 2026.
10 Jun 2026, morning | Re-review batch 1 returns Holding for AM-024 after confirming the cited sources are live and unrevised. The 12/88 re-anchoring is deferred to the next cycle.
10 Jun 2026, afternoon | Re-review batch 2 applies full-PDF text extraction to the Stanford source. The figure is absent. AM-029 goes to Not holding; the morning verdict on AM-024 is reversed the same day.
10 Jun 2026, same day | Full exposure map and corpus-wide cascade: 8 claim verdicts changed, the archive’s first retraction, roughly 120 occurrences restated or softened, the signature article restated at its original URL under AM-213, GAUGE changelog v1.0.2.
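
The durations quoted in this piece follow from the timeline dates and can be recomputed. A minimal sketch, using only the dates stated above; the variable names are illustrative:

```python
from datetime import date

# Dates taken from the timeline above.
introduced = date(2026, 4, 19)         # figure first appears (commit 0453563)
signature_shipped = date(2026, 4, 24)  # signature article ships under AM-029
caught = date(2026, 6, 10)             # full-text extraction finds the figure absent

days_live = (caught - introduced).days
days_as_signature = (caught - signature_shipped).days

print(days_live)                         # 52
print(days_as_signature)                 # 47
print(round(days_as_signature / 7, 1))   # 6.7, i.e. about seven weeks
```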

What the figure was made of

The fabricated statistic was not invented from nothing, which is part of why it read as plausible. It was a chimera assembled from four real parts:

  • A real finding with the same numerals. IDC research commissioned by Lenovo found that 88% of AI proof-of-concepts never reach widescale deployment, with four of every 33 graduating to production, per CIO.com’s reporting of 25 Mar 2025. That is a pilot-graduation rate. The fabrication kept the 12/88 numerals and changed what they measure.
  • A real institution’s name. The Stanford DEL Enterprise AI Playbook (Elisa Pereira, Alvin Wang Graylin and Erik Brynjolfsson, April 2026) is real and was cited correctly elsewhere in the corpus. It studies 51 successful deployments across 41 organizations. It contains no failure distribution of any kind.
  • A vendor benchmark’s ROI flavour. Vendor adoption reports in circulation at the time carried high headline ROI averages with high-performing tails. The fabrication borrowed that shape for its “300%-plus cohort”, a figure no primary source attaches to a 12/88 split.
  • Invented methodology. The 12-to-18-month measurement window, the “no middle cluster” bimodality, the dataset description, and in one article the co-author names were all invention. One piece later inflated the dataset to “approximately 600 deployments”, contradicting the corpus’s own n=51 elsewhere.

Each component alone would have been checkable. Fused, the figure carried the credibility of its real parts. That is the anatomy worth understanding, because it is not specific to this publication: it is what plausible fabrication looks like in any AI-assisted research pipeline.
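
The numerals in the real finding reconcile with each other, which is part of why they were easy to borrow. A short check, assuming (as the reporting implies) that “four of every 33” and “88%” are two statements of the same proof-of-concept graduation ratio:

```python
# Figures from the IDC/Lenovo finding as reported: 4 of every 33 POCs graduate.
graduated, observed = 4, 33

graduation_rate = graduated / observed
print(f"{graduation_rate:.1%}")      # 12.1%
print(f"{1 - graduation_rate:.1%}")  # 87.9%
```

Both numbers describe how many pilots reach production. Neither says anything about how return on investment is distributed among deployments, which is the measurement the fabrication attached to them.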

How far it traveled

The 10 Jun 2026 exposure map (the internal investigation document, preserved in the repository at docs/editorial/stanford-1288-exposure-map-2026-06-10.md) counted the spread: 30 published enterprise articles, with the figure as the load-bearing thesis in three of them and the lead exhibit in a fourth. Eight claims carried it inside their immutable claim text. It anchored the calibration note of the publication’s own GAUGE framework. It shipped in a published podcast episode, in the public CC-BY data exports of the Claim Archive, in two welcome-sequence emails, and in a journalist pitch sent in April 2026.

Two mechanisms drove the spread, and neither was carelessness in the ordinary sense. The first is internal cross-citation: each new article cited earlier ones as corroboration, so the figure hardened with every use. The second is that the publication’s own discipline amplified it. The claim-tracking system did exactly what it promises, registering the figure as a tracked claim with a source snapshot, which gave it the appearance of having been verified. A fabricated statistic inside a verification system inherits the system’s credibility. That is the uncomfortable observation this piece exists to put on the record.
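
The cross-citation mechanism can be made concrete by resolving every citation chain to the primary source it ends at. The sketch below is illustrative only: the article identifiers, the graph and the `FIGURE_FOUND_IN_PRIMARY` table are hypothetical, not the publication's data or tooling.

```python
from typing import Dict, List, Optional, Set

# Hypothetical citation graph: each article lists what it cites for the figure.
CITES: Dict[str, List[str]] = {
    "article-c": ["article-b", "article-a"],
    "article-b": ["article-a"],
    "article-a": ["primary-pdf"],
}

# Hypothetical source-text results: True only if the figure was located
# in the extracted text of that primary source.
FIGURE_FOUND_IN_PRIMARY: Dict[str, bool] = {"primary-pdf": False}


def primary_roots(doc: str, seen: Optional[Set[str]] = None) -> Set[str]:
    """Follow citations from doc and return the primary sources they end at."""
    seen = set() if seen is None else seen
    if doc in seen:  # guard against citation cycles and repeated visits
        return set()
    seen.add(doc)
    if doc in FIGURE_FOUND_IN_PRIMARY:
        return {doc}
    roots: Set[str] = set()
    for cited in CITES.get(doc, []):
        roots |= primary_roots(cited, seen)
    return roots


def is_grounded(doc: str) -> bool:
    """Grounded only if at least one root actually contains the figure."""
    return any(FIGURE_FOUND_IN_PRIMARY[root] for root in primary_roots(doc))


print(len(CITES["article-c"]))     # 2 apparent corroborations
print(primary_roots("article-c"))  # {'primary-pdf'}: one root behind both
print(is_grounded("article-c"))    # False
```

Counting citations overstates support; counting distinct roots, and checking each root's text, does not.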

There is also a contamination question that extends beyond this site. Versions of “88% of agents fail, 12% succeed” now circulate in 2026 trade coverage with rotating attributions, and some of that circulation is plausibly downstream of this publication’s own distribution, which included one of the site’s most AI-cited URLs. Readers who picked the figure up from here and republished it deserve this correction explicitly: no primary source documents a 12/88 ROI distribution, from Stanford or anyone else.

Why 52 days of reviews missed it

Between 19 Apr 2026 and 10 Jun 2026 the affected claims passed through scheduled re-reviews repeatedly, and the figure survived every one. The reviews were not skipped. They were performed against the wrong standard: each verified that the cited source URLs were live and unrevised, which the Stanford PDF always was. Confirming HTTP 200 on a PDF tells you nothing about whether the PDF says what your article claims it says.

The ledger preserves the failure in its sharpest form. On the morning of 10 Jun 2026, re-review batch 1 returned Holding for AM-024 after confirming its sources were live, deferring the 12/88 re-anchoring to a later cycle. That same afternoon, batch 2 extracted the full text of the 116-page Stanford PDF and searched it. Every 88 in the document measures something else: 88% of organizations using AI in at least one function, a single company’s 88% coding-productivity gain, and data unlocked in 88% of the report’s cases (all in the source PDF). The word bimodal appears zero times. There is no 300%-plus cohort and no break-even body. The morning verdict was reversed within hours, and both verdicts are preserved in the claim’s record, because a correction system that smooths over its own same-day reversal is not a correction system.
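
The afternoon check amounts to a keyword-in-context search over extracted text. A minimal sketch, assuming the PDF text has already been extracted to a string by whatever extractor is in use; the function names are illustrative and `sample` is a hypothetical stand-in that paraphrases the three usages described above, not a quotation from the report:

```python
import re
from typing import List


def contexts(text: str, pattern: str, width: int = 40) -> List[str]:
    """Return each regex match with up to `width` characters of context per side."""
    snippets = []
    for match in re.finditer(pattern, text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        # Collapse whitespace so line breaks from extraction do not split snippets.
        snippets.append(" ".join(text[start:end].split()))
    return snippets


def count_term(text: str, term: str) -> int:
    """Case-insensitive whole-word count of `term`."""
    return len(re.findall(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE))


# Hypothetical stand-in for extracted text (paraphrase, not the report's wording).
sample = (
    "88% of organizations use AI in at least one function. "
    "One company reported an 88% coding-productivity gain. "
    "Data was unlocked in 88% of cases."
)

for snippet in contexts(sample, r"\b88\s?%"):
    print(snippet)  # a reviewer reads each hit to see what the 88 measures

print(count_term(sample, "bimodal"))  # 0
```

The search only surfaces candidates. Deciding what each 88 measures is still a reading task, which is why the snippets carry surrounding context.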

The correction, by the numbers

Peter ran the full protocol on 10 Jun 2026 rather than the scoped version, on the reasoning that correcting only the load-bearing instances is the “silent fixing” this publication disavows. The committed record:

  • 8 claim verdicts changed. AM-029, the signature claim, to Not holding. Seven claims to Partial: AM-024, AM-031, AM-040, AM-042, AM-129, AM-132 and AM-201, each with a dated correction entry explaining which leg failed. Ten further claim notes were re-anchored without verdict changes.
  • The Claim Archive’s first retraction. ACA-2026-003, the archive entry that registered the figure as a tracked primary-source claim, is now status Retracted with a full review memo: the archived claim text was never present in the archived source. The public archive data exports were regenerated.
  • Roughly 120 occurrences across 30 article bodies restated or softened: about 50 restated on the verified IDC/Lenovo graduation metric with the explicit not-an-ROI-distribution distinction, about 70 softened to qualitative language. The invented “600 deployments” dataset and an invented report title were removed.
  • The signature article restated at its original URL. The why-88 piece now opens with its correction notice and rests on the verifiable figure the fabricated one shadowed, tracked as the new claim AM-213. The original claim text stays visible in the ledger under AM-029.
  • Framework provenance corrected. The GAUGE framework’s v1.0 changelog claimed its weights were calibrated against the Stanford distribution. Changelog v1.0.2 withdraws that citation: the weights were editorial judgment, unchanged, and the amendments page records why.
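
The headline counts in this list can be recomputed from its own parts. A small tally, using only the claim IDs and approximate figures stated above:

```python
# Claim IDs as listed in the correction record above.
not_holding = ["AM-029"]
partial = ["AM-024", "AM-031", "AM-040", "AM-042", "AM-129", "AM-132", "AM-201"]
verdicts_changed = len(not_holding) + len(partial)

restated_on_idc = 50  # "about 50" restated on the IDC/Lenovo graduation metric
softened = 70         # "about 70" softened to qualitative language
occurrences = restated_on_idc + softened

print(verdicts_changed)  # 8
print(occurrences)       # 120, the "roughly 120" in the record
```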

After the cascade the public ledger stood at 232 Holding, 22 Partial and 7 Not holding. Three surfaces remain open and are tracked: the published podcast episode’s descriptions still carry the figure pending correction; the live welcome-email automations still need the corrected copies from the repository pasted in; and downstream third-party circulation cannot be edited, only answered with this record.

What the verified figure actually supports

The restatement is not a retreat into vagueness; the real figure is useful, it just measures something else. IDC’s research, per CIO.com, found 88% of observed POCs do not make the cut to widescale deployment, and its stated root causes are organizational: unclear ROI, insufficient AI-ready data, and a lack of in-house expertise. Ashish Nadkarni, group vice president at IDC, told CIO.com on 25 Mar 2025: “Most of these gen AI initiatives are born at the board level. And a lot of this panic-driven thinking is what caused a lot of these initiatives.”

That quote survives full-text verification. It took one fetch of the article to confirm. The cost asymmetry is the lesson: verifying the quote took minutes; not verifying the 12/88 cost 30 articles, 8 claim corrections, a retraction and this piece.

What this means beyond this publication

The reflex reading of this incident is that an AI-written publication fabricated a statistic, which confirms whatever the reader already believed about AI-written publications. The more useful reading is about the verification standard, because the failure mode is not exclusive to AI writing. The 12/88 passed seven weeks of human-approved review cycles that checked source liveness. Citation chains in analyst notes, board decks and trade coverage run on the same standard: the link resolves, the institution is real, the number is repeated. The publication’s own analysis of unverified citation chains described exactly this mechanism in the wild, while running a fabricated figure as its lead exhibit. The thesis was proved on its own author.

For organizations consuming AI-generated or AI-assisted research, the procurement-grade question to ask any vendor, analyst or publication is concrete: does your verification process locate the cited figure in the extracted text of the primary source, or does it confirm the source exists? Those are different controls, and the gap between them is where this incident lived for 52 days. A URL check validates the citation’s formatting. Only a source-text check validates the citation.

The process fix, and the standing invitation

The fix is one rule, mandatory since 10 Jun 2026: verifying a figure means locating it in the extracted text of the primary source. Applied to the very next re-review batch the same day, it caught three more figure-not-in-source cases that URL checks had passed for weeks: a Forrester prediction quoted as a present-tense fact, an enforcement statistic findable in no source at all, and a customer attributed to the wrong vendor. All three are logged as Partial in the public ledger with their corrections. The rule finds real failures at a measurable rate, which is the strongest argument that the old standard was the vulnerability.
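
The difference between the old standard and the new rule can be expressed as two separate checks on the same record. This is a sketch under stated assumptions, not the publication's review tooling: `SourceCheck` and its fields are hypothetical, and the example text is an invented stand-in for an extracted source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceCheck:
    http_status: int             # result of fetching the cited URL
    extracted_text: str          # full text extracted from the fetched source
    required_phrases: List[str]  # phrases the claim needs the source to contain


def _normalize(text: str) -> str:
    """Lower-case and collapse whitespace so line breaks do not hide a match."""
    return " ".join(text.lower().split())


def passes_liveness(check: SourceCheck) -> bool:
    """Old standard: the source resolves."""
    return check.http_status == 200


def missing_phrases(check: SourceCheck) -> List[str]:
    """New standard: list required phrases absent from the extracted text."""
    haystack = _normalize(check.extracted_text)
    return [p for p in check.required_phrases if _normalize(p) not in haystack]


def passes_source_text(check: SourceCheck) -> bool:
    """The source must resolve and contain every required phrase."""
    return passes_liveness(check) and not missing_phrases(check)


# Hypothetical record: a live source that lacks one term the claim depends on.
check = SourceCheck(
    http_status=200,
    extracted_text="The study covers 51 successful deployments\nacross 41 organizations.",
    required_phrases=["51 successful deployments", "bimodal"],
)

print(passes_liveness(check))     # True
print(missing_phrases(check))     # ['bimodal']
print(passes_source_text(check))  # False
```

Substring matching is a floor, not a ceiling. It catches a figure that is absent, but a case like a prediction quoted as a present-tense fact still needs a reviewer to read the located passage.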

What this incident does not do is settle whether the publication’s model works. It is one severe failure caught by the system’s own instruments, 52 days late. Readers should weigh both halves of that sentence, and the ledger makes both checkable: every claim this publication has ever tracked, including the eight this incident moved and the one this article asserts, is public at /holding/, with append-only correction logs. The editorial standards, including what changed after this incident, are at /standards/. Check any claim. That invitation was the product before this happened; after it, the invitation is the evidence.

Holding-up note

The primary claim of this piece, tracked as AM-214: the correction record for the fabricated 12/88 statistic is complete across the written corpus and public, with 8 claim verdicts changed, the archive’s first retraction, and roughly 120 occurrences restated or softened, none silently deleted. Because the claim is about record-completeness, it sits on a 30-day first review instead of the usual 90. Evidence that would move the verdict:

  • Any occurrence of the 12/88-as-ROI-distribution figure found uncorrected in the written corpus.
  • Any affected correction entry found edited rather than appended.
  • The open outbound surfaces (podcast descriptions, live welcome-email copies) still uncorrected at first review.
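
The second condition, edited rather than appended, is mechanically checkable if two snapshots of a correction log are available. A minimal sketch; the `Entry` shape and the sample entries are hypothetical, not the ledger's actual format:

```python
from typing import List, Tuple

Entry = Tuple[str, str]  # (date, text): hypothetical shape of a correction entry


def is_append_only(earlier: List[Entry], later: List[Entry]) -> bool:
    """True if `later` keeps every entry of `earlier`, unchanged and in order, as its prefix."""
    return len(later) >= len(earlier) and later[: len(earlier)] == earlier


# Hypothetical snapshots of one claim's correction log.
snapshot_1 = [("10 Jun 2026", "Verdict: Holding")]
appended = snapshot_1 + [("10 Jun 2026", "Verdict reversed after full-text extraction")]
edited = [("10 Jun 2026", "Verdict: Partial")]

print(is_append_only(snapshot_1, appended))  # True
print(is_append_only(snapshot_1, edited))    # False
```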
