Corrections
The public record of every correction applied to an Agent Mode AI article. When a tracked claim moves from Holding to Partial or Not holding, the correction is dated and listed here — and never removed. If a piece is retracted entirely (fabricated sources, unfixable errors), that goes on /retractions/ instead.
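The policy above can be read as an append-only log. The following is a minimal sketch of that reading, not the site's actual implementation: the names `Status`, `Correction`, and `CorrectionLog` are hypothetical, and the rule that only downgrades are logged is an assumption drawn from the paragraph above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    HOLDING = "Holding"
    PARTIAL = "Partial"
    NOT_HOLDING = "Not holding"


@dataclass(frozen=True)  # frozen: a recorded correction cannot be edited afterwards
class Correction:
    when: date
    kind: Status
    claim: str
    note: str


class CorrectionLog:
    """Hypothetical append-only log: entries can be added and read, never removed."""

    def __init__(self) -> None:
        self._entries: list[Correction] = []

    def record(self, when: date, kind: Status, claim: str, note: str) -> Correction:
        # Assumption: only moves to Partial or Not holding are logged as corrections.
        if kind is Status.HOLDING:
            raise ValueError("a correction must be Partial or Not holding")
        entry = Correction(when, kind, claim, note)
        self._entries.append(entry)
        return entry

    def entries(self) -> tuple[Correction, ...]:
        # Newest first, like the table; a tuple keeps callers from mutating the log.
        return tuple(sorted(self._entries, key=lambda e: e.when, reverse=True))
```

The class deliberately exposes no delete or update method, which is the "never removed" property expressed in code.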
| Date | Kind | Claim / Article | Note |
|---|---|---|---|
| 10 Jun 2026 | Partial | DMAIC for agentic AI deployment: why the 87% / 27% success gap reflects measurement discipline, not methodology | Primary-source erosion on the headline statistic. The Gravitex page (gravitexgenesys.com/blog/ai-agents-lean-six-sigma-automating-dmaic) no longer carries the 87%/27% success-rate comparison: when checked on 10 Jun 2026, the URL served Six Sigma course-offering content with no AI-deployment success-rate data. A web search found no independent source corroborating the 87/27 pair. The claim's interpretive reading (the gap reflects pre-existing measurement discipline; ISO 9001/ITIL/SRE/HACCP produce the same four conditions) is unaffected and remains supported by Gartner's 7 Apr 2026 I&O survey (57% of failures cited 'too much too fast'). Status Up -> Partial until the 87/27 figure can be re-anchored to a retrievable primary source. |
| 10 Jun 2026 | Partial | The unverified citation chain: where enterprise AI decisions actually come from | Claim-text figure unanchored — a same-day reversal of the morning verdict, recorded rather than smoothed over. The batch-1 re-review earlier on 10 Jun 2026 returned Holding after confirming the cited sources were live and unrevised (including the Stanford DEL playbook PDF at its URL) and deferred re-anchoring the 88% figure to the next cycle. A batch-2 full-text extraction the same afternoon disproved the figure: the playbook contains no 88% failure rate and no 12/88 ROI distribution (full finding at AM-029, correction of 10 Jun 2026). URL-liveness is not figure-verification; the morning verdict does not survive the afternoon evidence. The claim's spine (enterprise-AI decisions run on citation chains nobody verifies) holds and is itself illustrated by this incident, but its quantitative anchor — 'the 88% failure rate in enterprise agentic AI' — has no verifiable source as a deployment-failure or ROI distribution. The only verified figure carrying the same numerals is IDC's pilot-graduation finding (roughly 88% of AI proof-of-concepts never reach production; via CIO.com, Mar 2025), a different metric. Status Up -> Partial, same day as the batch-1 Holding verdict. |
| 10 Jun 2026 | Not holding | Why 88% of agentic AI deployments fail | Primary-source verification failed on the headline figure. The claim attributes a 12/88 bimodal ROI distribution (12% of enterprise agentic AI deployments clearing 300%-plus ROI, 88% at or below break-even at 12-18 months) to the Stanford Digital Economy Lab Enterprise AI Playbook (Apr 2026). The full report text contains no such distribution and no ROI-realisation failure data: it is a study of 51 successful deployments (Pereira, Graylin, Brynjolfsson), in which the only 88% figure is '88% of organizations use AI in at least one function' and the only 12% figure is a sponsor-engagement category. The report's design cannot yield a failure distribution. A web search found no Stanford DEL publication reporting 12/88. Gartner (28% of AI I&O projects fully pay off, Apr 2026) and McKinsey (6% high performers, Nov 2025) document a small-tail pattern but do not corroborate the specific 12/88 split the claim asserts. Status Up -> Not holding. The article remains published with this correction log; the governance-over-capability argument is re-anchored, where it appears elsewhere in the corpus, to figures that survive verification. |
| 10 Jun 2026 | Not holding | Why 88% of agentic AI deployments fail | Article restated. The piece at the source URL was rewritten the same day, at the same slug, on the verifiable counterpart figure the fabricated one shadowed: IDC research commissioned by Lenovo (CIO Playbook 2025, Feb 2025; global survey n=2,920) reporting 88% of AI proof-of-concepts failing to reach production, with 4 of every 33 POCs (roughly 12%) graduating, per CIO.com (25 Mar 2025). The restated article asserts a new tracked claim, AM-213, and opens with a correction notice pointing back to this record. This claim stays Not holding as the permanent record of the fabricated attribution; its claim text is unchanged. |
| 10 Jun 2026 | Partial | The CMU 30.3%: the enterprise agent capability gap | One leg unanchored on re-review. The CMU capability figures verify cleanly (30.3% full completion for Gemini 2.5 Pro and 39.3% partial-credit on the 175-task TheAgentCompany set per paper v2; 24% for Claude 3.5 Sonnet in the Dec 2024 v1). The Stanford DEL '12% durable cohort' referenced in the claim text does not exist in the cited source: the Enterprise AI Playbook (Pereira, Graylin, Brynjolfsson, Apr 2026) studies 51 successful deployments and contains no 12/88 ROI cohort (full finding at AM-029, correction of 10 Jun 2026). The claim's capability-constraint argument holds on its own evidence; the sentence tying the constraint to the 12%/88% cohort behaviour has no verifiable referent. Status Up -> Partial. |
| 10 Jun 2026 | Partial | The EU AI Act and agentic AI: what August 2026 actually requires | The 2 Aug 2026 high-risk-obligations leg is overtaken by the EU Digital Omnibus. Council and Parliament reached a provisional agreement on 6-7 May 2026 (confirmed by the Council 13 May 2026) deferring Annex III stand-alone high-risk-system obligations to 2 Dec 2027 and Annex I embedded high-risk obligations to 2 Aug 2028; formal adoption is expected before 2 Aug 2026. What still lands on 2 Aug 2026 is narrower than the claim asserted: Article 50 transparency obligations (with a watermarking grace period to 2 Dec 2026 for systems already in market) plus the penalties and governance architecture. The claim's structural argument is unchanged and current: agentic deployments still do not generate the evidence-of-action layer (Article 12 logs, Article 14 oversight records, post-market monitoring, incident reporting) by default, and building it post-hoc remains the failure mode; the deferral changes the deadline, not the gap. Status Up -> Partial. |
| 10 Jun 2026 | Partial | Anthropic vs OpenAI vs Google vs Microsoft for enterprise agents in 2026 | Extracted-text verification failed on the governance axis of the claim. 'Anthropic's three-cloud BAA position is structurally distinct' overstates Anthropic's own coverage: per Anthropic's BAA documentation (privacy.claude.com, retrieved 10 Jun 2026), Anthropic signs BAAs for the first-party API and HIPAA-ready Claude Enterprise only; the page contains no Bedrock, Vertex, or Azure coverage. Claude consumed via AWS Bedrock or Google Vertex AI is covered by the hyperscaler's BAA, and an Azure-side Anthropic BAA could not be verified. The article's stronger formulation ('Anthropic operates under BAAs with Amazon Web Services, Google Cloud, and Microsoft Azure simultaneously', cited only to a secondary Ampcome blog) is the same overstatement. The substantive point survives restated: BAA-covered Claude deployment surfaces span more clouds than competitors offer, but the BAAs are not Anthropic's across three clouds. This matches the AM-053 correction of the same day. The pricing axis ($0.08 per session-hour plus tokens; Agents SDK no first-party runtime fee) and the ecosystem axis verify and stand. Status Up -> Partial. Article body needs a Peter-approved BAA restate (FAQ x2, body x3, howTo step 1). |
| 10 Jun 2026 | Partial | The State of Enterprise Agentic AI 2026 | One component unanchored on re-review. The claim text cites 'the 12% Stanford DEL high-ROI cohort' against 'the remaining 88-94%'. Full-text verification on 10 Jun 2026 found the Stanford DEL Enterprise AI Playbook contains no 12% high-ROI cohort and no ROI distribution of any kind — it studies 51 successful deployments by design (full finding at AM-029, correction of 10 Jun 2026). The McKinsey 6% AI-high-performer figure (Nov 2025, n=1,993) verifies independently, as do the IAM-posture and EU AI Act runway components. The 'stable bimodal distribution' framing is supported only as a small-tail/large-body shape (Gartner: 28% of AI I&O projects fully paying off; McKinsey: 6% high performers), not as the two-cluster distribution the claim names. The only verified figure carrying the 12/88 numerals is IDC's pilot-graduation finding (roughly 88% of AI proof-of-concepts never reach production; via CIO.com, Mar 2025), a different metric. Status Up -> Partial. |
| 10 Jun 2026 | Partial | The agentic AI readiness diagnostic: 10 questions for the high-performing tail | One of two cohort anchors unanchored on re-review. The claim text pairs McKinsey's 6% AI-high-performer cohort with 'the 12% high-ROI cohort identified by the Stanford Digital Economy Lab' and an '88-94% struggling cohort'. Full-text verification on 10 Jun 2026 found the Stanford DEL Enterprise AI Playbook identifies no 12% high-ROI cohort and no struggling-cohort percentage — it studies 51 successful deployments by design (full finding at AM-029, correction of 10 Jun 2026). The source article additionally cites a 'Stanford Digital Economy Lab 2026 Enterprise AI Productivity Study', a publication title that does not exist. The McKinsey 6% leg verifies (Nov 2025, n=1,993). The ten diagnostic questions stand as an editorial instrument, but the claim's cohort calibration now rests on one verified dataset, not two. The only verified figure carrying the 12/88 numerals is IDC's pilot-graduation finding (roughly 88% of AI proof-of-concepts never reach production; via CIO.com, Mar 2025), a different metric. Status Up -> Partial. |
| 10 Jun 2026 | Partial | The Head of AI Governance role specification, 2026 | Extracted-text verification failed on two parts of the claim. (1) The claim asserts the role 'is now a named operating role in 60% of Fortune 100 enterprises per Forrester's 2026 Enterprise AI Predictions'. Forrester's actual prediction reads '60% of Fortune 100 companies will appoint a head of AI governance in 2026' (Predictions 2026, quoted by CIO Dive, 16 Dec 2025) — a forecast of appointments during 2026, not a measurement of current adoption. The cited sourceUrl (forrester.com/blogs/the-ai-cio-will-govern-outcomes-at-scale/, 9 Apr 2026) contains neither the 60% figure nor any Fortune-100 reference; the article's 'In Q1 2026, Forrester's Enterprise AI Predictions found 60% ... had hired or were actively recruiting' sentence has no locatable source. (2) The compensation bands ($250-450K Director base, $400-700K VP, $600K-$1.2M C-level total comp) could not be located in any primary source and are unlabelled in the article. The six-accountabilities convergence and the executive-committee reporting-line argument are editorial synthesis and stand on their own. Status Up -> Partial. |
| 10 Jun 2026 | Partial | HIPAA-compliant agentic AI: the 2026 healthcare playbook | Extracted-text verification failed on two parts of the claim. (1) The asserted 'OCR's 340% spike in AI-related discrimination complaints (logged in 2025)' cannot be located in any primary source: three targeted searches (10 Jun 2026) across HHS OCR publications, the Section 1557 final-rule coverage, enforcement trackers, and trade press surface no AI-specific complaint-volume series from OCR and no 340% figure anywhere. The article attributes the figure directly to OCR with only the OCR homepage as citation. The figure is unanchored and is treated as failed verification, not as pending. (2) 'Anthropic's three-cloud BAA position' is imprecise: per Anthropic's own BAA documentation, Anthropic signs BAAs for the first-party API and HIPAA-ready Claude Enterprise plans; Claude consumed via AWS Bedrock or Google Vertex AI is covered by the hyperscaler's BAA (AWS Artifact; Google Cloud BAA), not by an Anthropic BAA, and an Azure-side Anthropic BAA could not be verified. The deployment-surface breadth is real; the BAA attribution to Anthropic across three clouds is not. The four deployment conditions (BAA-with-subprocessor coverage, dual 164.312(b)+Article-12 logging, minimum-necessary PHI mapping, clinical-correctness drift monitoring) are editorial architecture and stand. Status Up -> Partial. |
| 10 Jun 2026 | Partial | Mid-market agentic AI ROI in 90 days: what the cited data actually supports vs the vendor pitch | One of three read-against anchors unanchored on re-review. The claim text cites 'Stanford Digital Economy Lab Enterprise AI Playbook (12/88 bimodal ROI distribution at 12-18 months)' and frames the realistic ROI band around 'the highest-discipline 12% cohort'. Full-text verification on 10 Jun 2026 found the playbook contains no 12/88 distribution, no bimodal ROI shape, and no 12-18-month ROI measurement point (full finding at AM-029, correction of 10 Jun 2026). The claim's core negative finding — no mid-market enterprise has produced a documented +240% ROI in 90 days under audited conditions — is unaffected; the McKinsey State of AI 2025 and MIT NANDA legs verify and continue to support it. The '12% cohort' framing has no verifiable referent. The only verified figure carrying the 12/88 numerals is IDC's pilot-graduation finding (roughly 88% of AI proof-of-concepts never reach production; via CIO.com, Mar 2025), a different metric. Status Up -> Partial. |
| 10 Jun 2026 | Partial | The two-cohort split in enterprise agentic AI outcomes: why the high-performing tail is structurally distinct | One of four legs unanchored on re-review. The claim text attributes '12% of deployments clearing 300%+ ROI with 88% at or below break-even at 12-18 months' to the Stanford DEL 2026 Enterprise AI Playbook. Full-text verification on 10 Jun 2026 found no such figure in that source: the playbook (Pereira, Graylin, Brynjolfsson, Apr 2026) studies 51 successful deployments by design and contains no ROI distribution, no 300%-plus cohort, and no break-even measurement point (full finding at AM-029, correction of 10 Jun 2026). The only verified figure carrying the same 12/88 numerals is IDC research with Lenovo (via CIO.com, Mar 2025): roughly 88% of AI proof-of-concepts never reach production and roughly 12% graduate — a pilot-to-production graduation metric, not an ROI distribution. The Gartner 28%, McKinsey 23%/17%, and MIT NANDA 95% legs verify; they support a small high-performing tail and a large struggling body, but none documents the two-peak bimodal shape the claim asserts. Status Up -> Partial. |
| 10 Jun 2026 | Partial | Enterprise AI cost and ROI in 2026: what the evidence actually shows | One of four named datasets unanchored on review. The claim text names 'Stanford DEL's 12% clearing 300%+ ROI vs 88% at or below break-even' as one of four independent datasets. Full-text verification on 10 Jun 2026 found the Stanford DEL Enterprise AI Playbook contains no such distribution — it studies 51 successful deployments by design and carries no ROI-realisation failure data (full finding at AM-029, correction of 10 Jun 2026). The McKinsey (23% scaling, 17% EBIT-attribution), Gartner (28% fully paying off), and MIT NANDA (95% no measurable P&L impact) datasets verify; the claim's spine stands on three datasets rather than four. The only verified figure carrying the 12/88 numerals is IDC's pilot-graduation finding (roughly 88% of AI proof-of-concepts never reach production; via CIO.com, Mar 2025), a different metric from an ROI distribution. Status Up -> Partial. |
| 10 Jun 2026 | Partial | AI in the small law firm: what the published 2026 case-study corpus shows | Vendor attribution error in the claim text. The claim names Polley Faith among 'Spellbook with named small-firm customers Westaway, KMSC Law, Polley Faith'. Polley Faith LLP is a Harvey-listed law-firm customer, not a Spellbook customer: the live Spellbook site (now spellbook.com; spellbook.legal 301-redirects) names Westaway, KMSC Law, and McInnes Cooper with no Polley Faith, and the source article's own body correctly places Polley Faith on Harvey's roster — the claim text and the article excerpt bundled it with the wrong vendor at publish. The remaining legs verify against extracted source text on 10 Jun 2026: Anthropic's GC AI customer story carries 'More than 1,500 companies' and '14 hours saved per week on average ... based on a survey of more than 100 active customers' verbatim; Harvey's published roster (Thompson Hine, Fox Rothschild, Lowenstein Sandler, Polley Faith) matches; ABA Formal Opinion 512 remains the governance baseline. The corpus reading (AI ships at 1-to-20 lawyer scale; privileged work stays on Enterprise-tier zero-retention access) is unaffected. Status Up -> Partial. |
| 10 Jun 2026 | Partial | AI client proposals for solo founders: which tools survive a buyer's read | One named member of the generation cluster was already defunct at publication: Tome shut down its presentation/narrative product (Tome Slides) in March 2025 and pivoted to sales tooling, with the brand later sold to AngelList (deckary.com shutdown timeline; signalhub.substack.com post-mortem, both checked 10 Jun 2026). The generation cluster reduces to Pitch + Gamma. The two-cluster thesis itself is unaffected and arguably strengthened: the pure AI-narrative product failed to find a sustainable business while Gamma (70M users, $100M ARR as of Nov 2025) and the assembly cluster (PandaDoc, Better Proposals, Proposify per Luniq 2026 agency comparison) both compound. Status Up -> Partial for the factual error in the tool list. |
| 10 Jun 2026 | Partial | Stack IA pour micro-entrepreneur BNC en France: ce que URSSAF et le plafond de 83 600 € imposent | The micro-regime ceiling figure in the claim text is overtaken: the 2026-2028 triennial revaluation raised the services/BNC ceiling (plafond) from 77 700 € to 83 600 € (mixed-activity global cap 203 100 €), per LégiFiscal's 2026 thresholds bulletin and the URSSAF 2026 thresholds (seuils) announcement; the cited impots.gouv.fr page still presents the 77 700 € figure under its 2023-2025 framing. The claim's structural analysis is unaffected: the 34% flat-rate allowance (abattement forfaitaire) remains fixed (BOI-BNC-DECLA-10-70), AI subscriptions remain non-deductible under micro-BNC, and the velocity-to-ceiling advice stands with the crossover forecast now running against 83 600 €. Secondary note correction: the VAT exemption threshold (franchise en base de TVA) for services in 2026 is 37 500 € base / 41 250 € upper (majoré) (service-public.fr F21746, extracted 10 Jun 2026: 'Pour l'année 2026, les seuils de franchise en base de TVA française applicables restent inchangés' [for 2026, the applicable French VAT exemption thresholds remain unchanged] and the single 25 000 € threshold proposal 'a été abandonnée' [was abandoned]); the 36 800 € figure carried in the article body and prior note was the 2023-2024 value. Status Up -> Partial. Article body needs a Peter-approved threshold refresh: 77 700 € appears in the title, excerpt, supportingFigure, FAQ, and body; 36 800 € in FAQ and body; the slug carries no figure and is unaffected. |
| 10 Jun 2026 | Partial | Colorado's AI law hits June 30: what the SB 189 replacement means for the 1-50 person operator using AI in hiring or client decisions | Trigger condition (2) fired: the effective date moved. Governor Polis signed SB 26-189 on 14 May 2026 (Holland & Knight client alert, May 2026; Seyfarth; Littler). The signed law repeals and reenacts the original Colorado AI Act and its obligations take effect 1 Jan 2027, not 30 Jun 2026 as the claim asserted. No operator obligation starts 30 Jun 2026; the only pre-2027 item is Colorado AG rulemaking due by 1 Jan 2027. The claim's structural reading holds (risk-management programmes and impact assessments dropped for a notice-and-transparency framework; consequential-decision scope covering employment, housing, credit, insurance, education, healthcare; no small-firm exemption). The urgency leg ('obligations from 30 June 2026') is overtaken. Status Up -> Partial. |
| 28 May 2026 | Partial | GPT-5 Pro at $200 a month: what the pricing tier signals to enterprise IT | Pricing/model drift: a $100/mo Pro tier now sits beside the $200 tier (added 9 Apr 2026) and the premium model is GPT-5.5 Pro. Core thesis holds; the single-$200-tier framing no longer matches. Re-verify current tiers at chatgpt.com/pricing. |
| 28 May 2026 | Partial | Notion AI vs ClickUp Brain in 2026: which one earns its seat for a 5-person consultancy | Price drift: Notion Business with bundled AI now about $15/seat annual ($20 monthly) vs cited $19.50; ClickUp Brain now $7/seat vs cited $9. Verdict logic unchanged; figures need updating. |
| 2 May 2026 | Partial | AI in IT operations: what is actually shipping in 2026, and what the savings really look like | Klarna walk-back primary-source upgrade — added Siemiatkowski verbatim quotes via Bloomberg-cited-by-Fortune (9 May 2025) and the Uber-style freelance hiring detail via Entrepreneur. Closes the highest-priority evidence gap from the source dossier. |
| 19 Apr 2026 | Partial | Salesforce's 9,000-person redeployment: the template most enterprises will copy | Anchor verification complete (see audit/ANCHOR_VERIFICATION_2026-04-19.md). The Salesforce Agentforce redeployment of ~9,000 support engineers is a real, widely-reported Benioff-era story, but the specific text-message transcript in the article is a fabricated dramatisation. Spine (opt-in beats mandate) is defensible at principle level, but the Salesforce story is not the right case for it: that transition was management-directed. Rewrite flagged for completion before the 18 Jun 2026 review. |
| 19 Apr 2026 | Partial | Salesforce's 9,000-person redeployment: the template most enterprises will copy | Body rewritten. Fabricated text-message transcript removed. Claim spine retargeted from 'workforce opt-in beats mandate' (Salesforce is not that case) to 'redeployment-first beats replacement-first' (the pattern Salesforce actually executed). Status moves from Partial to Up. Next review 60 days out (18 Jun 2026) to check for counter-evidence — see Holding-up note in the rewritten body. |
| 19 Apr 2026 | Partial | Back-office vs front-office: where agentic AI's economics actually compound | Anchor verification complete (see audit/ANCHOR_VERIFICATION_2026-04-19.md). 'Sarah Chen' and the 2 AM Munich-hotel scenario are fully fabricated — the article's narrative protagonist does not correspond to any real executive. The underlying framework (back-office cost compounding faster than front-office wins; per-action delta × frequency) IS defensible against McKinsey + Futurum operational-AI-ROI data. Rewrite required before the article can move to Holding. |
| 19 Apr 2026 | Partial | Back-office vs front-office: where agentic AI's economics actually compound | Body rewritten. Fabricated 'Sarah Chen' narrative frame removed entirely. Claim spine sharpened: original was 'back-office cost compounding faster than front-office'; new version adds the structural explanation (per-action × frequency × task-specification × measurement instrumentation) and specific 2026 benchmark anchors (Stanford DEL 12%/88%, Gartner 28%, Futurum 71% vs 40%). Status moves from Partial to Up. Cross-links to AM-020 (TCO), AM-021 (measurement discipline), AM-022 (bimodal ROI) explicitly drawn in the body. Next review 18 Jun 2026. |
| 19 Apr 2026 | Partial | Multi-agent systems in manufacturing: the 30% downtime claim, examined | Body rewritten. Original headline number (30% downtime reduction) survives against current case-study data. New analytical spine: the audit-trail architecture separates wins from stalls. Status moved from rewrite-in-progress Partial placeholder to Up. Next review 60 days out because architectural claims age slower than pricing claims. |
| 19 Apr 2026 | Partial | The hidden costs of agentic AI: a CFO's guide to true TCO and ROI modeling | Body rewritten from WP-era slop. Status moves from rewrite-in-progress placeholder to Up. New analytical spine: the TCO underestimate is cross-departmental cost-attribution failure, not hidden costs. Five cost categories named with budget owners. 60-day review cadence. |
| 19 Apr 2026 | Partial | DMAIC for agentic AI deployment: why the 87% / 27% success gap reflects measurement discipline, not methodology | Body rewritten from WP-era slop. Status moves from rewrite-in-progress placeholder to Up. New thesis: the causation runs the opposite direction from the vendor narrative — the measurement discipline was the prerequisite, the methodology name doesn't matter. 60-day review. |
| 19 Apr 2026 | Partial | The agentic AI success formula: what 171% average ROI actually hides | Body rewritten from WP-era slop (7-patterns vendor framework with fabricated case studies). New thesis: bimodal distribution, not normal — the 171% average describes no specific deployment. Business-line kill-switch ownership is the single distinguishing factor. Cross-links to AM-020 + AM-021 on the shared organisational-precondition thread. |
| 19 Apr 2026 | Partial | Google AI Mode restaurant booking: the template for every partner-aggregation vertical | Body rewritten from WP-era slop (the '$50 Billion Revolution' headline and 'act within 90 days' crisis-FOMO framing were both fabrications). New thesis: restaurant booking is a template, not the story. Named 5 enterprise-relevant aggregation verticals (business travel, expense, procurement, ATS, HR service) and the API-backend-vs-destination choice incumbents face. Next review in 60 days. |
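
Several 10 Jun 2026 entries turn on the point that URL-liveness is not figure-verification. The sketch below illustrates that distinction under stated assumptions: the function names are hypothetical, the check runs on already-extracted text rather than fetching anything, and the second sample sentence is invented filler. It is not the review tooling actually used for this log.

```python
import re


def source_is_live(http_status: int) -> bool:
    """Liveness only: the URL answered with a 2xx status."""
    return 200 <= http_status < 300


def sentences_with_figure(extracted_text: str, figure: str) -> list[str]:
    """Return every sentence of the extracted text that contains the figure."""
    sentences = re.split(r"(?<=[.!?])\s+", extracted_text.strip())
    # The lookbehind stops "88%" from matching inside "188%" or "0.88%".
    pattern = re.compile(r"(?<![\d.])" + re.escape(figure))
    return [s for s in sentences if pattern.search(s)]


# Sample text: the first sentence is the wording this log quotes from the
# playbook; the second is invented filler to exercise the lookbehind.
sample = (
    "88% of organizations use AI in at least one function. "
    "Spending rose 188% in the hypothetical filler sentence."
)

live = source_is_live(200)                   # the source responds
hits = sentences_with_figure(sample, "88%")  # candidate sentences for a reviewer
```

Here `live` is true and `hits` holds one sentence, yet that sentence measures AI adoption, not deployment failure. Matching the numerals only narrows down what a reviewer has to read; deciding whether the figure supports the claim remains a human judgement.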