The widely cited 95-percent generative-AI-pilot-failure framing (MIT Sloan Management Review and Boston Consulting Group adoption-research streams, 2025-2026) is methodologically defensible for the enterprise cohort the research sampled (large firms with dedicated AI functions, 12-to-18-month evaluation windows, and a success definition of scaled production deployment), but it materially misrepresents small-firm pilot dynamics. The 1-to-50-person operator cohort has a different failure-mode catalogue (tool-assigned-to-wrong-person, rewrite-cost-exceeds-savings, client-rework-from-AI-deliverable, line-item-stack-compounded-and-cancelled, sporadic-use-no-routine) and a different success definition (90-day payback at actual hourly rate; deliverable quality reaching the client without disproportionate rework; routine fit documented for handover). A three-question Monday-morning small-firm pilot test (payback, deliverable quality, routine fit), checked at 30 days and 60 days, is the operator's actual evaluation instrument; it replaces the enterprise 12-to-18-month evaluation cycle against which the 95-percent number is measured.
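One way to make the three-question test concrete is the minimal sketch below. It is an illustration under stated assumptions, not the register's own instrument: the `PilotCheckpoint` fields, the reading of "payback" as days until net monthly savings cover the one-off setup time, the 30-day month, and the 25 percent rework threshold standing in for "disproportionate rework" are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PilotCheckpoint:
    """Figures gathered at a 30-day or 60-day check (hypothetical schema)."""
    monthly_tool_cost: float       # subscription cost per month
    setup_hours: float             # one-off hours spent getting the tool working
    hours_saved_per_month: float   # time saved before counting rework
    rework_hours_per_month: float  # time spent fixing AI-produced deliverables
    hourly_rate: float             # the operator's actual hourly rate
    routine_documented: bool       # workflow written down well enough to hand over


def payback_days(c: PilotCheckpoint) -> Optional[float]:
    """Days until net savings cover the setup cost; None if the pilot never pays back."""
    net_monthly = (
        (c.hours_saved_per_month - c.rework_hours_per_month) * c.hourly_rate
        - c.monthly_tool_cost
    )
    if net_monthly <= 0:
        # Corresponds to the rewrite-cost-exceeds-savings failure mode.
        return None
    setup_cost = c.setup_hours * c.hourly_rate
    return 30 * setup_cost / net_monthly  # assumes a 30-day month


def three_question_test(c: PilotCheckpoint, max_rework_share: float = 0.25) -> dict:
    """Answer the three questions; max_rework_share is an assumed threshold."""
    days = payback_days(c)
    return {
        "payback": days is not None and days <= 90,
        "deliverable_quality": c.rework_hours_per_month
        <= max_rework_share * c.hours_saved_per_month,
        "routine_fit": c.routine_documented,
    }


# Illustrative numbers only, not data from the piece.
day_30 = PilotCheckpoint(
    monthly_tool_cost=60.0,
    setup_hours=6.0,
    hours_saved_per_month=10.0,
    rework_hours_per_month=2.0,
    hourly_rate=80.0,
    routine_documented=True,
)
print(three_question_test(day_30))
```

Running the same function on the 60-day figures gives a second reading; a pilot that flips from passing to failing between the two checks is one the operator would likely want to look at before renewing.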
Operators register pillar piece on a pilot-evaluation framework for 1-to-50-person firms. The 45-day cadence is calibrated so that the first review falls within the typical 30-to-60-day pilot decision window. Trigger conditions for status changes:
- (1) MIT Sloan, BCG, or a comparable adoption-research stream publishes small-firm-specific (1-to-50-person) pilot-failure data inside the review window. This would either confirm the structural argument that the enterprise framing misclassifies small firms, or refine the small-firm failure-mode catalogue with new evidence. Keep Holding either way unless the evidence directly contradicts the load-bearing claim.
- (2) A small-firm operator survey from Stripe Atlas, Brex, Ramp, or an equivalent publisher of SMB spend-and-adoption data produces cohort-level pilot-outcome data that materially diverges from the five failure modes listed. This would refine the catalogue and could move the claim toward Partial if a sixth or seventh mode dominates the new data.
- (3) A major foundation-model provider publishes operator-cohort case studies with attributable revenue impact at the 1-to-50-person scale. This would harden the success-definition argument by establishing what small-firm pilot success looks like in the public record.
- (4) A viral re-citation of the 95-percent number appears with new methodology that does include small firms. This would update the source-document discussion in the piece.
Sibling claim: AM-146 (three questions for CIOs about agentic AI accuracy claims) addresses the analogous misread problem in the enterprise cohort.
Claim OPS-069 (/holding/OPS-069/)
About this register
The Operators register tracks claims published in practitioner-advisory pieces addressed to solo founders, micro-SMBs, and small businesses of up to around fifty people. Claims are reviewed on a 30–45 day cadence because tooling and SMB-relevant pricing shift faster than enterprise procurement signals.
Recent corrections in Operators
- OPS-036 · Partial · 29 Apr 2026
Initial publication 29 Apr 2026. Status set to Partial at publication because clause 6 commentary references an order-of-magnitude remediation-cost gap derived from the IAPP 2024 AI Governance Profession Report; the report characterises the gap as material but does not publish a precise multiple, so the wording is annotated source: our-estimate. REVIEW: Peter to source a precise figure or amend the commentary.
- OPS-035 · Holding · 29 Apr 2026
Initial publication 29 Apr 2026. Status set to Partial at publication because category 5 lacks the same regulatory/cited-consequence anchor as categories 1-4. REVIEW: Peter to confirm category 5 evidence base and either upgrade to Holding (with strengthened citation) or amend the claim to four categories.
- OPS-034 · Holding · 29 Apr 2026
Initial publication 29 Apr 2026 with status=partial. Cost-side claims (vendor pricing) verifiable against the four cited pricing pages on the publication date. Time-recovery claim (90+ min compressed to ~20 min) drawn from published productivity-blogger benchmarks rather than Peter-run measurement; first-cohort replication on the publication's tracked operator cohort due by 13 Jun 2026. REVIEW: Peter.
Reviews coming up in Operators
- OPS-005 · Holding · next +8d (26 May 2026)
At sub-1M tokens per month (typical SMB agent volume) in 2026, the absolute dollar gap between Claude Haiku 4.5, GPT-4o…
- OPS-003 · Holding · next +8d (26 May 2026)
For a solo founder choosing exactly one consumer AI subscription at around $20/month in 2026, the choice between Claude…
- OPS-002 · Holding · next +8d (26 May 2026)
For a 5-person consultancy already on either Notion or ClickUp in 2026, the AI features alone do not justify a workspac…