The Firefox 150 / Claude Mythos disclosure (November 2025) marks the operational shift in agentic AI code auditing from 'AI can find bugs' (true since 2023, but blocked from production CI by the false-positive rate that earlier read-only GPT-4 / Sonnet 3.5 attempts produced) to 'agentic verification clears the false-positive wall by building and running its own test cases before reporting'; the procurement-deck consequence is that CI-time agentic auditing becomes the default expectation for any shipping enterprise software in 2026, and three derived questions belong in any software-vendor procurement (does the vendor's CI pipeline include an agentic-auditing step; what is the vendor's disclosure posture when bugs are found in their own product by agentic tools; what is the vendor's posture on the dual-use risk that the same pipeline architecture works in reverse, as the reported Anthropic investigation of unauthorized Mythos use via a third-party vendor environment makes explicit).
Claim created at publish; review on 60-day cadence. Anchor sources: Mozilla Hacks blog post on Firefox 150 release (November 2025) covering the Claude Mythos Preview pipeline integration; Schneier on Security coverage of the disclosure; The Decoder coverage including the 15-year-old use-after-free in the <legend> element as the canonical combinatorial-reasoning anchor; SecurityWeek coverage including the Mozilla CTO calibration quote ('elite-human-quality discovery at machine throughput, not superhuman discovery'); CSO Online reporting on Anthropic investigation of unauthorized Mythos use via third-party vendor environment; flyingpenguin.com counter-narrative critique flagged in Schneier comments arguing the '271 zero-days' headline overstates the strict-zero-day count. Methodology caveat: the Firefox 150 release notes individually credit only 3 bugs as 'found with Claude' (two use-after-free, one invalid-pointer-in-wasm); the 271 total flows through rollup CVEs (CVE-2026-6784, 6785, 6786 totalling 316 internally-found bugs), so per-bug attribution at the public CVE level is much smaller than the aggregate. Sister claims: AM-146 (vendor accuracy claims need named task + baseline + methodology; agentic-verification step is the methodology change), AM-007 (vendor-response split for cross-agent class disclosure; the same Cohort A/B framing extends to defensive disclosure of agentic-auditing CI integration), AM-009 (Anthropic Cohort A disclosure pattern for Claude for Chrome; Mozilla's Mythos disclosure follows the same shape on the consuming-vendor side), AM-130 (procurement reader's four evidence classes; Mythos sits in the 'audited customer pilots with active human oversight' class given Mozilla's published methodology), AM-140 (procurement-committee six pre-pilot questions). Trigger conditions to revisit before next cadence: (a) a major enterprise software vendor (Microsoft, Google, AWS, Salesforce, Adobe, etc.) publishes an analogous CI-time agentic-auditing disclosure with named pipeline and named bug counts — extends the named-success cohort and changes 'default expectation' framing materially; (b) a published reproduction of the Mozilla pipeline by an independent third party (academic team, security-research firm) confirming or qualifying the false-positive-wall-falls finding; (c) a public disclosure by Anthropic concluding the unauthorized-Mythos-use investigation, with concrete remediation; (d) the flyingpenguin.com strict-zero-day critique gains traction in security-research literature and reframes the disclosure scope; (e) regulatory action (EU AI Act post-market monitoring, US FTC, sectoral regulator) imposing mandatory agentic-auditing CI requirements on shipping software.
/holding/AM-147/Embed this claimiframe + oEmbed
The card auto-updates when the claim's status, last-reviewed date, or correction log changes. Embedders never need to refresh — the card is rendered live from the canonical record.