The EU AI Act and agentic AI: what August 2026 actually requires
The 2 August 2026 enforcement deadline applies high-risk-system obligations to most enterprise agentic AI deployments operating in EU jurisdiction. The operational scope is broader than the Annex III categories suggest, and the compliance gap most enterprises face is structural. Building the evidence layer post-hoc is the failure mode.
The EU AI Act enforcement deadline of 2 August 2026 is roughly fourteen weeks away as of this writing. Most enterprise governance teams reading the Act have classified their agentic AI deployments against the Annex III high-risk categories, found that the deployment is not explicitly named, and concluded the operational scope does not reach them. That reading is incorrect for most deployments most of the time, and the cost of discovering this in October after a market-surveillance authority request is materially higher than the cost of discovering it now.
This piece is the operational translation of the Act for enterprise IT: what the Act actually requires from a typical agentic AI deployment, where most enterprises have evidence-production gaps, and how to close them in the fourteen weeks before enforcement begins. The law-firm boilerplate is competent for what it covers; what is missing, in the enterprise-IT register, is the mapping between the Articles and what an agentic deployment looks like in production.
Two propositions structure the piece:
- The operational scope is broader than the Annex III list suggests. The Act binds any deployment that makes, materially supports, or substantially influences decisions in an Annex III category, with extraterritorial reach via Article 2. The “materially supports or substantially influences” threshold is the one most enterprise governance teams misread, and it usually catches deployments that internal classification has marked out-of-scope.
- The compliance gap most enterprises face is structural, not technical. The Act requires evidence-of-action production (automated logs, quality-management records, oversight documentation, post-market monitoring) that most agentic deployments do not generate by default. Building the evidence layer post-hoc, after a regulator request, is the failure mode. The cost is six to twelve weeks of forensic engineering; the alternative is a finding of non-conformity carrying penalties up to €15 million or 3% of global turnover.
The remainder of the piece is a four-part walkthrough: what the Act actually says, what the operational gap looks like in production, how the obligations map onto the GAUGE governance dimensions Peter and Claude have published before, and the four-step preparation track for enterprise IT teams fourteen weeks out from enforcement.
What activates on 2 August 2026
The Act’s enforcement is phased, and the phasing matters because some obligations are already live and some are still ahead. The dates that bind enterprise IT (European Commission, AI Act regulatory framework; artificialintelligenceact.eu, Implementation timeline):
- 2 February 2025: Articles 1–5 became applicable. Prohibited AI practices became illegal. Most enterprise agentic deployments were never within these prohibited categories; the deadline passed without enterprise-IT action in most cases.
- 2 August 2025: Articles 53 and 55, the general-purpose AI model obligations, activated for foundation-model providers. Anthropic, OpenAI, Google, Microsoft, and others began compliance preparation against these articles. Most enterprises did not need to act because the obligation falls on providers, not deployers.
- 2 August 2026: Articles 6–49, the high-risk AI system obligations, activate for all deployments meeting the Annex III scope. This is the deadline that binds enterprise IT for the first time and requires substantive preparation.
- 2 August 2027: Article 6(1) provisions tied to product safety legislation activate. Affects fewer enterprise agentic AI deployments directly; matters more for AI embedded in regulated products.
The 2 August 2026 date is when the enforcement window opens for most enterprise agentic AI. Penalties carry teeth: up to €15 million or 3% of global annual turnover for non-compliance with operational requirements, up to €35 million or 7% for prohibited-practice violations, and up to €7.5 million or 1% for incorrect or misleading information supplied to authorities (artificialintelligenceact.eu, Article 99 penalty regime).
What “high-risk” actually means
Annex III names eight categories of high-risk AI systems (artificialintelligenceact.eu, Annex III):
- Biometric identification and categorisation of natural persons.
- Critical infrastructure management. Water, gas, electricity, road traffic, digital infrastructure.
- Education and vocational training. Admissions, evaluation of learning outcomes, monitoring during tests.
- Employment, worker management, and access to self-employment. Hiring algorithms, screening, performance evaluation, allocation of tasks.
- Access to essential private and public services and benefits. Credit scoring, eligibility for public benefits, emergency-response dispatching, life and health insurance pricing.
- Law enforcement. Risk assessment, polygraphs, evaluation of evidence reliability, profiling.
- Migration, asylum, and border control. Risk assessment, application processing, identity verification.
- Administration of justice and democratic processes. Assistance with research and interpretation of facts and law, election influence.
Most enterprise governance teams read this list, fail to find an explicit match for their HR copilot or customer-support agent or developer-productivity tool, and conclude the deployment is out of scope. The misreading is of Article 6(2): a system that falls within Annex III is high-risk regardless of whether the deployer is the same entity as the provider, and the threshold for falling within Annex III is broader than being a system explicitly named as one of the eight categories. It includes any system whose output materially supports or substantially influences a decision in those categories.
A concrete pattern: an HR-facing agentic AI that summarises candidate CVs and surfaces “top fit” recommendations is not classified by its vendor as a hiring algorithm. The vendor sells it as productivity tooling. In production, the recommendations are read by hiring managers who use them to triage which candidates progress to interview. The system materially supports a hiring decision. It is in scope under Annex III §4.
The same pattern recurs across functions. A customer-service agent that scores customer requests for priority routing materially supports a service-access decision (Annex III §5). A code-review agent that approves or rejects pull requests in a critical-infrastructure code base materially supports an infrastructure-management decision (Annex III §2). The materiality threshold is operational, not nominal.
Article 14 human oversight: what it actually requires
Article 14 is the obligation enterprise governance teams most often misread, because the phrase “human oversight” sounds like something an existing approval workflow already covers. The Article specifies six operational requirements that “human in the loop”, as commonly deployed, does not satisfy (artificialintelligenceact.eu, Article 14):
The natural persons assigned to oversight must be enabled to:
- Properly understand the relevant capacities and limitations of the system. Implies documented training and reference materials, not “the team has used it for six months.”
- Duly monitor operation, including in view of detecting and addressing anomalies, dysfunctions, and unexpected performance. Implies instrumented monitoring with detection thresholds, not opportunistic review.
- Remain aware of the possible tendency of automatically relying or over-relying on the output (automation bias). Implies trained awareness, ideally measured.
- Correctly interpret the system’s output. Implies the output is interpretable, with documentation of how interpretation should proceed in edge cases.
- Decide not to use the high-risk AI system or otherwise disregard, override, or reverse the output. Implies the override authority is documented, granted, and exercisable in practice, not just present in policy.
- Intervene in the operation of the high-risk AI system or interrupt the system through a “stop” button or a similar procedure. Implies an operational kill switch with documented response time.
A reviewer who scrolls through agent-generated outputs and accepts most of them is not Article-14-compliant oversight, even if the role is titled “AI Reviewer” in the org chart. The compliance gap is between what the role does in practice and what the Article requires it to be enabled to do. Closing the gap is mostly evidence work. The reviewer has the authority and the training; the documentation that they have them is missing.
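What the missing evidence could look like in practice, sketched in Python: one record per human review, capturing the override authority, the training reference, and the intervention timing the six requirements above imply. The `OversightRecord` fields, the action vocabulary, and the JSONL target are illustrative assumptions, not a schema the Act prescribes.

```python
import json
import time
from dataclasses import dataclass, asdict, field

# Illustrative sketch: one evidence record per human review of an agent output.
# Field names and action values are assumptions, not a schema mandated by the Act.
@dataclass
class OversightRecord:
    output_id: str                 # the agent output being reviewed
    reviewer: str                  # named natural person assigned to oversight
    training_reference: str        # pointer to the training/reference material used
    interpretation_notes: str      # how the output was interpreted, incl. edge cases
    action: str                    # "accepted" | "overridden" | "reversed" | "stopped"
    override_available: bool       # documented authority to disregard the output
    stop_invoked: bool             # whether the kill switch was exercised
    response_time_s: float         # time from flag to intervention, if any
    timestamp: float = field(default_factory=time.time)

def record_review(record: OversightRecord, path: str = "oversight_log.jsonl") -> None:
    """Append the review as one line of JSON; append-only keeps the trail auditable."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```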
For the special case of biometric identification under Annex III §1(a), Article 14(5) goes further: no action or decision based on the system’s identification can be taken unless that identification has been separately verified by at least two competent natural persons. The dual-verification requirement is not negotiable for in-scope biometric deployments.
The evidence-production gap
Articles 12 and 17 together specify the evidence layer most enterprise agentic deployments do not produce by default (artificialintelligenceact.eu, Article 12; Article 17).
Article 12, automated event logging. High-risk AI systems must technically allow for automatic recording of events (‘logs’) over the system’s lifetime. Logs must enable identification of situations that may result in the system presenting a risk or in a substantial modification, facilitate post-market monitoring, and enable monitoring of operation. The retention period is at least six months, or longer where other applicable Union or national law requires it.
Article 17, quality management system. Providers must put in place a quality management system that is documented in a systematic and orderly manner. The system must include a strategy for regulatory compliance, techniques for design and quality control, examination and verification, post-market monitoring, communication with national competent authorities, record-keeping, and resource management.
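One way to make “documented in a systematic and orderly manner” operational is a machine-readable index of those quality-management elements, each with a named owner and a review date, so the gaps show up as queries rather than as missing chapters in a binder. A minimal sketch; the element names paraphrase the Article, and the owners and dates are placeholders.

```python
from datetime import date

# Illustrative index of Article 17 quality-management elements for one deployment.
# Element names paraphrase the Article; owners and dates are placeholders.
qms_index = {
    "regulatory_compliance_strategy": {"owner": "governance-lead", "last_review": date(2026, 4, 20)},
    "design_and_quality_control":     {"owner": "engineering",     "last_review": date(2026, 4, 20)},
    "examination_and_verification":   {"owner": "qa",              "last_review": date(2026, 3, 30)},
    "post_market_monitoring":         {"owner": "sre",             "last_review": None},
    "authority_communication":        {"owner": "legal",           "last_review": None},
    "record_keeping":                 {"owner": "compliance",      "last_review": date(2026, 4, 1)},
    "resource_management":            {"owner": "it-ops",          "last_review": date(2026, 2, 15)},
}

# Elements with no review on file are the documentation gaps to close first.
gaps = [name for name, meta in qms_index.items() if meta["last_review"] is None]
print(gaps)  # ['post_market_monitoring', 'authority_communication']
```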
Most enterprise agentic AI deployments operate under one of three logging postures, none of which satisfy Article 12 by default:
- Operational debug logging. Logs exist for engineering debugging, covering the agent’s tool calls, model latency, error rates. The data is sufficient for a Slack post-mortem after an outage but not for a regulator-readable lifecycle record. Retention is typically 14–30 days.
- Vendor-side logging. The vendor (Anthropic, OpenAI, Microsoft) maintains logs at their layer. The deployer has access to API request logs but not to the agent’s reasoning, tool-use sequence, or output-decision logs in a regulator-readable form. Coverage is partial; retention is contractual.
- Compliance-shaped logging. Logs are configured for SOC 2 / ISO 27001 evidence, covering access events, configuration changes, data flows. The shape is right for compliance but the content is wrong; the logs do not record the agent’s per-decision behaviour.
What Article 12 actually requires is a fourth logging posture: per-action behavioural logging traceable to specific outputs, retained for at least six months, in a format a national competent authority can read. None of the first three postures produces this without explicit engineering work.
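A minimal sketch of that fourth posture, assuming each agent action emits one structured event tied to a specific output and stamped with a retention horizon. The function name, event-type vocabulary, and JSONL target are illustrative; a production deployment would route this to durable, access-controlled storage rather than a local file.

```python
import json
import uuid
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 183  # at least six months, per the record-keeping minimum described above

def log_agent_event(deployment_id: str, output_id: str, event_type: str,
                    detail: dict, path: str = "agent_behaviour_log.jsonl") -> str:
    """Append one per-action behavioural event in a regulator-readable form.

    event_type values like "tool_call", "model_response", "decision_emitted",
    "anomaly_flagged" are illustrative categories, not prescribed by the Act.
    """
    event = {
        "event_id": str(uuid.uuid4()),
        "deployment_id": deployment_id,
        "output_id": output_id,   # ties the event back to a specific output
        "event_type": event_type,
        "detail": detail,         # tool name, input hash, result summary, etc.
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "retain_until": (datetime.now(timezone.utc)
                         + timedelta(days=RETENTION_DAYS)).isoformat(),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
    return event["event_id"]
```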
Most enterprise teams first discover this gap in the week of a regulator request. The reconstruction effort, assembling lifecycle logs from operational debug plus vendor records plus compliance evidence, typically takes six to twelve weeks of forensic engineering. The cost compounds because the regulator’s clock does not stop while the reconstruction proceeds.
Mapping GAUGE to the EU AI Act Articles
The GAUGE framework, six instrumented governance dimensions, maps onto the EU AI Act obligations cleanly enough that scoring an agentic deployment on GAUGE produces the gap analysis Article 9 requires. The mapping (Peter’s GAUGE diagnostic at /gauge/ maintains the canonical version):
| GAUGE dimension | EU AI Act Article | What the dimension scores |
|---|---|---|
| Governance maturity | Article 9, Article 17 | Whether a documented risk-management system and quality-management system exist for the deployment, with named owners and review cadence |
| Threat model | Article 9, Article 15 | Whether risks to health, safety, and fundamental rights have been identified and addressed; whether cybersecurity baseline is documented |
| ROI evidence | Article 13, Article 17 | Whether transparency to deployers is meaningful (not just an EULA accept) and whether quality-management records exist |
| Change management | Article 14 | Whether human oversight architecture meets the six operational requirements above, not just nominal “human in the loop” framing |
| Vendor lock-in | Article 11, Article 13 | Whether technical documentation is sufficient for a regulator (typically not satisfied by vendor SaaS contracts alone) |
| Compliance posture | Article 12, Article 18, Article 73 | Whether automated logging, record-keeping, and serious-incident reporting are operational, with the integrated NIS2 + GDPR reporting overlap addressed |
A deployment scoring above 70 on GAUGE is materially closer to Article 9–17 compliance than one scoring below 50, regardless of whether the GAUGE scoring exercise was framed as compliance work. The discipline is the same.
The free GAUGE Excel diagnostic at /gauge/ runs the six-dimension scoring in 30–45 minutes for a single deployment. For enterprises facing the 2 August 2026 deadline, scoring the in-scope deployment portfolio with GAUGE is a defensible first compliance artifact. It identifies the lowest-scoring dimensions, which become the engineering plan for closing gaps before the enforcement window opens.
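The scoring mechanics, sketched: the dimension names and Article mappings follow the table above, and the 0–100 scale with a 70 threshold is an assumption consistent with the reading two paragraphs up. The canonical scoring lives in the /gauge/ diagnostic, not in this sketch.

```python
# Illustrative gap analysis over GAUGE-style scores; the canonical tool is the /gauge/ diagnostic.
ARTICLE_MAP = {
    "governance_maturity": ["Art. 9", "Art. 17"],
    "threat_model":        ["Art. 9", "Art. 15"],
    "roi_evidence":        ["Art. 13", "Art. 17"],
    "change_management":   ["Art. 14"],
    "vendor_lock_in":      ["Art. 11", "Art. 13"],
    "compliance_posture":  ["Art. 12", "Art. 18", "Art. 73"],
}

def gap_analysis(scores: dict[str, int], threshold: int = 70) -> list[tuple[str, list[str]]]:
    """Return dimensions scoring under the threshold, lowest first, with their Articles."""
    gaps = [(dim, ARTICLE_MAP[dim]) for dim, score in scores.items() if score < threshold]
    return sorted(gaps, key=lambda gap: scores[gap[0]])

# Example scores for one deployment (0-100 per dimension, assumed scale).
print(gap_analysis({
    "governance_maturity": 75, "threat_model": 55, "roi_evidence": 80,
    "change_management": 40, "vendor_lock_in": 65, "compliance_posture": 35,
}))
# Lowest first: compliance_posture, change_management, threat_model
```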
What to do Monday
Fourteen weeks remain before 2 August 2026. The realistic preparation track is four weeks of compliance-readiness work followed by ten weeks of remediation engineering on the deployments that need it. The first four weeks are governance work, not engineering.
Week 1, inventory and classify. Walk every active agentic AI deployment in the enterprise. For each, document: the function, the data flows, the decision-influence surface, and the affected-person jurisdiction. Apply the Annex III scope test honestly: “materially supports or substantially influences a decision in an Annex III category.” Most enterprises will find 30–50% more in-scope deployments than the initial inventory suggests, often in HR, customer service, and developer productivity functions where the agent’s output drives downstream human decisions.
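A sketch of the Week 1 record and the scope test applied to it, under the assumption that each deployment is catalogued with its decision-influence surface and affected-person jurisdiction. The area keys and field names are illustrative shorthand, not terms from the Act.

```python
from dataclasses import dataclass

# Annex III areas, abbreviated to illustrative shorthand keys.
ANNEX_III_AREAS = {
    "biometrics", "critical_infrastructure", "education", "employment",
    "essential_services", "law_enforcement", "migration_border", "justice_democracy",
}

@dataclass
class Deployment:
    name: str
    function: str
    decision_areas: set[str]      # Annex III areas the output feeds into, if any
    influences_decision: bool     # does the output materially support or influence it?
    affected_persons_in_eu: bool

def in_scope(d: Deployment) -> bool:
    """Honest scope test: decision influence in a listed area, for persons in EU jurisdiction."""
    return (d.affected_persons_in_eu
            and d.influences_decision
            and bool(d.decision_areas & ANNEX_III_AREAS))

cv_triage = Deployment("hr-cv-summariser", "HR productivity tooling",
                       decision_areas={"employment"}, influences_decision=True,
                       affected_persons_in_eu=True)
print(in_scope(cv_triage))  # True: the productivity-tool framing does not take it out of scope
```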
Week 2, score against Articles 9 through 17. For every in-scope deployment, score the six GAUGE dimensions with the surface owners in the room: governance lead, security, finance or business sponsor, the team using the agent, architecture, legal. The disagreements across functions surface the actual compliance gaps. Security usually scores threat model lower than the deployment team does, legal usually scores compliance posture lower than IT does. Capture the deltas; they are the work.
Week 3, triage. Rank the in-scope deployments by lowest-scored dimension. Article 14 (human oversight architecture) and Article 12 (automated logging) are the two most commonly missed in 2026. Article 9 (the risk-management system itself) is the meta-document that ties the others together. Most enterprises do not have a documented risk-management system specifically for the agentic deployment, only the broader enterprise risk framework, which lacks deployment-specific content. Triage assigns each deployment to a ready, gap-fix, or pause/redesign track, as sketched below.
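The triage step, sketched as a sorting rule over the Week 2 scores. The track thresholds are assumptions for illustration, not figures from the Act or the GAUGE diagnostic.

```python
def assign_track(scores: dict[str, int]) -> str:
    """Illustrative track assignment; the thresholds are assumptions, not Act figures."""
    worst = min(scores.values())
    if worst >= 70:
        return "ready"
    if worst >= 40:
        return "gap-fix"
    return "pause/redesign"

portfolio = {
    "hr-cv-summariser": {"change_management": 40, "compliance_posture": 35, "threat_model": 60},
    "support-router":   {"change_management": 75, "compliance_posture": 72, "threat_model": 80},
}

# Work the portfolio from the lowest-scoring deployment upward.
for name, scores in sorted(portfolio.items(), key=lambda item: min(item[1].values())):
    print(name, assign_track(scores), "weakest:", min(scores, key=scores.get))
```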
Week 4, build the integrated reporting template. For deployments on the ready and gap-fix tracks, the delivered artifact is an integrated incident-response template that satisfies Article 73 (EU AI Act serious-incident reporting), NIS2 (24-hour early warning, 72-hour formal notification), and GDPR Article 33 (72-hour breach notification) in one document per deployment. Pair it with MTTD-for-Agents detection-time targets. The framework’s 4-hour enterprise / 24-hour mid-market thresholds map onto Article 12 logging requirements and the NIS2 24-hour early-warning obligation simultaneously.
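The template's deadline arithmetic can be pre-computed from a single detection timestamp, so each per-deployment document carries its own clocks. A sketch assuming the NIS2 and GDPR windows described above; the Article 73 window varies by incident classification, so it stays a placeholder here.

```python
from datetime import datetime, timedelta, timezone

def notification_deadlines(detected_at: datetime) -> dict[str, str]:
    """Deadlines the integrated template tracks from one detection timestamp.

    Durations reflect the obligations named in the text: NIS2 early warning (24 h),
    NIS2 formal notification and GDPR Art. 33 breach notification (72 h).
    Article 73 serious-incident windows vary by incident type, so the template
    carries a placeholder to be filled per incident classification.
    """
    return {
        "nis2_early_warning":      (detected_at + timedelta(hours=24)).isoformat(),
        "nis2_formal_notice":      (detected_at + timedelta(hours=72)).isoformat(),
        "gdpr_art33_notification": (detected_at + timedelta(hours=72)).isoformat(),
        "ai_act_art73_report":     "per incident classification",  # placeholder
    }

print(notification_deadlines(datetime(2026, 8, 10, 9, 0, tzinfo=timezone.utc)))
```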
The remaining ten weeks (mid-May through end-July 2026) are deployment-specific gap-fix engineering on the lowest-scoring dimensions. Most enterprises with disciplined first-four-weeks governance work close the remaining gaps inside the ten-week engineering window. Most enterprises that defer the governance work to June run into the deadline.
The Holding-up note
The primary claim of this piece (that the August 2026 enforcement reaches a broader operational scope than typical Annex III readings suggest, and that the compliance gap most enterprises face is structural evidence-production rather than technical capability) is logged at AM-035 on the Holding-up ledger on a 60-day review cadence. Three kinds of evidence would move the verdict:
- Commission delegated acts that further define Annex III categories or add new high-risk categories. The Commission has signalled iterative refinement; a delegated act narrowing the “materially supports” threshold would weaken the broader-scope reading. A delegated act extending Annex III in any direction would strengthen it.
- First published EU enforcement actions against agentic AI deployments after 2 August 2026. The early enforcement pattern will reveal whether market-surveillance authorities prioritise broad scope or narrow technical compliance. Both outcomes are possible.
- Member-State implementations that diverge on enforcement intensity. The Act’s penalty maxima are EU-wide; the application is national. Differences in how member states interpret “materially supports” will show up in the first batch of actions and shape compliance posture across the EU thereafter.
The next review of this claim is scheduled 24 June 2026. The August 2026 enforcement window opens within five weeks of the next review; revisions to the claim will follow that window’s first enforcement actions.