Enterprise AI infrastructure vendors: the 2026 SLA and uptime comparison matrix
The agentic AI architecture piece on SLA design is the customer-side specification; the SLAs the major infrastructure vendors actually post are the supply-side reality. The 2026 buying-committee SLA comparison resolves on five dimensions (uptime commitment, latency commitment, support response tier, credit calculation, and exclusions list) and reveals the structural gap most agentic AI buying committees discover at year-two renewal: the headline 99.9% uptime is calculated against a denominator and an exclusions list that materially shifts the customer's effective availability.
Holding · reviewed 27 May 2026 · next +59d
The customer-side question “what does the SLA architecture for an agentic AI workflow look like” is treated in the publication’s SLA architecture piece, which has accumulated 48 Microsoft Copilot grounding citations against agentmodeai content in the three-month window ending 25 May 2026. The supply-side question (what do the major infrastructure vendors actually commit to in their published SLAs) is structurally different, and it is the conversation the buying committee that has read the architecture piece arrives at next.
This piece is the supply-side companion. It walks the five major enterprise AI infrastructure vendors (AWS Bedrock, Microsoft Azure OpenAI Service, Google Vertex AI, OpenAI Enterprise, Anthropic Enterprise) against five comparison dimensions (uptime commitment, latency commitment, support response tier, credit calculation, exclusions list scope) and surfaces the structural gap most agentic AI buying committees discover at year-two renewal: the headline 99.9% uptime is calculated against a denominator and an exclusions list that materially shifts the customer’s effective availability.
Why the headline number undersells the buying-committee question
Three structural reasons make the 99.9% comparison number misleading on its own.
The denominator is rarely the customer’s actual usage minutes. Most vendor SLAs are calculated as a percentage of calendar minutes per month, with the exclusions reducing the calendar-minute denominator before the uptime percentage is computed. A customer whose workload uses the service heavily during peak windows and sparsely off-peak is comparing against a headline number that weights every calendar minute equally, so an outage during a peak window costs that customer a larger share of its requests than the percentage suggests.
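The denominator effect can be sketched with a small calculation. The demand profile, the 40-minute outage, and both function names below are hypothetical, chosen only to show how the same incident reads against calendar minutes versus against the customer's own request volume.

```python
MINUTES_IN_MONTH = 30 * 24 * 60  # 43,200 calendar minutes in a 30-day month


def calendar_availability(outage_minutes: float) -> float:
    """Availability as a share of calendar minutes."""
    return 1 - outage_minutes / MINUTES_IN_MONTH


def usage_weighted_availability(
    requests_per_hour: list[float], outage_minutes_by_hour: dict[int, float]
) -> float:
    """Availability as a share of the customer's own requests.

    Assumes requests are spread evenly within each hour.
    """
    total = sum(requests_per_hour)
    lost = sum(
        requests_per_hour[hour] * minutes / 60
        for hour, minutes in outage_minutes_by_hour.items()
    )
    return 1 - lost / total


# Hypothetical profile: 8 peak hours a day at 1,000 requests/hour, 50 otherwise.
daily_profile = [50.0] * 8 + [1000.0] * 8 + [50.0] * 8
month_profile = daily_profile * 30

# One assumed 40-minute outage in a peak hour (hour index 10 of the first day).
outage = {10: 40.0}

calendar = calendar_availability(40.0)
weighted = usage_weighted_availability(month_profile, outage)
print(f"calendar-minute view: {calendar:.4%}")
print(f"usage-weighted view:  {weighted:.4%}")
```

Under these assumed numbers the calendar-minute figure stays above 99.9% while the usage-weighted figure falls below it, which is the gap a peak-heavy workload should model before comparing headlines.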
The exclusions list is heterogeneous across vendors. Scheduled maintenance (with 48-hour notice at the hyperscaler tier; typically a smaller window for the model-vendor tier) is the largest exclusion category. Force-majeure events and customer-caused outages are universally excluded. The categories that vary across vendors are content-policy-enforcement actions (Azure OpenAI in particular treats content-filter behaviour as SLA-compliant in a way the buying committee should read explicitly), capacity constraints during peak periods (Azure OpenAI’s Provisioned Throughput Units carry separate commitments), and partial-availability events (where some functions are degraded but the headline service is technically up).
Credit calculations differ widely in practical value. Credit applied against the monthly service charge for only the affected region is worth materially less than credit applied across the vendor relationship; credit capped at a percentage is worth materially less than credit that is uncapped or paid into a separate pool. The 2026 vendor pattern at the hyperscaler tier is regional capped credits at 10/25/100% tiers; the 2026 model-vendor pattern at the Enterprise tier is per-customer negotiated.
AWS Bedrock SLA, the publicly disclosed structure
The AWS Bedrock Service Level Agreement commits to 99.9% monthly uptime percentage for the Bedrock invocation API. The credit-tier structure is 10% at uptime between 99.0% and 99.9%, 25% between 95.0% and 99.0%, and 100% below 95.0%. The credit is applied against the monthly service charge for the affected Region, not the entire AWS bill.
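As a rough illustration of what the regional scoping means in money terms, the sketch below encodes the credit tiers exactly as quoted in this section. The function names, the treatment of the tier boundaries (lower bound inclusive), and the billing figures are assumptions for the example; the vendor's current SLA document governs the real calculation.

```python
def service_credit_pct(uptime_pct: float) -> int:
    """Credit percentage for a month, using the tiers quoted in this section.

    Boundary handling (>= on each lower bound) is an assumption.
    """
    if uptime_pct >= 99.9:
        return 0
    if uptime_pct >= 99.0:
        return 10
    if uptime_pct >= 95.0:
        return 25
    return 100


def regional_credit(uptime_pct: float, affected_region_charge: float) -> float:
    """Credit in currency units, applied only to the affected Region's charge."""
    return affected_region_charge * service_credit_pct(uptime_pct) / 100


# Hypothetical month: 100,000 total bill, of which 20,000 is the affected Region.
total_bill = 100_000.0
region_charge = 20_000.0
credit = regional_credit(99.5, region_charge)

print(f"credit: {credit:,.0f}")
print(f"share of total bill: {credit / total_bill:.1%}")
```

In this assumed case a 10% tier credit works out to 2% of the customer's total bill, because the credit base is the affected Region rather than the account.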
The exclusions cover scheduled maintenance (announced 48 hours or more in advance), force majeure, customer-side issues (incorrect IAM configuration, exceeded service quotas, customer-caused throttling), and third-party-source outages. The AgentCore-specific SLA inherits the Bedrock SLA at the GA tier; model-specific commitments vary by region for the third-party models hosted on Bedrock (Anthropic, Meta, AI21, Stability AI, and the others). The buying committee should read the per-region availability commitment separately from the headline; a 99.9% commitment at the platform level can mask a materially lower availability for a specific model in a specific region.
The latency commitment is not in the public SLA. The buying committee that needs latency commitments for the agent’s tool-use round-trips must negotiate those into the MSA separately; the hyperscaler standard at the Enterprise tier is P95 latency targets per model per region.
Azure OpenAI Service SLA
Azure OpenAI Service inherits the Azure platform Service Level Agreement at 99.9% monthly uptime for the API service, with standard Azure credit tiers (10% at uptime below 99.9%, 25% below 99%, 100% below 95%). Provisioned Throughput Units (PTUs) carry a separate availability commitment that depends on the PTU tier and the region.
The exclusions include scheduled maintenance, capacity constraints during peak periods, customer-quota-exceeded events, and content-policy-enforcement actions. The content-policy behaviour is the row most often surprising at year-two renewal; the Azure content filters can block specific calls in ways the buying committee may perceive as service degradation, but which Azure treats as SLA-compliant by exclusion. The buying committee with use cases that span content-policy boundaries (legal-research agents, medical-information agents, regulated-content agents) should price this explicitly.
The dependency graph from the AM-175 platform comparison carries into the SLA conversation. An Azure OpenAI workload’s effective availability is bounded by the underlying Azure platform availability, the Entra identity-layer availability, and (for production workloads) the Azure Monitor observability-layer availability. The customer should aggregate these into a composite availability target rather than treating the OpenAI API SLA in isolation.
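One way to aggregate the layers is to multiply their availabilities, which holds when every layer must be up for the workload to function and failures are independent. Both conditions are simplifying assumptions, and the per-layer figures below are placeholders rather than published commitments.

```python
import math


def composite_availability(layer_availabilities: list[float]) -> float:
    """Availability of a chain of serial dependencies.

    Assumes each layer is required and that failures are independent.
    """
    return math.prod(layer_availabilities)


# Placeholder figures for illustration only; substitute the documented SLAs.
layers = {
    "model_api": 0.999,
    "cloud_platform": 0.9999,
    "identity": 0.9999,
    "observability": 0.999,
}

composite = composite_availability(list(layers.values()))
weakest = min(layers.values())

print(f"weakest single layer: {weakest:.4%}")
print(f"composite:            {composite:.4%}")
```

The composite is always at or below the weakest layer, so a target built from the API SLA alone will overstate what the workload can expect.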
Google Vertex AI SLA
Google Vertex AI posts SLAs that vary by generative AI model and by region. The typical commitment is 99.5% to 99.9% monthly uptime with the standard Google Cloud SLA credit structure (10/25/50% credit tiers depending on the breach severity).
The model-specific commitment matters because Gemini variants have different per-region availability. Gemini 1.5 Pro, Gemini 1.5 Flash, Gemini 2.0 Flash, Gemini 2.5 Pro, and Imagen have separate availability commitments that depend on the region the customer deploys in. The buying committee should price the model-and-region pair, not the platform headline; a customer running Gemini 2.5 Pro in a region where the SLA is 99.5% is on a different commitment than a customer running Gemini 1.5 Flash in a region where the SLA is 99.9%.
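The difference between the two commitments is easier to price once it is converted into a monthly downtime allowance. The helper below is a minimal sketch that assumes a 30-day month and a calendar-minute denominator; the function name is illustrative.

```python
def monthly_downtime_budget_minutes(sla_pct: float, days_in_month: int = 30) -> float:
    """Minutes of downtime per month that still satisfy the stated SLA."""
    total_minutes = days_in_month * 24 * 60
    return (1 - sla_pct / 100) * total_minutes


for sla in (99.5, 99.9):
    budget = monthly_downtime_budget_minutes(sla)
    print(f"{sla}% allows about {budget:.1f} minutes of downtime per month")
```

On these assumptions a 99.5% commitment tolerates five times the downtime of a 99.9% commitment, which is why the model-and-region pair is the unit to compare.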
The exclusions follow the Google Cloud standard pattern: scheduled maintenance, force majeure, customer-side issues, third-party causes. The Vertex AI-specific exclusions add the “model is being updated” category (during model version transitions, the SLA can be paused), which the buying committee planning long-running production workloads should price.
OpenAI Enterprise SLA
OpenAI’s Enterprise tier carries a documented 99.9% to 99.99% uptime target in the customer’s MSA addendum. The OpenAI Status page at status.openai.com is the canonical historical record; the public-tier and Pro-tier services do not carry a contractual SLA.
The Enterprise SLA credit mechanism is delivered per-customer rather than via a published tier structure. The buying committee that has the procurement leverage to negotiate the Enterprise-tier MSA can typically get credit calculations comparable to the hyperscaler standard; the buying committee at smaller scale is structurally weaker here because the negotiation leverage that drives the credit terms is concentrated at the Fortune-500-and-above procurement tier.
The dependency graph for an OpenAI Enterprise workload runs through the OpenAI API surface plus the customer’s chosen identity-federation layer (typically Microsoft Entra or Okta) and the customer’s observability tooling. The composite availability the customer experiences can be no better than the weakest of these, and because the layers sit in series it is usually somewhat lower; the SLA conversation should price the composite, not just the OpenAI commitment.
Anthropic Enterprise SLA
Anthropic does not publish a uniform public SLA. Enterprise-tier customers receive per-customer commitments in the MSA; the Anthropic Status page at status.anthropic.com is the historical record.
The Enterprise commitments observable in public materials and procurement-team interactions in 2025-2026 cluster around 99.9% uptime targets with negotiated credit mechanisms, but the lack of a public uniform tier means the buying committee should treat each procurement as a fresh negotiation rather than as a tier selection. The dependency-graph framing matters here too; an Anthropic Enterprise workload’s composite availability depends on whether the customer runs Claude via the Anthropic API directly, via Amazon Bedrock, via Google Vertex AI, or via Microsoft Azure (Anthropic is available across multiple hyperscalers in 2026). Each deployment topology has a different composite SLA shape.
The five comparison dimensions, walked
The buying-committee output is a per-vendor matrix scored against the five dimensions. The table below is the 2026 reference shape; the customer fills in the specific numbers from the vendor’s current public documentation and the customer’s negotiated MSA additions.
| Dimension | What to compare |
|---|---|
| Uptime commitment + denominator | Headline percentage; calendar-minute or usage-minute denominator; per-region vs platform-level scope; per-model commitment if applicable |
| Latency commitment | P95 and P99 latency targets the vendor commits to in the MSA; public SLAs typically do not include this; require it in the customer’s MSA addendum |
| Support response tier | Response-time commitment per severity level (P1 acknowledgement under 15 minutes is the enterprise default); the response-content commitment (acknowledgement vs initial diagnosis vs resolution timeline) |
| Credit calculation | What the credit applies against (affected-region monthly charge, cross-region, entire relationship); the cap on credit amount; whether credit is automatically applied or customer must claim |
| Exclusions list scope | Categories of events the vendor treats as SLA-compliant: scheduled maintenance window length, content-policy actions, capacity constraints, partial-availability events, third-party-source outages |
The 2026 buying-committee discipline is to populate this matrix at vendor short-list, not at contract negotiation. The MSA addendum work is materially easier when the customer arrives at the negotiation with the matrix in hand than when the matrix is discovered through year-one operational experience.
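A short-list matrix can be kept as structured data so that unfilled cells are visible before negotiation starts. The class and field names below are hypothetical, mapped one-to-one onto the five dimensions in the table, and the sample entry is a placeholder rather than any vendor's actual terms.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SlaMatrixRow:
    """One vendor's row in the five-dimension comparison matrix."""

    vendor: str
    uptime_commitment: Optional[str] = None
    latency_commitment: Optional[str] = None
    support_response_tier: Optional[str] = None
    credit_calculation: Optional[str] = None
    exclusions_scope: Optional[str] = None


def open_items(row: SlaMatrixRow) -> list[str]:
    """Names of the dimensions that have not been filled in yet."""
    return [
        f.name
        for f in fields(row)
        if f.name != "vendor" and getattr(row, f.name) is None
    ]


# Placeholder entry for illustration only.
row = SlaMatrixRow(
    vendor="Vendor A",
    uptime_commitment="99.9% monthly, calendar minutes, per region",
)
print(f"{row.vendor}: still to document -> {open_items(row)}")
```

Keeping the open items explicit makes it harder for a dimension such as latency, which public SLAs typically omit, to slip through to contract negotiation unaddressed.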
What this means for the Q3 2026 SLA-aware procurement agenda
Three workstreams are operationally tractable within the procurement cycle.
The first is the composite-availability calculation. The customer aggregates the vendor SLA against the upstream dependencies (identity, observability, model, region) to produce the composite availability the customer’s workload actually experiences. The customer’s existing observability tooling can populate this from historical data if the customer is already running the workload; a customer still at procurement time uses the vendor’s public availability data plus the upstream dependency SLAs to model the composite.
The second is the latency commitment negotiation. The P95 and P99 latency targets the vendor will commit to in the MSA addendum are materially more useful for an agentic AI workload than the headline uptime number. The buying committee should require P95 and P99 commitments per model and per region; the procurement counsel should write these as binding obligations with a separate credit mechanism.
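To check observed latency against negotiated targets, the customer needs an agreed percentile method. The sketch below uses the nearest-rank definition, which is one common choice and an assumption here; the MSA addendum should name the method, the measurement window, and the measurement point. The sample values and targets are invented for the example.

```python
import math


def percentile_nearest_rank(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_breaches(
    samples_ms: list[float], targets_ms: dict[int, float]
) -> dict[int, bool]:
    """Map each percentile to True when the observed value exceeds its target."""
    return {
        p: percentile_nearest_rank(samples_ms, p) > target
        for p, target in targets_ms.items()
    }


# Hypothetical window: 97 fast round-trips and three slow ones, in milliseconds.
samples_ms = [120.0] * 97 + [900.0, 950.0, 2000.0]
targets_ms = {95: 500.0, 99: 800.0}  # assumed P95 and P99 targets

breaches = latency_breaches(samples_ms, targets_ms)
print(breaches)
```

In this invented window the P95 target is met while the P99 target is breached, the kind of tail behaviour that matters for multi-step tool-use round-trips and that a headline uptime figure does not capture.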
The third is the exclusion-scope review. The customer’s compliance, legal, and operations functions read the SLA’s exclusions list against the customer’s specific workload pattern and identify the rows where the customer’s perceived availability could be degraded while the vendor remains SLA-compliant. The output is the SLA-addendum redline that closes those rows or prices them explicitly.
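The exclusion review can be made concrete by replaying an incident log twice: once as the SLA would count it, and once as the workload experienced it. The incident categories, durations, and names below are hypothetical, incidents are assumed not to overlap, and excluded minutes are removed from the denominator in line with the calculation pattern described earlier in this piece.

```python
from dataclasses import dataclass

MINUTES_IN_MONTH = 30 * 24 * 60  # 30-day month


@dataclass(frozen=True)
class Incident:
    category: str
    minutes: float


def availability_views(
    incidents: list[Incident], excluded_categories: set[str]
) -> tuple[float, float]:
    """Return (sla_measured, customer_perceived) availability for one month."""
    all_down = sum(i.minutes for i in incidents)
    excluded_down = sum(
        i.minutes for i in incidents if i.category in excluded_categories
    )
    counted_down = all_down - excluded_down
    # Excluded minutes drop out of both the numerator and the denominator.
    sla_measured = 1 - counted_down / (MINUTES_IN_MONTH - excluded_down)
    customer_perceived = 1 - all_down / MINUTES_IN_MONTH
    return sla_measured, customer_perceived


# Hypothetical month of incidents.
incidents = [
    Incident("unplanned_outage", 20),
    Incident("scheduled_maintenance", 120),
    Incident("content_policy_block", 45),
    Incident("capacity_constraint", 60),
]
excluded = {"scheduled_maintenance", "content_policy_block", "capacity_constraint"}

sla_view, perceived_view = availability_views(incidents, excluded)
print(f"SLA-measured:       {sla_view:.4%}")
print(f"customer-perceived: {perceived_view:.4%}")
```

With these assumed incidents the vendor remains above a 99.9% commitment while the workload's experienced availability is well below it; each excluded category contributing to that gap is a candidate row for the addendum redline.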
The sibling AM-174 security-platform TCO/ROI piece covers the cost-side calculation that the SLA conversation feeds into. The AM-167 NHI procurement clause work covers the contract-side instruments. The customer-side SLA architecture piece covers the design pattern this supply-side matrix is the companion to. Together the four describe the SLA conversation the 2026 buying committee needs to have at procurement, not at year-two renewal.