Method: every claim tracked, reviewed every 30–90 days, marked Holding, Partial, or Not holding. Drafted by Claude; signed off by Peter.
OPS-083 · pub 29 May 2026 · rev 29 May 2026 · read 3 min · in Operators

AI Got Cheaper. Your AI Bill Is About to Go Up.

Two things are true at once. The price of raw AI inference is falling fast, with DeepSeek's latest models making a roughly 75% discount permanent. At the same time, the AI bills small businesses actually pay are climbing, because the cost is moving from the model to the layer where you run it. A billing change Anthropic has set for 15 Jun 2026 is the next trap. If you run AI inside automations, re-model your stack before the cutover.

Holding · reviewed 29 May 2026 · next +29d

Two things are true at once right now, and the gap between them is where small-business AI budgets get wrecked. The price of raw AI inference is falling fast: DeepSeek made a roughly 75% discount permanent on its latest models in late May, putting input as low as around $0.14 per million tokens on the cheaper tier. At the same time, the AI bills small businesses actually pay are climbing, because the cost is moving from the model to the layer where you run it. The clearest example has a date on it: Anthropic has announced that on 15 Jun 2026 it will split its Claude subscriptions so that automation and SDK usage no longer come out of the same pool as chat. Automation gets separate monthly agent credits, with API rates beyond them.

If you only chat with AI, it is getting cheaper. If you run AI inside automations, your bill is about to climb. Most small businesses are quietly in the second group and budgeting like the first.

The split-screen

The cheaper-model headlines are real. Per-token prices have fallen hard across providers, and the cheapest capable models now cost a small fraction of premium-model pricing (DeepSeek pricing). For anyone running AI at volume (support replies, content drafts, categorization), that genuinely resets the math, and caching repeated prompts pushes it lower still.

But the per-token price is not where your spend lives once AI is wired into your operations. It lives in the automation layer: the SDK calls and agent runs that hit the model many times per task, metered by the tool you built the automation in. That is the number that is moving the wrong way.
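To see why calls per task matter more than the headline price, here is a minimal sketch of the arithmetic. Only the $0.14 per million input tokens comes from the DeepSeek figure above; the output price, token counts, and task volumes are placeholder assumptions to swap for your own numbers.

```python
def monthly_inference_cost(tasks_per_month, calls_per_task,
                           input_tokens_per_call, output_tokens_per_call,
                           input_price_per_mtok, output_price_per_mtok):
    """Estimate monthly model spend in dollars for one automation."""
    calls = tasks_per_month * calls_per_task
    input_cost = calls * input_tokens_per_call / 1_000_000 * input_price_per_mtok
    output_cost = calls * output_tokens_per_call / 1_000_000 * output_price_per_mtok
    return input_cost + output_cost


# Hypothetical workload: 2,000 tasks a month, 3,000 input and 500 output tokens per call.
# Input price is the article's figure; the output price is an assumed placeholder.
single_call = monthly_inference_cost(2_000, 1, 3_000, 500, 0.14, 0.28)
agent_style = monthly_inference_cost(2_000, 12, 3_000, 500, 0.14, 0.28)

print(f"one call per task:  ${single_call:.2f}")
print(f"12 calls per task:  ${agent_style:.2f}")
```

The per-token price is identical in both runs. The only thing that changed is how many times the automation hits the model per task, and the spend scales with it.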

Where your cost actually moved

Three changes in this window all push the same direction. Anthropic’s 15 Jun split carves programmatic usage out of the flat subscription into its own metered pool. OpenAI set credit-based pricing for Workspace Agents, the custom-GPT successor, with the paid transition first slated for early May and since pushed to 6 Jul 2026, so preview workflows meter once it lands. And credit-based automation platforms like Make charge a native AI step several credits rather than the one credit a plain action costs, so a scenario that got smarter by adding AI quietly burns through its allowance faster (Make pricing).
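The credit effect on a platform like Make can be sketched the same way. The monthly allowance and the five-credit cost of an AI step below are hypothetical placeholders, not Make's published numbers; check the current pricing page for the real ones.

```python
def credits_per_run(plain_steps, ai_steps, credits_per_ai_step,
                    credits_per_plain_step=1):
    """Credits one scenario run consumes, given its step mix."""
    return plain_steps * credits_per_plain_step + ai_steps * credits_per_ai_step


def runs_per_allowance(monthly_credits, per_run):
    """Whole runs that fit inside a monthly credit allowance."""
    return monthly_credits // per_run


MONTHLY_CREDITS = 10_000      # assumed allowance
CREDITS_PER_AI_STEP = 5       # assumed; the article only says "several"

before = credits_per_run(plain_steps=5, ai_steps=0,
                         credits_per_ai_step=CREDITS_PER_AI_STEP)
after = credits_per_run(plain_steps=5, ai_steps=2,
                        credits_per_ai_step=CREDITS_PER_AI_STEP)

print("runs per month before AI steps:", runs_per_allowance(MONTHLY_CREDITS, before))
print("runs per month after AI steps: ", runs_per_allowance(MONTHLY_CREDITS, after))
```

Under these assumptions, adding two AI steps to a five-step scenario triples the credits per run, so the same allowance covers a third as many runs.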

None of these is a price gouge. They are the same correction: providers separating the cheap thing you do occasionally (chat) from the expensive thing you do constantly (automation), and pricing the second one for what it costs them to serve.

The 15 June trap

The Anthropic change is the one with a deadline, so it is the one to act on first. A small business running Claude inside an n8n or Zapier automation on a flat Max plan has been drawing automation usage from the same pool as its chat. After 15 Jun 2026 that usage moves to a separate agent-credit allowance with API rates beyond it. If your automations are busy, the bill steps up. The change actually re-enables third-party Agent SDK usage on subscription plans, which an earlier update had pulled, but it meters that usage. Either way, your automation usage is now counted separately from your chat.

Confirm the specific credit amounts and your own plan against Anthropic’s current pricing before you commit to any number; pricing pages move, and the load-bearing facts here are the change and its date, not a particular dollar figure. The point is to know, before the cutover, which column your usage is in.
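Once you have the confirmed figures, the shape of the new bill can be modeled in a few lines. This sketch assumes the simplest reading of the change: automation usage is valued at API rates, the included agent credit absorbs the first part, and anything beyond it is billed on top of the subscription. The fee, credit, and usage values are placeholders, not Anthropic's numbers.

```python
def split_pool_bill(subscription_fee, agent_usage_value, included_agent_credit):
    """Monthly total when automation usage is metered against its own credit.

    Assumes usage beyond the included credit is billed at API rates
    and that chat stays covered by the flat subscription fee.
    """
    overage = max(0.0, agent_usage_value - included_agent_credit)
    return subscription_fee + overage


# All three inputs are hypothetical; replace them with your plan's real figures.
quiet_month = split_pool_bill(subscription_fee=100.0,
                              agent_usage_value=30.0,
                              included_agent_credit=50.0)
busy_month = split_pool_bill(subscription_fee=100.0,
                             agent_usage_value=180.0,
                             included_agent_credit=50.0)

print(f"light automation: ${quiet_month:.2f}")
print(f"busy automation:  ${busy_month:.2f}")
```

A light month stays at the flat fee; a busy month pays the fee plus everything above the credit. Which side of that line you fall on is the thing to know before the cutover.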

This week

Run the thirty-minute routine in the box above: split your AI use into chat and automation, find what becomes metered on 15 Jun, estimate your monthly automation volume, and decide whether to route the high-volume work to a cheaper model, cap it, or absorb it. Route customer data deliberately, because the cheapest provider is not always the one you want holding it. Do this before the cutover and the June bill is a confirmation of a decision you made. Skip it and the bill makes the decision for you.
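The routine can be roughed out as a small triage script. The workload names, cost estimates, and the absorb threshold are all hypothetical, and the decision rule is one possible reading of "route, cap, or absorb", not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    kind: str                # "chat" or "automation"
    est_monthly_cost: float  # your own estimate in dollars at API rates
    customer_data: bool      # does it touch customer data?


def triage(workloads, absorb_below):
    """Return {name: decision} for automation workloads only."""
    decisions = {}
    for w in workloads:
        if w.kind != "automation":
            continue  # chat is not part of the newly metered pool
        if w.est_monthly_cost < absorb_below:
            decisions[w.name] = "absorb"
        elif w.customer_data:
            # Do not move customer data to the cheapest provider by default.
            decisions[w.name] = "cap, review provider before routing"
        else:
            decisions[w.name] = "route to cheaper model"
    return decisions


# Hypothetical inventory for illustration.
inventory = [
    Workload("ad-hoc drafting", "chat", 0.0, False),
    Workload("invoice tagging", "automation", 8.0, False),
    Workload("content drafts", "automation", 120.0, False),
    Workload("support replies", "automation", 200.0, True),
]

for name, decision in triage(inventory, absorb_below=20.0).items():
    print(f"{name}: {decision}")
```

The threshold is a judgment call. What the exercise buys you is that each automation gets a decision on paper before the first metered bill arrives.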

