If models are getting cheaper, why is my bill going up?

Because the cheaper number is the price per token of the model, and that is not where most small-business AI spend now sits. The spend sits in the automation layer: the SDK calls, agent runs, and credit-metered workflow steps that call the model many times per task. As providers separate that programmatic usage into its own metered pool, the effective price you pay rises even as the per-token model price falls. If you only chat with AI you feel the cheaper price; if you run AI inside automations you feel the rising bill.

What exactly changes on 15 Jun 2026?

Anthropic has announced it will split Claude subscriptions so that automation and SDK usage no longer draw from the same pool as chat, introducing separate monthly agent credits with API rates beyond them (https://support.claude.com/en/articles/15036540-use-the-claude-agent-sdk-with-your-claude-plan). The practical effect for a small business running Claude inside an n8n, Make, or Zapier automation is that usage which felt included on a flat subscription becomes metered. Confirm the exact credit amounts and your own plan against Anthropic's current pricing before you rely on any specific number; the change and its date are the load-bearing facts here.

Should I just switch to the cheapest model, like DeepSeek?

Cheaper models genuinely reset the unit economics for high-volume, lower-stakes work like categorization, drafting, and support triage, and for caching repeated prompts. But route customer or business data deliberately: a much cheaper provider can carry data-residency and trust trade-offs (DeepSeek is a Chinese provider), so match the model to the sensitivity of the task rather than to the price alone. The honest split is cheap model for high-volume internal work, trusted model for anything touching customer data or your brand.

How does this article track its own claim?

Claim OPS-083 in the Holding-up ledger (/holding/?claim=OPS-083), 30-day review on 28 Jun 2026, deliberately just after the 15 Jun cutover so the claim can be checked against what actually changed. Trigger conditions: (1) Anthropic alters or delays the 15 Jun billing change; (2) a major automation platform changes how it meters AI steps; (3) a model price move large enough to reset small-business unit economics again. The cost-discipline fundamentals are in the bootstrapped-SaaS piece (/operators/ai-cost-discipline-bootstrapped-saas/).

Why your AI bill is rising as models get cheaper

AI Got Cheaper. Your AI Bill Is About to Go Up.

Two things are true at once. The price of raw AI inference is falling fast, with DeepSeek's latest models making a roughly 75% discount permanent. At the same time, the AI bills small businesses actually pay are climbing, because the cost is moving from the model to the layer where you run it. A billing change Anthropic has set for 15 Jun 2026 is the next trap. If you run AI inside automations, re-model your stack before the cutover.

Written by ClaudeCurated and signed by Peter29 May 20263 min read

How this is written

Holding·reviewed29 May 2026·next+10d

Two things are true at once right now, and the gap between them is where small-business AI budgets get wrecked. The price of raw AI inference is falling fast: DeepSeek made a roughly 75% discount permanent on its latest models in late May, putting input as low as around $0.14 per million tokens on the cheaper tier. At the same time, the AI bills small businesses actually pay are climbing, because the cost is moving from the model to the layer where you run it. The clearest example has a date on it: on 15 Jun 2026 Anthropic has announced it will split its Claude subscriptions so that automation and SDK usage no longer come out of the same pool as chat, with separate monthly agent credits and API rates beyond them.

If you only chat with AI, it is getting cheaper. If you run AI inside automations, your bill is about to climb. Most small businesses are quietly in the second group and budgeting like the first.

The split-screen

The cheaper-model headlines are real. Per-token prices have fallen hard across providers, and the cheapest capable models now cost a small fraction of premium-model pricing (DeepSeek pricing). For anyone running AI on volume, support replies, content drafts, categorization, that genuinely resets the math, and caching repeated prompts pushes it lower still.

But the per-token price is not where your spend lives once AI is wired into your operations. It lives in the automation layer: the SDK calls and agent runs that hit the model many times per task, metered by the tool you built the automation in. That is the number that is moving the wrong way.

Where your cost actually moved

Three changes in this window all push the same direction. Anthropic’s 15 Jun split carves programmatic usage out of the flat subscription into its own metered pool. OpenAI set credit-based pricing for Workspace Agents, the custom-GPT successor, with the paid transition first slated for early May and since pushed to 6 Jul 2026, so preview workflows meter once it lands. And credit-based automation platforms like Make charge a native AI step several credits rather than the one credit a plain action costs, so a scenario that got smarter by adding AI quietly burns through its allowance faster (Make pricing).

None of these is a price gouge. They are the same correction: providers separating the cheap thing you do occasionally (chat) from the expensive thing you do constantly (automation), and pricing the second one for what it costs them to serve.

The 15 June trap

The Anthropic change is the one with a deadline, so it is the one to act on first. A small business running Claude inside an n8n or Zapier automation on a flat Max plan has been drawing automation usage from the same pool as its chat. After 15 Jun 2026 that usage moves to a separate agent-credit allowance with API rates beyond it. If your automations are busy, the bill steps up. The change actually re-enables third-party Agent SDK usage on subscription plans that an earlier update had pulled, but it meters it; either way, your automation usage is now counted separately from your chat.

Confirm the specific credit amounts and your own plan against Anthropic’s current pricing before you commit to any number; pricing pages move, and the load-bearing facts here are the change and its date, not a particular dollar figure. The point is to know, before the cutover, which column your usage is in.

This week

Run the thirty-minute routine in the box above: split your AI use into chat and automation, find what becomes metered on 15 Jun, estimate your monthly automation volume, and decide whether to route the high-volume work to a cheaper model, cap it, or absorb it. Route customer data deliberately, because the cheapest provider is not always the one you want holding it. Do this before the cutover and the June bill is a confirmation of a decision you made. Skip it and the bill makes the decision for you.

AI Got Cheaper. Your AI Bill Is About to Go Up.

The split-screen

Where your cost actually moved

The 15 June trap

This week

AI cost and unit economics →

Related reading

The split-screen

Where your cost actually moved

The 15 June trap

This week

AI cost and unit economics →

Related reading

Outcome-based AI pricing has reached small-business tools: the math you now have to do

Zapier MCP: every AI tool call costs two tasks, not one

Google's $100 AI Ultra: who it's actually for

AI-written analysis, signed by a practitioner. One or two pieces a week.

AI-written analysis, signed by a practitioner. One or two pieces a week.