Token budget wars are starting. Most companies are paying for vibes.

AI billing is shifting from seats and tokens to outcomes. If you cannot tie an agent run to a dollar of work, you are paying for vibes.

A thesis I keep hitting in builder threads this spring: AI billing is moving from "are people using it" to "did this agent finish a real piece of work, and what did it cost." Cost per resolved ticket. Cost per processed claim. Cost per reviewed contract. Cost per dollar of revenue moved.

That is a real shift. It changes what you sell, what you buy, and how you run agents inside your own business.

Why seat pricing and token pricing both break

Seat pricing assumes a human is at the keyboard. Agents do not sit at desks. One agent can do the work of ten seats on a good week and zero seats on a bad one. Seat pricing makes no sense for that.

Token pricing is what most vendors fell back to. It is closer, but the buyer hates it. Tokens are an input. The buyer is not buying tokens. The buyer is buying outcomes. When a vendor charges by the token, the vendor's incentive is to burn more of them. The buyer eats the bill while the vendor's margin grows on inefficiency.

Outcome pricing is where this lands. Per ticket resolved. Per email answered without escalation. Per invoice posted. Per qualified lead. The unit is a thing the buyer already counts.

What this means if you run agents inside your business

If you are a small operator running agents to do real work, you should already know two numbers per agent:

  1. Cost per run (tokens, tool calls, infra, every dollar that moves).
  2. Outcome per run (did it close the ticket, post the invoice, write the draft that shipped).
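As a rough sketch of those two numbers, assume each run is logged as a small record. The `AgentRun` class, its field names, and the dollar figures below are all made up for illustration; they are not from any particular library. Note that failed runs still count toward the spend, which is the point of dividing by completed outcomes rather than by runs.

```python
from dataclasses import dataclass


# Hypothetical per-run record; the field names are illustrative assumptions.
@dataclass
class AgentRun:
    agent: str
    model_cost_usd: float  # token spend for the run
    tool_cost_usd: float   # paid tool or API calls
    infra_cost_usd: float  # compute, storage, and so on
    completed: bool        # did the run produce the outcome it was aiming for?

    @property
    def cost_usd(self) -> float:
        # Number 1: cost per run, every dollar that moves.
        return self.model_cost_usd + self.tool_cost_usd + self.infra_cost_usd


def cost_per_outcome(runs):
    """Total spend divided by completed outcomes; None if nothing completed."""
    total = sum(r.cost_usd for r in runs)
    done = sum(1 for r in runs if r.completed)
    return total / done if done else None


# Made-up example data: three runs, one of which failed.
runs = [
    AgentRun("ticket-bot", 0.25, 0.25, 0.0, True),
    AgentRun("ticket-bot", 0.5, 0.0, 0.0, False),
    AgentRun("ticket-bot", 0.25, 0.0, 0.25, True),
]
print(cost_per_outcome(runs))  # 0.75: $1.50 spent for 2 closed tickets
```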

Most teams know neither. They look at the monthly Anthropic or OpenAI bill, swallow hard, and move on. That is paying for vibes.

The fix is boring. Tag every run with the outcome it was trying to produce. Cap the spend per outcome. Kill runs that go over. Report cost per completed goal each week. Once you have that number, you can price your own service on it, defend the price to a customer, and find the agents that quietly burn cash.
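A minimal sketch of the tag, cap, and kill part, under some assumptions: `OutcomeBudget`, `BudgetExceeded`, and `run_agent` are hypothetical names, and the list of step costs stands in for real model and tool calls whose cost you only learn after each call returns. Because of that, a run can overshoot the cap by one step before it is stopped.

```python
class BudgetExceeded(Exception):
    """Raised when a run spends more than its outcome is allowed to cost."""


class OutcomeBudget:
    """Hypothetical spend cap tied to one tagged outcome."""

    def __init__(self, outcome: str, cap_usd: float):
        self.outcome = outcome  # the tag, e.g. "resolve_ticket"
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd: float) -> None:
        # Record the spend first, then check the cap.
        self.spent_usd += amount_usd
        if self.spent_usd > self.cap_usd:
            raise BudgetExceeded(
                f"{self.outcome}: spent ${self.spent_usd:.2f}, "
                f"cap ${self.cap_usd:.2f}"
            )


def run_agent(budget: OutcomeBudget, step_costs):
    """Run steps until done or over budget. Returns (completed, spent_usd)."""
    try:
        for cost in step_costs:  # each item stands in for one model/tool call
            budget.charge(cost)
    except BudgetExceeded:
        return False, budget.spent_usd  # killed: log it as a failed run
    return True, budget.spent_usd


ok = run_agent(OutcomeBudget("resolve_ticket", cap_usd=1.0), [0.25, 0.25])
killed = run_agent(OutcomeBudget("resolve_ticket", cap_usd=1.0), [0.5, 0.5, 0.5])
print(ok)      # (True, 0.5)
print(killed)  # (False, 1.5)
```

The `(completed, spent_usd)` pairs are what a weekly report would aggregate: total spend over completed outcomes, grouped by the outcome tag.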

What this means if you sell to other companies

Buyers in 2026 will not sign a six-figure contract for "an AI agent that helps with X." They will ask what the unit is, what one unit costs, and what happens when the agent fails.

If you cannot answer those three, your competitor will. The vendors that win through 2027 are the ones who price per outcome, not per seat or per token. With a cap. With a refund or retry on failure.
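To make "per outcome, with a cap, with a refund" concrete, here is one possible way the arithmetic could work. This is an assumption about how such a contract might be structured, not a standard: the cap is read as a ceiling on the buyer's monthly bill, the refund as a flat credit per failed run, and every price below is invented.

```python
def monthly_invoice(
    resolved: int,
    failed: int,
    price_per_outcome: float,
    refund_per_failure: float,
    monthly_cap: float,
) -> float:
    """Bill per completed outcome, credit each failure, never exceed the cap."""
    gross = resolved * price_per_outcome
    credit = failed * refund_per_failure
    # Clamp between zero and the cap so the buyer's worst case is known upfront.
    return max(0.0, min(gross - credit, monthly_cap))


# Made-up numbers: 400 resolved tickets at $2.50, 20 failures credited at $1.00.
print(monthly_invoice(400, 20, 2.50, 1.00, monthly_cap=5000.0))  # 980.0
print(monthly_invoice(400, 20, 2.50, 1.00, monthly_cap=900.0))   # 900.0
```

The three buyer questions map straight onto the parameters: the unit is a resolved outcome, one unit costs `price_per_outcome`, and a failure triggers `refund_per_failure`.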

This is harder. It puts risk on the vendor instead of the buyer. That is the point. The buyer has been eating the risk for two years. The market is correcting.

The honest version

I run a small business and I run agents inside it. I built AgentGuard because I needed a way to know what each agent run was costing me and to stop the ones that blew their budget. It is a Python package. It is open source. It sits between your code and the model and enforces a budget, a token cap, and a rate limit.

If you are running agents in production and you do not have that kind of cap, you are one bad prompt away from a five-figure surprise. The token budget wars are not coming for the big labs first. They are coming for the operators who do not measure.

Patrick Hughes

Building BMD HODL, a one-person AI-operated holding company. Nashville, Tennessee. Twenty-two agents.