What Uber's $1,500/Developer AI Cap Tells You About Your Own Bill
Uber caps every employee at $1,500/month per AI coding tool. The real fix is a per-identity cap in code, not a policy memo.
Bloomberg reported this week that Uber now caps every employee at $1,500 per month, per AI coding tool. Simon Willison picked it up on June 3. The number matters less than the move. A Fortune 50 company with a real finance team just admitted it cannot predict what its developers spend on AI.
This is a follow-up to a story I covered in May. Uber burned its entire 2026 AI coding budget in four months, running at roughly 3x the naive projection. The $1,500 cap is the response. Burn first, cap second. That order tells you everything.
The math is looser than it sounds
Read the policy carefully. The cap is per tool, per employee. It is not a total per person across tools. It is not a total monthly ceiling.
So one developer running Claude Code, Cursor, and Copilot can spend $1,500 on each before any limit fires. That is $4,500 a month, per head, inside policy. Multiply by a team. The org-level number is still wide open.
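The arithmetic is worth writing down once, because the per-tool wording hides the multiplier. This is a small sketch of the worst case that still sits inside policy. The three-tool count comes from the example above; the ten-person team is an assumed figure for illustration only.

```python
# Worst-case monthly exposure under a per-tool, per-employee cap.
CAP_PER_TOOL_USD = 1_500  # the cap reported for Uber


def max_monthly_exposure(tools_per_dev: int, developers: int = 1) -> int:
    """Upper bound on in-policy spend: every developer maxes out every tool."""
    return CAP_PER_TOOL_USD * tools_per_dev * developers


per_head = max_monthly_exposure(tools_per_dev=3)
team_of_ten = max_monthly_exposure(tools_per_dev=3, developers=10)  # assumed team size

print(per_head)     # 4500
print(team_of_ten)  # 45000
```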
A per-tool cap is a speed bump, not a wall. It slows the worst offenders. It does not give you a real budget. Uber knows this. It is the best they could ship fast, and fast was the constraint.
What this means if you are not Uber
Here is the part that hits small shops and solo builders.
If a company with thousands of engineers and a dedicated FinOps function cannot forecast coding-agent spend, you cannot either. Not because you are bad at it. Because the spend is non-deterministic. An agent in a loop, a long context window, a retry storm, a contractor who left a script running over the weekend. None of that shows up until the bill does.
You probably have one of these problems right now:
- A $200 Claude Code subscription that quietly metered out and is now billing API rates.
- Three contractors sharing one API key with no per-person ceiling.
- An automation agent that retries on failure and occasionally retries forever.
A policy memo does not stop any of these. Uber's lesson is that the fix has to live at the call site, not in a spreadsheet. You want the limit enforced in code, before the request goes out, per identity.
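To make "at the call site" concrete for the retry-forever case, here is a minimal hand-rolled sketch. It is not AgentGuard's API. The function name `call_with_ceiling`, the flat `cost_per_attempt_usd` estimate, and the `BudgetExceeded` exception are all hypothetical names invented for this example. The point is only that the check runs before each attempt, so a failing request stops costing money once the ceiling is reached.

```python
import time


class BudgetExceeded(RuntimeError):
    """Raised when the next attempt would push spend past the ceiling."""


def call_with_ceiling(make_request, cost_per_attempt_usd, ceiling_usd,
                      max_attempts=5, base_delay_s=0.0):
    """Retry make_request, but never spend more than ceiling_usd in total.

    cost_per_attempt_usd is an assumed flat estimate per attempt; a real
    implementation would price each attempt from actual token usage.
    """
    spent = 0.0
    last_error = None
    for attempt in range(max_attempts):
        # Check the budget BEFORE the request goes out.
        if spent + cost_per_attempt_usd > ceiling_usd:
            raise BudgetExceeded(
                f"stopping after ${spent:.2f}; ceiling is ${ceiling_usd:.2f}"
            )
        spent += cost_per_attempt_usd
        try:
            return make_request(), spent
        except Exception as err:  # broad on purpose for this sketch
            last_error = err
            time.sleep(base_delay_s * 2 ** attempt)  # exponential backoff
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Both limits matter. `max_attempts` bounds the loop, and the dollar ceiling bounds the damage if someone later raises `max_attempts` without thinking about cost.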
What Uber's policy looks like as runtime code
This is the whole idea behind AgentGuard. It is the open-source primitive Uber is reinventing in-house, except you get it in three lines.
    from agentguard import BudgetGuard

    guard = BudgetGuard(monthly_usd=1500)

    with guard:
        response = client.messages.create(
            model="claude-opus-4-8",
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
When the identity behind that guard crosses $1,500 for the month, the next call raises instead of charging you. No memo. No quarterly surprise. The limit is a fact of the runtime, not a guideline somebody might ignore.
You can scope it per developer, per contractor, per agent, or per customer if you resell access. That is the difference between Uber's blunt per-tool cap and what you actually want: a per-identity ceiling that knows who is spending.
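If you want to see what "knows who is spending" means mechanically, the sketch below keeps one running total per identity per calendar month. It is a hand-rolled illustration, not AgentGuard's implementation, and `IdentityLedger`, `check`, and `record` are hypothetical names. The identities can be developers, contractors, agents, or customers; the ledger only needs a stable string for each.

```python
from collections import defaultdict
from datetime import date


class BudgetExceeded(RuntimeError):
    """Raised when an identity has used up its monthly ceiling."""


class IdentityLedger:
    """Tracks dollars spent per identity, reset by calendar month."""

    def __init__(self, monthly_usd):
        self.monthly_usd = monthly_usd
        self._spend = defaultdict(float)  # (identity, "YYYY-MM") -> dollars

    def _key(self, identity, today=None):
        today = today or date.today()
        return (identity, today.strftime("%Y-%m"))

    def remaining(self, identity, today=None):
        return self.monthly_usd - self._spend[self._key(identity, today)]

    def check(self, identity, today=None):
        """Call before sending a request; raises if the ceiling is reached."""
        if self.remaining(identity, today) <= 0:
            raise BudgetExceeded(f"{identity} has reached ${self.monthly_usd}")

    def record(self, identity, usd, today=None):
        """Call after a request, with what it actually cost."""
        self._spend[self._key(identity, today)] += usd
```

With this shape, three contractors can share one API key and still hit three separate ceilings, because the limit is keyed on the person and not on the credential.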
Why in-process beats a proxy
Most cost tools sit in front of your calls as a proxy or router. That works until it does not. A proxy adds a hop, a single point of failure, and another thing to operate. It also cannot see intent. It sees traffic.
AgentGuard runs in your process, around your client. It counts real spend against the identity making the call and stops before the request leaves. No extra infrastructure. No second bill to control your first bill.
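As a rough picture of the in-process approach, the sketch below wraps whatever function sends the request, prices each response from its token counts, and refuses to send once the limit is reached. This is an assumption-laden illustration and not AgentGuard's code: `InProcessCap`, `Usage`, and `cost_usd` are hypothetical names, and the per-token prices are placeholders to replace with your provider's published rates.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens; substitute real rates.
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 75.0


class BudgetExceeded(RuntimeError):
    """Raised instead of sending a request once the limit is reached."""


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


def cost_usd(usage: Usage) -> float:
    """Convert token counts into dollars using the assumed prices above."""
    return (usage.input_tokens * INPUT_USD_PER_MTOK
            + usage.output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


class InProcessCap:
    """Wraps a send function; checks spend before each call, records it after."""

    def __init__(self, send, limit_usd):
        self._send = send  # callable: prompt -> (reply, Usage)
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def __call__(self, prompt):
        # The check happens in-process, before anything leaves the machine.
        if self.spent_usd >= self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f}"
            )
        reply, usage = self._send(prompt)
        self.spent_usd += cost_usd(usage)
        return reply
```

One design caveat: the cost of a call is only known after the response arrives, so this version can overshoot the limit by at most one request. Setting `max_tokens` on each request keeps that overshoot bounded.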
Uber proved the demand. A hard per-identity dollar cap on AI tooling is now something the largest engineering orgs ship by hand. You do not have to build it from scratch.
The takeaway
Two cost-cap stories landed in 24 hours this week: Uber's $1,500 cap and Copilot moving to usage-based pricing. The direction is set. AI coding spend is going from flat-rate to metered, and metered means someone has to own the meter.
If you wait for the bill to tell you, you have already lost the month. Put the cap in the code.
AgentGuard is free and open source. Three lines to a hard budget that actually fires. Try it here.
Patrick Hughes
Building BMD HODL, a one-person AI-operated holding company. Nashville, Tennessee. Twenty-two agents.